Re: [unrev-II] Use Cases and Ontologies, including Collective Mental Maps (CMM) with algorithms for weighted links for "Bootstrapping"

From: John J. Deneen (JJDeneen@ricochet.net)
Date: Thu Dec 21 2000 - 23:41:34 PST


    Ref.
    ftp://ftp.vub.ac.be/pub/projects/Principia_Cybernetica/Texts_General/PCP_Web.txt

    Collective Mental Maps (CMM) with algorithms for weighted links for
    "Bootstrapping" the Principia Cybernetica Website:

    .... "This philosophy can be straightforwardly applied to the
    development of knowledge in a system such as Principia Cybernetica Web.
    Individual nodes can be seen as pieces of knowledge, and webs of linked
    nodes can be seen as knowledge systems. Recombination takes place when
    links are changed, so that a node which was connected to one node is now
    connected to another node. Mutation happens when the content of a
    node is changed, or a new node is created. Creation of new nodes and
    links by different contributors provides a continuous source of
    variation. If we want to maintain or improve the quality of knowledge,
    we will also have to apply selection. Selection could be done
    "manually", by members of the editorial board using their own judgement,
    or automatically, by a computer program applying certain formalized
    selection rules. The first option is limited by subjectivity and the
    bounded cognitive capacities of a human being; the second is limited by
    the difficulty of expressing quality judgements in a formal way, and by the
    rigidity of the resulting rules. The most effective approach seems to
    consist in a mixed human-computer system, where intuitive judgement is
    complemented by computing power (Heylighen, 1991b).

    A possible implementation of such an approach could be found in
    Thagard's (1992) ECHO program. Here, the judgement of human participants
    determines whether two pieces of knowledge (say, propositions or
    arguments in a discussion) are either coherent (confirm each other) or
    incoherent (contradict each other). The neural network-like computer
    program uses these binary coherence relations to decide which of two or
    more competing knowledge systems is best, in the sense that its
    elements are most coherent with each other and with the whole of the
    other knowledge. Although this program has so far only been used to
    reconstruct some key debates from the history of science, explaining why
    one theory eventually replaced its rival, it seems very promising
    to apply such an approach to steer an on-going discussion. Although this
    capability is not yet actually used, PCP Web's annotation function allows
    contributors to choose a link type for their annotations, distinguishing
    arguments that support a thesis from arguments that refute it. A complex
    web of such arguments and counter-arguments could be analysed by
    ECHO-like algorithms, drawing attention to the more generally coherent
    approaches.
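
    As an illustration of this kind of coherence computation, here is a
    minimal Python sketch. It is not Thagard's ECHO itself: the clamping of
    evidence units, the update rule and all constants are simplifying
    assumptions. Propositions become units, human judgements of coherence
    (+1) or incoherence (-1) become symmetric connection weights, and
    activations settle until the more coherent subsystem dominates.

    def settle(units, evidence, relations, steps=200, decay=0.05, rate=0.1):
        """units: proposition ids; evidence: ids clamped at activation 1.0;
        relations: {(a, b): +1 if coherent, -1 if incoherent}."""
        act = {u: (1.0 if u in evidence else 0.01) for u in units}
        neighbours = {}
        for (a, b), sign in relations.items():       # weights are symmetric
            neighbours.setdefault(a, []).append((b, sign))
            neighbours.setdefault(b, []).append((a, sign))
        for _ in range(steps):
            new = {}
            for u in units:
                if u in evidence:                     # evidence stays clamped
                    new[u] = act[u]
                    continue
                net = sum(s * act[v] for v, s in neighbours.get(u, []))
                new[u] = max(-1.0, min(1.0, (1 - decay) * act[u] + rate * net))
            act = new
        return act

    # Two rival hypotheses: H1 explains both observations, H2 only one, and
    # the two contradict each other; H1 settles at the higher activation,
    # i.e. it belongs to the more coherent knowledge system.
    print(settle(["e1", "e2", "H1", "H2"], {"e1", "e2"},
                 {("e1", "H1"): 1, ("e2", "H1"): 1,
                  ("e2", "H2"): 1, ("H1", "H2"): -1}))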

    Another selection criterion for knowledge, besides coherence, is
    simplicity. Paraphrasing Ockham's Razor: all other things being
    equal, the simpler a knowledge system is, the better it is (Heylighen,
    1994). Key aspects of this criterion can be implemented in a relatively
    simple way in a knowledge web. The most straightforward aspect is the
    ease with which a piece of knowledge can be located within a web. This
    is of particular importance for distributed hypertext systems, which can
    contain millions of interlinked nodes, making it virtually impossible to
    find a particular node without a priori information. Hierarchical
    classification, as discussed earlier, has fundamental shortcomings, and
    is poorly suited for a system developed in parallel by different
    contributors without co-ordination. Moreover, its model of the
    organization of knowledge is inadequate for cognitive systems based on
    semantic networks. Free creation of associative links can provide a much
    richer model, but is more likely to produce labyrinthine anarchy.

    We have recently developed a method that allows an associative hypertext
    network to "self-organize" into a simpler, more meaningful, and more
    easily usable network. The term "self-organization" is appropriate to
    the degree that there is no external control or editor deciding which
    node to link to which other node: better linking patterns emerge
    spontaneously. The information used to create new links is not internal
    to the network, though: it comes from all users collectively. In that
    sense one might say that the network "adapts" to its users, or "learns"
    from the way it is used.

    Algorithms for such an adaptive web can be very simple. Every potential
    link is assigned a certain "fitness" or "strength". For a given node A,
    only the links with the highest fitness are actualized, i.e. are
    accessible to the user. Within the node, these links are ordered by
    strength, so that the user will encounter the strongest link first. We
    have formulated three separate learning rules for adapting the
    strengths:

    1) A link, say A -> B, which is directly chosen by the user, increases
    its strength. This rather obvious rule can only reorder the links that
    are already available within the node. By definition it cannot actualize
    new links, since these are not accessible to the user. This necessitates
    another rule.

    2) A user might follow an indirect connection between two nodes, say A
    -> B, B -> C. In that case the potential link A -> C increases its
    strength. This is a weak form of transitivity. It opens up an unlimited
    realm of new links. Indeed, one or several increases in strength of A ->
    C may be sufficient to make the potential link actual. The user can now
    directly select A -> C, and from there perhaps C -> D. This increases
    the strength of the potential link A -> D, which may in turn become
    actual, providing a starting point for an eventual further link A -> E,
    and so on. Eventually, an indefinitely extended path may thus be
    replaced by a single link A -> Z. Of course, this assumes that a
    sufficient number of users effectively follow that path. Otherwise it
    will not be able to overcome the competition from paths chosen by other
    users, which will also increase their strengths. The underlying
    principle is that the paths that are most popular, i.e. followed most
    often, will eventually be replaced by direct links, thus minimizing the
    average number of links a user must follow in order to reach his or her
    preferred destination.

    3) A similar rule can be used to implement a weak form of symmetry. When
    a user chooses a link A -> B, implying that there exists some
    association between the nodes A and B, we may assume that this also
    implies some association between B and A. Therefore, the reverse link B
    -> A gets a strength increase. This symmetry rule on its own is much
    more limited than transitivity, since it can only actualize a single new
    link for each existing link.
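
    The three rules above are simple enough to state directly in code. The
    Python sketch below only illustrates the rules as described; it is not
    the actual PCP implementation, and the data structure, parameter values
    and number of visible links are assumptions.

    from collections import defaultdict

    class AdaptiveWeb:
        def __init__(self, n_visible=10, t=0.5, s=0.3):
            self.strength = defaultdict(float)   # (from, to) -> link strength
            self.n_visible = n_visible           # links actualized per node
            self.t = t                           # transitivity bonus (t < 1)
            self.s = s                           # symmetry bonus (s < 1)
            self.last = {}                       # user -> previous step (from, to)

        def visible_links(self, node):
            """The actualized links of a node, ordered strongest first."""
            out = [(b, w) for (a, b), w in self.strength.items() if a == node]
            out.sort(key=lambda bw: -bw[1])
            return [b for b, _ in out[: self.n_visible]]

        def select(self, user, a, b):
            """User follows the link a -> b; apply the three learning rules."""
            self.strength[(a, b)] += 1.0         # rule 1: direct selection
            self.strength[(b, a)] += self.s      # rule 3: weak symmetry
            prev = self.last.get(user)           # rule 2: weak transitivity
            if prev is not None and prev[1] == a and prev[0] != b:
                self.strength[(prev[0], b)] += self.t
            self.last[user] = (a, b)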

    However, the collective effect of symmetry and transitivity is much more
    powerful than that of any single rule. For example, consider two links
    A1 -> B, A2 -> B. The fact that A1 and A2 point to the same node seems
    to indicate that A1 and A2 have something in common, i.e. are related in
    some way. However, none of the rules will directly generate a link
    between A1 and A2. Yet, the repeated selection of the link A2 -> B may
    actualize the link B -> A2 by symmetry. The repeated selection of the
    already existing link A1 -> B followed by this new link can then
    actualize the link A1 -> A2 through transitivity. Similar scenarios can
    be conceived for different orientations or different combinations of the
    links.
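
    Replaying this scenario with the sketch above (the user names and the
    number of repetitions are only illustrative):

    web = AdaptiveWeb(t=0.5, s=0.3)
    for _ in range(5):
        web.select("u1", "A2", "B")       # strengthens B -> A2 by symmetry
    for _ in range(5):
        web.select("u2", "A1", "B")       # user arrives at B from A1 ...
        web.select("u2", "B", "A2")       # ... follows B -> A2: A1 -> A2 gains t
    print(web.strength[("A1", "A2")])     # > 0: a new potential link A1 -> A2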

    A remaining issue is the relative importance of the three above rules.
    In other words, how large should the increase in strength be for each of
    the rules? If we choose unity (1) to be the bonus given by the first
    rule, there are two remaining parameters or degrees of freedom: t is the
    bonus for transitivity, s for symmetry. Since the direct selection of a
    link by a user seems a more reliable indication of its usefulness than
    an indirect selection, we assume t < 1, s < 1. The actual values will
    determine the efficiency of the learning process, but it seems that this
    matter cannot be settled by purely theoretical reasoning.

    In order to test these ideas in practice we have set up two experiments.
    We built a web consisting of 150 nodes, corresponding to the 150 most
    frequent nouns of the English language (derived from the frequency list
    of Johansson & Hofland, 1989). Every node was assigned 10 links to other
    nodes. These links were randomly selected from the 149 remaining nodes
    to initialize the web, but would then evolve according to the above
    learning rules (with t = 0.5 and s = 0.3). We made the web available on
    the Internet, and invited volunteers to browse through it, selecting
    those links from a given node which seemed somehow most related to it.
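
    The set-up of this experiment can be mimicked with the same sketch; the
    150 nouns from Johansson & Hofland are not reproduced here, so the word
    list below is only a placeholder:

    import random

    # Placeholder word list; the experiment used the 150 most frequent nouns.
    NOUNS = ["knowledge", "education", "experience", "dog", "cat"]

    def init_web(words, links_per_node=10, t=0.5, s=0.3):
        web = AdaptiveWeb(n_visible=links_per_node, t=t, s=s)
        for w in words:
            others = [x for x in words if x != w]
            # random initial links with a tiny strength, displaced as soon as
            # the learning rules find better associations
            for other in random.sample(others, min(links_per_node, len(others))):
                web.strength[(w, other)] = 0.001
        return web

    web = init_web(NOUNS)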

    For example, if the start node represented the noun "dog", a user would
    choose a link to an associated word, such as "cat", "animal", or "fur",
    but not to a totally unrelated word, such as "mathematics". Of course,
    in the beginning of the experiment, there would be very few good
    associations available in the lists of 10 random words, and users might
    have to be satisfied with a rather weak association, such as "meat".
    However, when reaching the node "meat", they might be able to select
    there another association, such as "carnivore". Through transitivity, a
    new link to "carnivore" might then appear in the node "dog", displacing
    the weakest link in the list, while providing a much better association
    than the previously best one, "meat".

    In the first experiment we only used the direct and the transitive
    rules. After some 6000 link selections made by several hundreds of
    users, the network seemed to have settled in a relatively stable state.
    The most frequented nodes had gathered a list of 10 strongest links that
    quite well reflected their direct semantic environment, with words that
    were near synonyms of the node name at the top of the list (see Table
    1). However, this positive result was not reflected in the less
    frequented nodes, because of what we termed the "attractor effect".
    Nodes that had many incoming links, by accident or because they had
    many associations with other words in the list, would tend to attract
    more users, which would further strengthen their incoming paths and
    eventually replace those paths with even stronger direct links. In the end
    almost all paths would end up in a cluster of semantically related,
    strongly cross-linked nodes, forming an approximate attractor for the
    network. Although new users were randomly assigned to a node when
    entering the network, so that all nodes would be consulted on first
    entry with the same average frequency, the subsequent moves would very
    quickly end up in the attractor cluster, so that nodes and links outside
    the attractor would get little chance to learn, and as a result would
    remain poorly connected.

    KNOWLEDGE
        step 0          step 200        step 800        step 4000
        view            education       education       education
        health          experience      experience      experience
        theory          example         development     research
        face            theory          theory          development
        book            training        research        mind
        line            development     example         life
        world           history         life            theory
        side            view            training        training
        government      situation       order           thought
        trade           work            effect          interest

    Table 1: self-organization of the list of 10 strongest links from the
    word "knowledge", in different stages: initial random linking pattern,
    after 200 steps, after 800 steps, and after 4000 steps. A step
    corresponds to a user selecting a link on one of the 150 nodes, in a web
    that evolves according to the direct, transitive and symmetric learning
    rules (2nd experiment). Note that only one of the initial links
    ("theory") survives after 4000 steps, and that the evolution slows down
    considerably.

    In our second experiment, we added the symmetry rule to the two other
    rules. This led to faster initial learning, since a user passing
    through a node with zero-strength links would immediately generate two
    new links (one by symmetry and one by transitivity), which were on
    average much better than the initial random links. In the longer term,
    symmetry moreover attenuated the attractor effect, since strong links
    leading into an attractor would necessarily produce weaker, inverse
    links leading to the periphery. This gave nodes outside the attractor
    the chance to develop some links of their own, generating local
    attracting clusters weakly connected to other clusters. The overall
    learning seemed more efficient in the sense that less time was needed to
    develop good associations, and the result was more balanced, in the
    sense that the differences in frequentation between nodes were less
    strong. Still, the differences were large enough to make us consider
    additional mechanisms for reducing the attractor effect.

    We are now trying to determine to what degree these results from the
    adaptive network correlate with different word associations derived by
    other means (e.g. free association experiments, or letting people judge
    the degree of synonymity). We also plan to test the usefulness of the
    self-organization, by checking to what extent users find knowledge more
    effectively in a self-organized network, as compared to a network that
    did not undergo learning. This can be done by measuring the average
    number of steps needed to find a node, or the average time needed to
    choose a link. We are further considering additional learning rules,
    such as similarity (nodes sharing several links would get stronger
    cross-connections), that may make the learning more effective.
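
    A possible reading of that similarity rule, in the same sketch notation
    (the threshold and bonus values are hypothetical, since this rule has
    not been tested yet):

    def apply_similarity(web, a, b, bonus=0.1, min_shared=2):
        """Strengthen the cross-connection of nodes that share several links."""
        targets_a = {y for (x, y) in web.strength if x == a}
        targets_b = {y for (x, y) in web.strength if x == b}
        shared = len(targets_a & targets_b)
        if shared >= min_shared:
            web.strength[(a, b)] += bonus * shared
            web.strength[(b, a)] += bonus * shared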

    Although this research is still in its initial stage, and will need much
    empirical testing to confirm its usefulness, it seems like a very
    promising approach to quickly and easily develop complex knowledge webs
    that are more adequate than webs built manually. It may become
    especially helpful for the World-Wide Web as a whole, by allowing the
    automatic creation of links between servers maintained by different
    people in different parts of the world.

    Conclusion

    Although the system we presented is still under development, changing
    almost every day, it has already become quite clear that the World-Wide
    Web paradigm, augmented with dynamic restructuring of the network,
    provides an extremely powerful tool for the publication and
    collaborative development of complex knowledge systems. Such a tool can
    be especially useful for the discipline of Systems and Cybernetics,
    characterized by a very rich but unstable and ill-structured knowledge
    base, which is difficult to access by traditional means. We hope that
    many researchers in that domain will join these efforts, either by
    contributing directly to the Principia Cybernetica Web, or by starting
    parallel services, which can then be cross-linked to the PCP Web. If
    that happens, the domain's historical mission of transdisciplinary
    integration (Boulding, 19; von Bertalanffy, 1968) may again become a
    practical issue, rather than a far-away dream."

    Jack Park wrote:

    > Consider these common use cases
    > EMAIL
    > User receive email
    > User send email
    > User annotate email
    > User replyTo email
    > OHS archive email
    > OHS autoLink email
    > SDS
    > <note>email annotation already covered</note>
    > User align records
    > OHS autoLink records
    > <note>I'm sure Rod will have lots more here</note>
    > WEB
    > User browse webpage
    > User annotate webpage
    > OHS autoLink webpage
    > COLLABORATE
    > <note>email and web fit in here</note>
    > User create document
    > User edit document
    > User shareDocumentWith OtherUser
    > User pose IBISQuestion
    > User respondTo IBISQuestion
    > OHS maintain IBISQuestion
    > OHS maintain IBISResponse
    > OHS autoLink IBISQuestion
    > OHS autoLink IBISResponse
    >
    > Under these are some really primitive use cases
    > OHS access webpage
    > OHS access email
    > User access OHS
    >
    > Let us examine these use cases.
    > Actors:
    > User, document, OHS, OtherUser, IBISQuestion, IBISResponse, email,
    > records
    > Verbs:
    > receive, send, respondTo, archive, autoLink, align, create, edit,
    > shareDocumentWith,
    > pose, maintain, browse, annotate
    > We can see that there is great similarity between 'create', 'pose', and
    > 'send'
    > 'autoLink' is a really exciting verb. Some verbs require user action,
    > others are purely OHS behaviors. Some verbs need rethinking.
    >
    > Notice that, when we begin to flesh these use cases out, we are beginning to
    > imagine the underlying mechanics of an OHS. We can now take these nouns and
    > verbs, refine them, refine our use cases, develop an ontology that narrows
    > the range of words we choose to those necessary to accomplish the design
    > task, construct scenarios with the new ontology, perhaps refine the ontology
    > and use cases, and iterate until we believe we are ready to hack some code.
    >
    > I recognize the fact that the use cases mentioned above appear to ignore the
    > vast amount of energy this group has already put into the development of use
    > cases. It is my hope that the two apparently disparate activities will
    > ultimately enhance each other. It would seem that we could take my
    > minimalist list and begin to flesh out an OHS.
    >
    > Once we get all this common stuff fleshed out, we can begin to look at the
    > two specialty tracks: research collaboration (NIH), and software
    > productivity. That will likely call for new iterations in the common stuff
    > because ideas generated in the specialty field will be seen to have value
    > across many domains.
    >
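
    One way to make this noun/verb inventory concrete is to record each use
    case as an (actor, verb, object) triple. The Python below is only a
    sketch of that reading, not a proposed OHS design; the type and variable
    names are hypothetical.

    from typing import NamedTuple

    class UseCase(NamedTuple):
        actor: str   # User, OHS, OtherUser, ...
        verb: str    # receive, send, annotate, autoLink, pose, maintain, ...
        obj: str     # email, webpage, document, records, IBISQuestion, ...

    USE_CASES = [
        UseCase("User", "receive", "email"),
        UseCase("User", "annotate", "email"),
        UseCase("OHS", "autoLink", "email"),
        UseCase("User", "pose", "IBISQuestion"),
        UseCase("OHS", "maintain", "IBISResponse"),
    ]

    # The ontology step then amounts to collecting and narrowing these terms:
    actors = {u.actor for u in USE_CASES}
    verbs = {u.verb for u in USE_CASES}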


    Community email addresses:
      Post message: unrev-II@onelist.com
      Subscribe: unrev-II-subscribe@onelist.com
      Unsubscribe: unrev-II-unsubscribe@onelist.com
      List owner: unrev-II-owner@onelist.com

    Shortcut URL to this page:
      http://www.onelist.com/community/unrev-II



    This archive was generated by hypermail 2b29 : Thu Dec 21 2000 - 23:51:56 PST