Re: [ba-ohs-talk] Keyword Indexing to Improve Email and IT


Keywords are worth trying (if you can get users to add them).
*Subject* keywords *alone* do have a stumbling block, a type
of Zipfian distribution known as Bradford's Law which states that
from either the classification or the searching end, a relatively
small number of terms tend to be used repeatedly, and a huge
percentage quite rarely (like a rather flat bell curve with a big bump
in the middle). This is to the benefit of those searching the tails,
but leaves those searching the center with quite a mountain
to delve through. There's a brief description at:
http://dlis.gseis.ucla.edu/research/mjbates2.html
INDEXING AND ACCESS FOR DIGITAL LIBRARIES
AND THE INTERNET: HUMAN, DATABASE, AND
DOMAIN FACTORS, Marcia J. Bates    (01)
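To get a feel for how lopsided keyword use can become, here is a rough
sketch. It assumes an idealized Zipf curve in which the term at rank r is
used in proportion to 1/r. That is a textbook simplification, not a figure
from the Bates paper, and the vocabulary size of 1000 is arbitrary.

```python
def zipf_shares(num_terms):
    """Share of all keyword uses per term, assuming use is proportional to 1/rank."""
    weights = [1.0 / rank for rank in range(1, num_terms + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


shares = zipf_shares(1000)      # hypothetical vocabulary of 1000 keywords
top_10 = sum(shares[:10])       # about 0.39: ten terms carry over a third of all uses
bottom_half = sum(shares[500:]) # about 0.09: the 500 rarest terms together

print(f"top 10 terms: {top_10:.2f} of all uses")
print(f"rarest 500 terms: {bottom_half:.2f} of all uses")
```

Under that assumption a search on one of the popular terms returns a very
large pile of posts, while a search on a rare term returns very few.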

Which leads one to a couple of conclusions:
1. You need categories
2. You indeed need to be able to add categories and
subcategories    (02)

but, 3. categorizing the "bump in the middle" can drive you
crazy as you descend into finer subcategories -- finally creating
subcategories so fine that no searcher could conceivably come
up with the search term needed to precisely return the right posts.    (03)

Which was the impetus for posting
http://www.hastingsresearch.com/net/03-dkr-ir-metadata.shtml
Cataloging of DKR Objects (by the author or creator) -- the
belief that information retrieval becomes radically better when
the searcher is able to step outside the traditional author/subject/
title set of search terms, and search by a broad range of associated
facts such as "was the post pro or con?" or "where did the author live?"
These are facets that were impossible to search with most physical
card catalogs; with computers, there are no technical obstacles.
(Plenty of human ones, though.)    (04)
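A minimal sketch of that kind of search, assuming each post carries a small
dictionary of facets alongside its subject. The facet names (stance,
author_location) and the sample posts are made up for illustration; they are
not taken from the DKR cataloging page.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    subject: str
    facets: dict = field(default_factory=dict)  # hypothetical facet name -> value


def search(posts, **wanted):
    """Return the posts whose facets match every requested name/value pair."""
    return [
        post for post in posts
        if all(post.facets.get(name) == value for name, value in wanted.items())
    ]


# Invented sample data, for illustration only.
POSTS = [
    Post("Keywords in the subject line", {"stance": "pro", "author_location": "Berkeley"}),
    Post("Subject keywords will not scale", {"stance": "con", "author_location": "Oakland"}),
    Post("Categories need an editor", {"stance": "con", "author_location": "Berkeley"}),
]

for post in search(POSTS, stance="con", author_location="Berkeley"):
    print(post.subject)
```

Each extra facet narrows the result without the searcher having to guess a
finer subject subcategory.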

Nicholas
--
________________________________
Nicholas Carroll
ncarroll@hastingsresearch.com
Travel: ncarroll1000@yahoo.com
http://www.hastingsresearch.com
________________________________    (05)


----- Original Message -----
From: Garold (Gary) L. Johnson <dynalt@dynalt.com>
To: <ba-ohs-talk@bootstrap.org>
Sent: Saturday, April 20, 2002 9:57 AM    (06)

> I would suggest that we use a special keyword section at the beginning of
> the email body rather than try to do all of this in the subject line.
>
> Thanks,
>
> Gary
>
> -----Original Message-----
> From: owner-ba-ohs-talk@bootstrap.org
> [mailto:owner-ba-ohs-talk@bootstrap.org]On Behalf Of Eric Armstrong
> Sent: Friday, April 19, 2002 1:32 PM
> To: ba-ohs-talk@bootstrap.org
> Subject: Re: [ba-ohs-talk] Keyword Indexing to Improve Email and IT
>
> Wow. Rod. How do you find all these references?
> I'd spend days looking up all that stuff -- if I happened to remember
> it. There is clearly something about your system that works.
>
> Now if we could combine it with Radio Userland, so the references
> were in outline form, and clicking on them transcluded them into
> the message... Well, that would sure be something.
>
> On the subject of categorizing. Yes. We need categories. There
> are two requirements to make them useful:
>    1) The requirement that one select from a list of predefined
>         categories.
>         --that keeps everyone on the same page, category-wise
>         --some day, a malleable system that let people add and change
>            categories dynamically would be cool
>         --in the meantime, as long as it was possible to add categories
>            every once in a while, we could limp along and experiment
>
>     2) The ability to add categories retroactively.
>          --it's not enough for the original author to do it
>          --I'm wondering if this could be an addition to Radio Userland
>
>
> Rod Welch wrote:
>
> > Alex,
> >
> > Jack Park proposed on 000623 using an engine to generate an ontology
> > for organizing the record....
> >
> > http://www.welchco.com/sd/08/00101/02/00/06/23/114155.HTM#2915
> >
> > ...so it would seem that your idea is in the works.  As well, Eugene
> > Kim proposed on 001126 that the group use more diligence to apply
> > Jack's idea to the email record, perhaps in a less sophisticated form,
> > as you have in mind today....
> >
> > http://www.welchco.com/sd/08/00101/02/00/11/26/214933.HTM#QW8I
> >
> > The challenge of organizing free form text as occurs in
> > correspondence, books, articles and other narrative is it is only a
> > small part of adding intelligence to information to create knowledge,
> > and second it is a very complex task that is not addressed by indexing
> > key words, but rather key phrases need to be indexed using an organic
> > structure, and this structure needs to be added at a unit level, which
> > can be a sentence, paragraph, or in some cases many paragraphs.  This
> > makes the task complex and time consuming as Jack pointed out on
> > 000221....
> >
> > http://www.welchco.com/sd/08/00101/02/00/02/21/113701.HTM#7455
> >
> > The degree of complexity of this task is masked (hidden) from the
> > mind's eye by the architecture of human thought explained in POIMS....
> >
> > http://www.welchco.com/03/00050/01/09/01/02/00030.HTM#0367
> >
> > ...based on earlier work on 890523...
> >
> > http://www.welchco.com/sd/08/00101/02/89/05/23/065052.HTM#SQ5L
> >
> > Without a solution, eventually people feel overwhelmed, as Eric
> > Armstrong reported on 010916....
> >
> > http://www.welchco.com/sd/08/00101/02/01/09/16/213549.HTM#KA6H
> >
> > Eric made the same point in a letter on 011003 that underlies your
> > suggestion today....
> >
> > http://www.welchco.com/sd/08/00101/02/01/10/03/160603.HTM#EC5N
> >
> > ...where he points out that productivity is paralyzed without a
> > solution to the problem you pose today, at least that is an element.
> > In fact to solve the problem requires a critical mass of functions
> > working together.  If only key words are provided, it does not improve
> > anything enough for people to bother, so very soon people fall off the
> > wagon and revert to fast and easy methods in the moment, as Jack
> > related on 010908....
> >
> > http://www.welchco.com/sd/08/00101/02/01/09/08/093820.HTM#YF5O
> >
> > In sum, your idea today has a lot of merit.  Experience shows that
> > organization is a key element of KM that can be effective when
> > deployed in a larger context, as Eric noted on 010916...
> >
> > http://www.welchco.com/sd/08/00101/02/01/09/16/190429.HTM#0001
> >
> > Application within the narrow context of email has been raised several
> > times, and is available but will not likely yield either favorable
> > response over an extended period, nor favorable results if
> > successfully applied, based on the project record, per Eric's comments
> > on 001122....
> >
> > http://www.welchco.com/sd/08/00101/02/00/11/22/150637.HTM#0001
> >
> > What is needed is a comprehensive breakthrough design that integrates
> > a critical mass of capability.  This necessarily takes more than 20
> > minutes to learn and so presents a KM dilemma of how to transition
> > from things people like to things people need.
> >
> > Rod
> >
> > *************
> >
> > Alex Shapiro wrote:
> > >
> > > Hi everyone, and especially Eugene, who is the moderator for this
> > > list.
> > >
> > > I just had a good idea for an easy way to improve this forum.
> > >
> > > Why not make a list of metadata keywords that could optionally be
> > > prepended to the subject line of posts to this group.
> > >
> > > For instance, the subject for this post could have been
> > > "[ListEnhancementIdea] Great? idea ..."
> > >
> > > Then, a script or whatever could check the headers, and put the posts
> > > into groups.  If someone replies to a post, then it would not be put
> > > into a group because it begins with "RE: " (the "[" symbol has to come
> > > first for it to be recognized as containing a keyword).
> > >
> > > The reason for this idea is that I wanted to mark Peter's post below as
> > > containing a piece of software that I wanted to investigate.  If Peter
> > > had prepended a keyword to the subject line, say [CollaborationSoftware],
> > > then this post could have been put into a group with other
> > > CollaborationSoftware posts where it would be easy to find.
> > >
> > > See what I am saying?  With these keywords Eugene could make a webpage
> > > that looked like http://www.fury.com (or at least some rough interface
> > > that would look like the category list on the right).
> > >
> > > The great thing about this idea is that participation would be
> > > voluntary, but peer pressure would probably get everyone to participate.
> > > Also, a slightly annoying but functional thing to do in the case where
> > > someone did not put a keyword in the subject would be to reply to the
> > > post with the keyword in the subject line. (Or maybe we could think of a
> > > better web-based interface for doing so).
> > >
> > > Also, (back on a technical tangent) multiple keywords could be
> > > supported, i.e. [Keyword1, Keyword2].
> > >
> > > What do you guys think?
> > >
> > > ... Worst case, instead of a list of keywords, let's at least make a
> > > single [SoftwareAnnounce] or [SA] keyword to keep track of just the
> > > posts that do as the keyword says.
> > >
> > > --Alex
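
The grouping script Alex describes above might look something like the
sketch below. It assumes subjects arrive as plain strings; the function names
are invented here, and nothing like this is known to exist for the list. A
subject only counts if "[" is its very first character, so a reply beginning
with "RE: " is skipped, and a comma-separated tag yields several keywords.
How to treat a list prefix such as [ba-ohs-talk] is left open.

```python
import re
from collections import defaultdict

# A keyword tag must open the subject: "[" has to be the first character.
TAG_RE = re.compile(r"^\[([^\]]+)\]")


def subject_keywords(subject):
    """Return the keywords in a leading [K1, K2] tag, or [] if there is none."""
    match = TAG_RE.match(subject)
    if match is None:
        return []  # includes replies, which start with "RE: " rather than "["
    return [part.strip() for part in match.group(1).split(",") if part.strip()]


def group_by_keyword(subjects):
    """Map each keyword to the list of subjects tagged with it."""
    groups = defaultdict(list)
    for subject in subjects:
        for keyword in subject_keywords(subject):
            groups[keyword].append(subject)
    return dict(groups)


subjects = [
    "[ListEnhancementIdea] Great? idea ...",
    "RE: [ListEnhancementIdea] Great? idea ...",
    "[Keyword1, Keyword2] Example with two keywords",
]
for keyword, tagged in group_by_keyword(subjects).items():
    print(keyword, "->", tagged)
```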
> > >
> > > At 10:54 PM 4/18/02 +0100, Peter Jones wrote:
> > > >http://fle3.uiah.fi/
> > > >"Fle3 is a web-based learning environment. To be more specific Fle3 is a
> > > >server software for computer supported collaborative learning (CSCL).
> > > >With the Fle3 Knowledge Building Tool groups may carry out knowledge
> > > >building dialogues, theory building and debates by storing their
> > > >thoughts into a shared database. In the knowledge building groups may
> > > >use knowledge types (also called thinking types) to scaffold and
> > > >structure their dialogues.
> > > >Fle3 WebTops can be used by teachers and students to store different
> > > >items (documents, files, links, knowledge building notes) related to
> > > >their studies, organize them to folders and share them with others. The
> > > >items in the WebTops can be called learning objects - if you wish.
> > > >For teachers and administrators Fle3 offers tools to manage users and
> > > >courses. The administrator may also export and import the content of the
> > > >Fle3 database in XML format (compatible with the Educational Modelling
> > > >Language - EML).
> > > >Fle3 is Open Source and Free Software released under GNU the General
> > > >Public Licence (GPL). The licence is protecting your freedom to use,
> > > >modify and distribute Fle3.
> > > >
> > > >Fle3 is a Zope product, written in Python. Zope is the leading Open
> > > >Source application server. Zope and Fle3 run on almost all Operating
> > > >Systems (Linux, MacOS X, *BSD, etc.) and Microsoft Windows. "
> > > >
> > > >--
> > > >Peter Jones
>
>
>
>
>    (07)