[ba-ohs-talk] Keyword Indexing

http://www.bootstrap.org/lists/ba-ohs-talk/0204/msg00194.html    (01)

I strongly agree with the above messages, especially Chris Dent's last 
message which I was going to quote.  However, I found myself saying 
Exactly, Exactly, Exactly to all the lines :) so I guess that I'll just 
spare the clutter.    (02)

*1* KWD LOCATION: I think that we are in agreement that the keywords should 
be the first line of the email.    (03)

*2* KWD ENVELOPE: I was tempted to agree with Murray's suggestion that the 
envelope for the keywords should be more complex than [], because of the 
argument that some emails might be html based.  However, I think that 
everyone on this list mostly uses text, and I see that one of my messages 
which had some bold text (which I assume needs html), came out looking like 
plain text in the ba-ohs archives.  So, as long as parsing is not a 
problem, which it looks like it won't be, I say that using square brackets 
around the keywords is fine.    (04)
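
A minimal sketch of how the first-line parsing could work, assuming the 
keywords arrive as a single bracketed first line of the message body, 
separated by commas and/or spaces.  The separator and the function name 
extract_keywords are assumptions for illustration, not anything this list 
has agreed on.

```python
import re

# Assumed envelope: the whole first line is "[kw1, kw2 ...]" and nothing else.
KEYWORD_LINE = re.compile(r"^\s*\[([^\[\]]*)\]\s*$")


def extract_keywords(body):
    """Return the keywords on the first line of a message body, or []."""
    lines = body.splitlines()
    if not lines:
        return []
    match = KEYWORD_LINE.match(lines[0])
    if match is None:
        return []
    # Split on commas and/or whitespace (an assumed separator convention).
    return [kw for kw in re.split(r"[,\s]+", match.group(1)) if kw]
```

A first line with any text outside the brackets is treated as ordinary 
prose, so a message that merely starts with a bracketed aside is not 
mistaken for a keyword line.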

*3* KWD FORMAT: We need to agree on some sort of word separation standard 
for keywords.  The above thread has contained the following formats: 
FooBar, Foo_Bar, Foo-Bar.  I don't have much of a preference, but I think 
that either the first or the second is better.  The underscored version 
Foo_Bar seems to be the most readable.    (05)
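
If the list settled on the underscored form, the other two spellings 
could be folded into it mechanically.  The following is only a sketch 
under that assumption; to_underscored is a hypothetical helper name, and 
the rule for splitting FooBar (an underscore at each lowercase-to-uppercase 
boundary) is one possible choice.

```python
import re


def to_underscored(keyword):
    """Convert FooBar or Foo-Bar to Foo_Bar; Foo_Bar is left unchanged."""
    # Hyphens become underscores.
    keyword = keyword.replace("-", "_")
    # Insert an underscore wherever a lowercase letter or digit is
    # immediately followed by an uppercase letter.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", keyword)
```

All-caps keywords such as IBIS pass through untouched, since they contain 
no lowercase-to-uppercase boundary.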

*4* KWD SELECTION: In message #126 Eric suggests that we take the time to 
come up with a list of keywords.  We could do this, but first I think that 
we might experiment with coming up with a minimal set of basic keywords, 
and then having every new keyword automatically added to the DB.    (06)

*4.1* BASIC KEYWORDS: I think that as a group we should come up with a 
minimal set of three or four keywords that would give a general type to the 
message.  For instance, I am thinking that messages which announce a new 
type of software should be given the Software_Announce, or SA, (or some 
variation) keyword.  The thing is that about 1/3 of the root messages (not 
the followups) posted to this group announce software, and it would be nice 
to be able to filter those out from the general discussion.  Other basic 
keywords might include Document_Announce, Conference_Announce, 
Fun_Announce, and Seeking_Software.    (07)

In theory there should be one basic keyword per message.  The purpose of 
these keywords is to provide an intermediate level of specification between 
the current thread structure, and the fine grained keywording to be 
discussed next.  I envision using archives of messages aggregated by these 
keywords to answer queries like "I just saw some cool software mentioned 
recently, but I don't remember what it was".    (08)

*4.2* FINE GRAINED KEYWORDS:  Besides basic keywords there can also be fine 
grained keywords, such as IBIS, Google, Graphs, etc.  My suggestion is that 
instead of wasting time arguing about these, we allow any user to use any 
keyword.  New keywords will automatically be added to a database.    (09)
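
One way the automatic registration could look, sketched with Python's 
built-in sqlite3 module.  The table layout, the column names, and the 
function names open_keyword_db and register_keywords are all assumptions 
made for illustration; nothing here describes an existing system.

```python
import sqlite3


def open_keyword_db(path=":memory:"):
    """Open (or create) a keyword database; the schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS keywords ("
        " name TEXT PRIMARY KEY,"
        " first_seen_msg TEXT NOT NULL)"
    )
    return conn


def register_keywords(conn, msg_id, keywords):
    """Add any unseen keywords; return the ones that were new."""
    new = []
    for kw in keywords:
        cur = conn.execute(
            "INSERT OR IGNORE INTO keywords (name, first_seen_msg)"
            " VALUES (?, ?)",
            (kw, msg_id),
        )
        # rowcount is 1 when the row was inserted, 0 when it was ignored.
        if cur.rowcount == 1:
            new.append(kw)
    conn.commit()
    return new
```

Recording the message that first used each keyword would make it easy to 
review newly coined keywords later without blocking anyone from posting.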

The idea is that we should eventually settle on some common keywords by 
convention.  I am sure that there will be some tension here, but I think 
that spreading the tension out over the first few weeks of use is better 
than arguing about this stuff before we are even completely sure what we 
are working with.  It is simpler to see how to categorize new messages, 
than to have long arguments about how we should have categorized older 
messages.    (010)

An idea that I had for keywords is that they could be placed in 
hierarchies, for instance Google.API.  A message tagged with Google.API 
could be seen by viewing either "Google" or "Google.API", but a message 
tagged only with Google would not show up under "Google.API".  This way, 
if you chose a subcategory which other users did not agree on, your 
message would still be captured by the parent category.    (011)
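
The one-way visibility rule can be sketched as follows, assuming "." is 
the hierarchy separator as in the Google.API example.  The function names 
are hypothetical.

```python
def ancestors_and_self(keyword):
    """Expand 'Google.API' into ['Google', 'Google.API']."""
    parts = keyword.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def visible_under(message_keyword, viewed_keyword):
    """True if a message tagged message_keyword appears in viewed_keyword's listing."""
    return viewed_keyword in ancestors_and_self(message_keyword)
```

A message tagged Google.API is visible under both listings, while a 
message tagged only Google is not visible under Google.API.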

Multiple keywords should work in the same way.  A message tagged with 
multiple keywords would be viewable by looking at any one of the listings.    (012)
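
As a sketch of the listings, assuming messages arrive as (message id, 
keyword list) pairs in chronological order; build_listings and the input 
shape are illustrative assumptions.

```python
from collections import defaultdict


def build_listings(messages):
    """Map each keyword to the chronological list of message ids tagged with it."""
    listings = defaultdict(list)
    for msg_id, keywords in messages:
        # dict.fromkeys drops repeated keywords while preserving order,
        # so a message appears at most once per listing.
        for kw in dict.fromkeys(keywords):
            listings[kw].append(msg_id)
    return dict(listings)
```

A message tagged with both IBIS and Graphs ends up in both listings, so 
looking at either one is enough to find it.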

*5* INTERFACE SUGGESTIONS:  I like Mark's suggestion of enriching the 
messages, and then delivering them to our inboxes.  If we do this, then I 
suggest converting the keywords into hyperlinks which would again appear at 
the beginning of the email.  Clicking on the hyperlinks would take you to 
the chronological list of messages tagged with that keyword (the listings 
which I mentioned above).    (013)
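
A rough sketch of the keyword-to-hyperlink step, assuming the enriched 
message is delivered as HTML and that each keyword listing lives at a 
predictable address.  The base URL below is a placeholder and not a real 
archive location, and keyword_links is a hypothetical name.

```python
from html import escape
from urllib.parse import quote

# Placeholder address; the real listing location is not decided.
BASE_URL = "http://example.org/keywords/"


def keyword_links(keywords, base_url=BASE_URL):
    """Render keywords as a bracketed line of HTML links to their listings."""
    links = []
    for kw in keywords:
        # Percent-encode the keyword for the URL, then escape for HTML.
        href = base_url + quote(kw, safe="") + ".html"
        links.append(
            '<a href="{}">{}</a>'.format(escape(href, quote=True), escape(kw))
        )
    return "[" + " ".join(links) + "]"
```

The output keeps the square brackets, so the first line of the enriched 
message still looks like the keyword line the sender typed.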

We could also use the keywords to enrich the current 
archive.  Currently all you see is an indented list with the post hyperlink 
and perhaps one more bit of information, the author name for the thread 
view.  I would suggest putting the keyword links adjacent to the post 
hyperlink.    (014)

Also, perhaps we could add a Keyword view to the author/thread/date 
views.  There are many ways that this could look, but one is something like 
the author view, but with keywords instead of authors.    (015)

A problem with this approach is that data would not be accumulated month to 
month, as it currently isN'T :)  I don't know what it would take to get 
away from this, maybe setting periods for each keyword and breaking the 
data down that way.  I am not even sure if the breakdown is necessary.  My 
guess is that it's only there now for practical purposes, and not out of 
necessity.    (016)

==========    (017)

Ok?  So the actionable items are to come up with a list of basic keywords, 
decide on the multi-word format, and figure out how to implement this type 
of system.  I think that from a technical standpoint this problem is not 
too bad at all.  And it would be a big help in managing the 200 or so 
messages that we receive each month.  Who besides Rod could remember all 
that :)    (018)

--Alex    (019)