Re: [ba-ohs-talk] Re: Semantic web meta data


Johannes Ernst wrote:    (01)

> At 13:59 -0700 2002/08/02, Eric Armstrong wrote:
>
> >That tends to argue for natural-language queries. Otherwise I'm
> >going to have a huge ontology to wade through to figure out how
> >to express my question.
>
> That assumes that the only option is a "horizontal" search engine
> like Google. "Horizontal" meaning here: regardless what domain it is
> all about, same search field. I agree that will never be usable to
> anyone except for the most hard-core techies, some of the time.    (02)

I think it is fair to assume that all searches start out there. After
all, I can't "drill down" until I find a site that covers the area I'm
searching on.    (03)

Example: Recently I've been doing a lot of searching on "cartilage
restoration & regeneration" to see if I can get my knees back. Found
some folks in Miami and Chicago who are having success -- but it
turns out they are restoring *articular* cartilage (covers the end of
the bone) rather than the meniscus (spongy pad). The process of
searching was a matter, first, of finding the relevant sites and, second,
of coming up to speed on the relevant ontology.    (04)

> However, if one has context -- eg I'm visiting a health website, and
> I'm in the "advice for pre-school kids" section already -- then
> assembling the right query with the respective ontology may just be a
> bunch of multiple-choice questions and may look identical to web user
> interfaces today.    (05)

But finding the relevant site (or set of sites) is the hardest part of
the problem. And for many large-scale sites (Apple, Sun, MS), being
there is a lot like being "on the web" -- there is a lot of stuff.    (06)

> Otherwise we'd have to embark on a project similar to the making of
> the Oxford English dictionary, which took generations, except that
> this would be much harder I'd think, and the results would, based on
> current knowledge, essentially be unusable.    (07)

I'm worried that the problem of tagging may actually be on that scale.
The Topic Maps model proudly points to the provision and requirement
for eliminating redundancies when TMs are merged, but that seems to
me to push the problem down one level -- to requiring a
non-redundant set of "subject URIs".    (08)

I have to admit to being enthralled by the prospect of unnamed topic
groupings. I think they could go a long way towards query reduction.    (09)

I see several mechanisms for identifying such groups. One is network
analysis. Another is semantic analysis. A third is a combination.    (010)

Network analysis groups things that link to each other. So many unrev
pages would form a "group". Semantic analysis might be able to define
and find broader categories like "marketing hype", "technical paper",
"tutorial". A combination could be used in some cases. For example,
if there were a site that listed tutorials of all kinds, then a link from
that site could be used to categorize material.    (011)
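
To make the network-analysis idea a bit more concrete, here is a rough
Python sketch. It treats "things that link to each other" as connected
components of the link graph, which is only one possible reading, and
it labels anything a listing site points to. All page names, the
"tutorials/index" site, and both function names are made up for
illustration.

```python
from collections import defaultdict


def group_by_links(links):
    """Group pages into unnamed clusters: pages joined by any chain of links.

    `links` is an iterable of (source, target) pairs. Link direction is
    ignored here, which is a simplifying assumption.
    """
    parent = {}

    def find(page):
        # Union-find lookup with path halving.
        parent.setdefault(page, page)
        while parent[page] != page:
            parent[page] = parent[parent[page]]
            page = parent[page]
        return page

    for source, target in links:
        parent[find(source)] = find(target)

    groups = defaultdict(set)
    for page in parent:
        groups[find(page)].add(page)
    return sorted(sorted(group) for group in groups.values())


def label_from_directory(links, directory, label):
    """Tag every page that a known listing site links to with `label`."""
    return {target: label for source, target in links if source == directory}


# Hypothetical link data, invented for this example.
links = [
    ("unrev/a", "unrev/b"),
    ("unrev/b", "unrev/c"),
    ("tutorials/index", "xml/intro"),
]

print(group_by_links(links))
print(label_from_directory(links, "tutorials/index", "tutorial"))
```

The groups that come out have no names, just members, which is the
point: a query could be narrowed to "pages in the same group as this
one" without anyone having agreed on a tag first.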

Or maybe the tagging problem is already sorting itself out? Don't
know. Have to rummage around some more, I expect.    (012)