[ba-ohs-talk] Greenstone information management


http://www.nzdl.org/cgi-bin/library    (01)

I posted this URL earlier.  Now I've downloaded the Greenstone software 
(for my Wintel box).  I had to download and install ActivePerl before 
Greenstone would work properly.    (02)

What is Greenstone?
It is licensed under the GPL.  (That may not be all that bad, because it 
has a CORBA interface, so we can call it from non-GPL software.)    (03)

It's an engine that indexes collections of information.  It can handle:
	HTML
	Word
	PostScript
	PDF
	email
	other formats    (04)

I first downloaded the base system, MG (Managing Gigabytes), and found that 
I couldn't compile it with Cygwin.
So I just installed the entire Greenstone package, which included 
everything except a Perl interpreter.    (05)

Greenstone is a web-based system.  It will run with Apache or another web 
server, but the download includes a server of its own.  It found IE 5 and 
set that as its default browser.  Everything is done from a browser.  
Installation was quite easy with the exe file that downloaded, though I 
must admit that Norton AntiVirus was very unhappy with it; I had to 
authorize the install, much to Norton's chagrin.    (06)

What does Greenstone do?
	Greenstone
		imports/converts files from local/web/ftp sources to an internal HTML format
		indexes the HTML-formatted documents
		saves internal files compressed
	You are able to
		create new collections
		edit/add to/delete existing collections
		manage users, configurations, etc
		browse collections
	You are also able to
		add new file types by creating Perl scripts    (07)
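
To make the import/index/compress steps above concrete, here is a rough 
sketch in Python.  It is not Greenstone's code and does not use its 
interfaces; the converter table, the function names, and the crude tag 
stripping are all my own assumptions, picked only to show the shape of 
such a pipeline.

```python
import os
import re
import zlib
from collections import defaultdict
from html import escape


def text_to_html(raw):
    """Hypothetical converter: wrap plain text in a minimal HTML page."""
    return "<html><body><pre>" + escape(raw) + "</pre></body></html>"


# Hypothetical converter table, one entry per file extension.
# A new file type is supported by adding one more entry here.
CONVERTERS = {
    ".txt": text_to_html,
    ".html": lambda raw: raw,  # already in the internal format
}


def import_document(name, raw):
    """Convert one document to the internal HTML form, chosen by extension."""
    ext = os.path.splitext(name)[1].lower()
    if ext not in CONVERTERS:
        raise ValueError(f"no converter registered for {name!r}")
    return CONVERTERS[ext](raw)


def index_documents(docs):
    """Build an inverted index mapping each word to the documents using it."""
    index = defaultdict(set)
    for name, html_text in docs.items():
        text = re.sub(r"<[^>]+>", " ", html_text)  # crude tag stripping
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def store(docs):
    """Keep the converted documents compressed, keyed by name."""
    return {name: zlib.compress(text.encode("utf-8"))
            for name, text in docs.items()}
```

Supporting another file type in this sketch means registering one more 
function in CONVERTERS, which is roughly the role the Perl scripts play 
in Greenstone.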

Why is Greenstone interesting to an OHS/DKR crowd?
In one sense, it's already a kind of HyperScope.  It reads almost all kinds 
of file types, and if it can't read one, you can fix it so it does.  And it 
is Web based.  I run mine locally, but if you are on a network, the 
installer detects that and makes it Web-enabled.  You have the option of 
declaring collections private or public.    (08)

In another sense (actually, the same sense as above), it's a kind of Grove 
engine, because it has adaptors (plugins) that give it the ability (though 
not perfectly; see below) to handle almost all important file types.    (09)

Imperfection exists because of the many different versions of Word file 
formats and so forth.  Imperfection may also exist, as evidenced by the 
following:
	My installation has sucked up one PDF file and my entire Eudora in.mbx.  I 
know this because it says the import was successful.  But I'm suspicious, 
because I am unable to browse any information contained in those items.  
Could be me (classic newbie); I've just subscribed to the email list.  Time 
will tell.    (010)

In any case, given that there is supposed to be a CORBA interface (I 
haven't seen it yet), in theory, we can just allow Greenstone to suck up 
entire directories (it doesn't do this automatically, yet), and make them 
available to datamining tools in an OHS environment.    (011)
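
Until Greenstone sucks up directories on its own, something has to walk 
the tree and pick out the files worth importing.  Here is a minimal sketch 
in Python of that step only.  The extension list and the function name are 
my assumptions, loosely based on the file types listed earlier; handing the 
files to Greenstone is left out because I haven't seen that interface yet.

```python
import os

# Assumed extensions for the file types mentioned in this post.
KNOWN_EXTENSIONS = {".html", ".htm", ".doc", ".ps", ".pdf", ".mbx"}


def find_importable(root):
    """Return the sorted paths under root whose extension looks importable."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            ext = os.path.splitext(filename)[1].lower()
            if ext in KNOWN_EXTENSIONS:
                found.append(os.path.join(dirpath, filename))
    return sorted(found)
```

The resulting list could then be fed to whatever import mechanism turns 
out to be available.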

Imagine having it suck up D3E documents and the like.    (012)

I would like to see as many ohs-talkers as possible download and begin to 
experiment with Greenstone.  More software certainly needs to be developed 
for it before it is usable as a foundation in an OHS environment.    (013)

Cheers
Jack    (014)