Re: [unrev-II] WebSPHINX

From: Jack Park (jackpark@thinkalong.com)
Date: Wed Oct 10 2001 - 10:38:57 PDT

    I just ran the applet on the website itself (the subgraph).
    It ran for 9 minutes 17 seconds until it ran out of links to check.
    It used a peak of 26994 KB of memory.
    It visited 107 web pages and checked 2502 links.

    Deepest nesting level I can see on the outline form is 5.
    The graph produced looks like a bunch of wiggling blobs. Double click on a
    node and, like TouchGraph, it takes you there.

    I just started the applet again, this time on the server, not the website
    subtree. I have no idea where that will go.

    Cheers
    Jack

    At 09:40 AM 10/10/2001 -0700, you wrote:
    >http://www-2.cs.cmu.edu/~rcm/websphinx/
    >
    >At 09:36 AM 10/10/2001 -0700, you wrote:
    > >"WebSPHINX (Website-Specific Processors for HTML INformation eXtraction)
    > >is a Java class library and interactive development environment for Web
    > >crawlers. A Web crawler (also called a robot or spider) is a program that
    > >browses and processes Web pages automatically."
    > >
    > >A building block for complex systems, no?
    > >
    > >--
    > >-- Grant Bowman <grantbow@svpal.org>


    This archive was generated by hypermail 2.0.0 : Wed Oct 10 2001 - 10:28:51 PDT