[unrev-II] Preliminary Meeting with Ken Holman

From: Eric Armstrong (eric.armstrong@eng.sun.com)
Date: Mon Nov 13 2000 - 14:05:59 PST

    Several of us had a chance to meet with Ken
    Holman over the weekend. He was brought to
    the party by John Deneen, and he was quite
    happy to meet Doug. He very much wants to
    make whatever contribution he can, which
    pretty much makes him "one of the team".

    Ken is very knowledgeable about XML and related
    disciplines. And he is, or has been, very
    active at OASIS (Organization for the Advancement
    of Structured Information Standards). He is
    looking forward to helping us define an interchange
    standard, and shepherd it through the various
    committees, and so on.

    He also has a remarkable flair for design.
    He picked up a rough sense of what we were about
    in fairly short order, and began making insightful
    observations based on his past design experience.

    Here are some of the technical points he developed
    during the meeting...

    XML Basics
    ----------
      * XPATH is a basic structure-identification
        mechanism

      * XPOINTER uses that representation mechanism,
        and builds on it to add concepts like a
        structure-range (from struct X to struct Y)

      * XSL/XSLT also uses XPATH as part of its
        representation mechanisms

      * XSLT is a transformation mechanism that can
        generate XML, which can then be parsed.

      * XSL is the format-presentation layer. It defines
        a ton of constructs that can be used to specify
        how material prints, or is displayed.

      * RELAX is a very nice schema definition mechanism.
        It is built on a theory-based representation
        that lets you construct DTD *diffs* and DTD
        *unions*. Unions let you modularize DTDs, and
        ensure that a document conforms to the result
        of combining them.

      * SCHEMATRON is an assertion-based validation
        mechanism. Using that mechanism makes it possible
        to validate assertions like "mixed content
        containing text and inline elements occurs only
        before substructure elements, never between or
        after".

        [For me, this one was worth the price of
         admission. It totally solves the XML limitation
         described in my paper on XML Editor Design.]
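
    The mixed-content assertion quoted above can be sketched
    in plain Python. This is not Schematron itself, only a
    hand-written check of the same rule, and it assumes a
    hypothetical vocabulary in which "b" and "i" are the inline
    elements and every other child counts as substructure.

```python
import xml.etree.ElementTree as ET

# Assumption for illustration: these tags are "inline"; any other
# child element is treated as a substructure element.
INLINE = {"b", "i"}

def mixed_content_leads(elem):
    """Return True if text and inline children occur only before
    the first substructure child of elem."""
    seen_structure = False
    for child in elem:
        if child.tag in INLINE:
            if seen_structure:
                return False          # inline element after substructure
        else:
            seen_structure = True
        # ElementTree stores the text that follows a child in child.tail
        if seen_structure and (child.tail or "").strip():
            return False              # text between or after substructure
    return True

def violations(xml_text):
    """List the tags of all elements that break the assertion."""
    root = ET.fromstring(xml_text)
    return [e.tag for e in root.iter() if not mixed_content_leads(e)]

good = "<item>Intro <b>bold</b> text<sub><item>child</item></sub></item>"
bad = "<item>Intro<sub/>trailing text</item>"
print(violations(good), violations(bad))
```

    A real Schematron schema would state the same rule
    declaratively, as an assertion attached to a context.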

    Design Principles
    -----------------
      * Most application designs define an application-
        specific language, and parse that. They tend
        to consider XSLT as an afterthought. To make
        use of it, a different representation is
        parsed, written out as XML, and then reparsed
        into the app.

      * But XSLT can quite easily produce SAX or
        DOM output *directly*. So the kind of design
        Ken recommends uses XSL and a stylesheet
        to process any particular XML data. The result
        becomes SAX events or a DOM in the app, so
        that part of the app doesn't change. But now
        you can process any other variant of the
        XML that encodes the information, simply by
        creating a new stylesheet, without a big
        performance hit -- the result is roughly
        equivalent to having defined that language (or
        any other variant) as the "reference language"
        for the application.
     
      * Ken declared emphatically that DEFINING THE XML
        EARLY ON IS INAPPROPRIATE. He's seen the mistake
        made dozens of times, and counsels his clients
        against it. His take on the matter is that XML
        IS AN INTERCHANGE STANDARD and that the core of
        the application is the services it provides.
        Therefore, the only sequence that works in the
        real world is to define those services, and *then*
        come up with an XML form for the data that needs
        to be interchanged.
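
    The shape of the design in the second point (the app
    consumes SAX events, and a per-variant translation sits in
    front of it) can be sketched as follows. Python's standard
    library has no XSLT processor, so a simple SAX filter that
    renames elements stands in for the stylesheet here. The
    class names, the "title"/"heading" vocabulary, and
    load_titles() are all hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase

class TitleCollector(ContentHandler):
    """The 'application': it only understands <title> elements."""
    def __init__(self):
        super().__init__()
        self.titles = []
        self._buf = None

    def startElement(self, name, attrs):
        if name == "title":
            self._buf = []

    def characters(self, content):
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        if name == "title" and self._buf is not None:
            self.titles.append("".join(self._buf))
            self._buf = None

class RenameFilter(XMLFilterBase):
    """Stand-in for a stylesheet: maps a variant vocabulary onto
    the one the application expects, event by event."""
    def __init__(self, mapping, parent=None):
        super().__init__(parent)
        self._mapping = mapping

    def startElement(self, name, attrs):
        super().startElement(self._mapping.get(name, name), attrs)

    def endElement(self, name):
        super().endElement(self._mapping.get(name, name))

def load_titles(xml_text, mapping=None):
    """Parse xml_text through the filter and return collected titles."""
    app = TitleCollector()
    reader = RenameFilter(mapping or {}, xml.sax.make_parser())
    reader.setContentHandler(app)
    reader.parse(io.BytesIO(xml_text.encode("utf-8")))
    return app.titles

print(load_titles("<doc><title>A</title></doc>"))
print(load_titles("<doc><heading>B</heading></doc>", {"heading": "title"}))
```

    Supporting a new variant means supplying a new mapping;
    TitleCollector itself does not change, which is the property
    Ken was arguing for.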

    OHS Design
    ----------
      * In terms of the OHS, Ken's approach had some
        remarkable implications for the design. Rather
        than attempting to define a DTD for a "normal
        form" OHS document, Ken suggests focusing on
        the services, and building (or at least designing)
        those services. So for example, we need granular
        addressability. And we want it to apply to legacy
        documents. Ok, then, the system requires
        mechanisms for adding addresses to a legacy
        document! The original document continues its
        existence, unchanged. The OHS contains a pointer
        to it, along with a collection of addresses that
        point into it. The "HyperDocument" you view in
        the "HyperScope" is then the product of those
        addresses applied to that document.
        
      * Note that we have *not* defined a DTD for a
        HyperDocument. We have defined functionality.
        Now, when it comes to interchanging data, how
        does that happen? Well, what do you need to
        send? You need to send a pointer to the original
        document, at a minimum -- or possibly the
        document itself if it is inaccessible. And you
        need to send the additional information (like
        the addresses) that is necessary to carry out
        HyperScope functions!

      * Ken's point here is that the XML definition is
        dictated by functional needs -- by what you
        need to transmit to provide the desired services --
        and the resulting XML definition is far removed
        from any sort of "HyperDocument definition" we
        may construct at the outset.

        [Note: From personal experience, I concur
         wholeheartedly. The original stab I took at
         XML syntax for such a document looks nothing
         like the node library I am currently constructing.
         More instructively, none of the last 4 versions
         of that library look very much like any of the
         others.]
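
    To make the "pointer plus addresses" idea concrete, here
    is a minimal sketch. Nothing in it comes from an actual OHS
    design: the class names are hypothetical, and representing
    an address as a pair of character offsets is an assumption
    made purely for illustration.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Address:
    label: str   # hypothetical address label, e.g. "2"
    start: int   # character offset into the legacy text (assumption)
    end: int     # offset one past the last addressed character

@dataclass
class HyperDocument:
    source: str                                    # pointer to the untouched original
    addresses: list = field(default_factory=list)  # overlay kept outside the original

    def resolve(self, text, label):
        """Apply one address to the original text, leaving it unchanged."""
        for a in self.addresses:
            if a.label == label:
                return text[a.start:a.end]
        raise KeyError(label)

    def to_interchange(self):
        """What would need to be sent: the pointer plus the addresses."""
        return {
            "source": self.source,
            "addresses": [(a.label, a.start, a.end) for a in self.addresses],
        }

legacy = "First paragraph.\n\nSecond paragraph."
second = legacy.index("Second")
doc = HyperDocument(
    "file:legacy.txt",
    [Address("1", 0, 16), Address("2", second, len(legacy))],
)
print(doc.resolve(legacy, "2"))
print(doc.to_interchange())
```

    The interchange payload falls out of what the service
    needs, not out of a document schema drawn up beforehand.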

    Topic Maps
    ----------
    Ken also talked about topic maps for a bit.
    (Although I have yet to "get" them, Ken was very big
    on them, and mentioned Jack Park's advocacy several
    times in this context.)

    What I gleaned from our short forays into the
    subject was:
      * Topic maps provide a way of defining the
        semantic content of a structure or, perhaps
        more accurately, they are a way of specifying
        the syntax that is used to represent different
        semantic constructs. (I believe that is
        accurate, although I didn't quite get how
        it works.)
        More info: http://www.topicmaps.org

      * Ken suspects we want to use topic maps to
        define the OHS interchange mechanisms.
        (Again, I don't see how that works, exactly,
         but I suspect that he and Jack will be
         able to arrive at a meeting of the minds.)

      * My one little "aha" on the subject is that
        if XSLT + a stylesheet can be used as the
        input to an application, then if the input
        is defined using a topic map, then anyone
        can use any syntax they want to encode the
        data -- the syntax will be transformed by
        XSLT for use by the application anyway, and
        that translation will be governed by the
        topic map. (I think that is somewhere
        within a Silicon-valley commute of being
        correct, but...)



    This archive was generated by hypermail 2b29 : Mon Nov 13 2000 - 14:16:55 PST