Re: [unrev-II] Re: HtmlDOM -- XML -- Xmail

From: Eric Armstrong (eric.armstrong@eng.sun.com)
Date: Tue Mar 14 2000 - 16:41:38 PST

    Jeff Miller wrote:

    > While I agree that HTML does not meet our needs, I don't think that a
    > conversion tool is out of the question. Most people I know who hand-code
    > HTML documents do so without the <p><h3>wwww</h3> weirdness that you
    > were talking about, and composer-style editors generate known styles.
    > (Of course, it's only as good as the user at the keyboard.) Let's pick
    > a number: 80% of the HTML out there could be converted with a
    > conversion tool and then cleaned up, leaving the misinterpreted and
    > plain ugly for the lucky human.

    I understand that most people don't. I don't, which is why those cases
    never occurred to me. But the program you write has to be prepared for a
    number of eventualities. Basically, you can't depend on finding an </h3>
    as a terminator. So you have to figure out what to do if you see <h2>,
    <h1>, <h4>, <h5>, <table>, or one of the other possible terminators.
    (I'm not really certain that h3 doesn't require a terminator, but I know
    that a <dd> entry, for example, can be terminated by </dd>, <dt>, <dd>,
    or </dl>.) As the number of possible combinations goes up, program
    complexity rises dramatically -- a situation that is no doubt
    responsible for XML's insistence that every element be terminated.
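
    To make the bookkeeping concrete, here is a minimal sketch (in Python,
    using the standard html.parser module) of a converter that writes out
    every implied terminator explicitly. The tables CLOSED_BY_START and
    CLOSED_BY_END, the class ExplicitCloser, and the function make_explicit
    are hypothetical names invented for illustration, and the tables cover
    only a handful of cases rather than the full HTML rules.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

# Assumption: a deliberately tiny rule table, not the complete HTML rules.
# Open element -> start tags that implicitly terminate it.
CLOSED_BY_START = {
    "dd": {"dd", "dt"},
    "dt": {"dd", "dt"},
    "li": {"li"},
    "p": HEADINGS | {"p", "table", "dl"},
}
# Open element -> end tags of a parent that implicitly terminate it.
CLOSED_BY_END = {"dd": {"dl"}, "dt": {"dl"}, "li": {"ul", "ol"}}


class ExplicitCloser(HTMLParser):
    """Re-emit markup with an explicit end tag for every element."""

    def __init__(self):
        super().__init__()
        self.stack = []  # names of currently open elements
        self.out = []    # output fragments

    def _close_top(self):
        self.out.append(f"</{self.stack.pop()}>")

    def handle_starttag(self, tag, attrs):
        # A new start tag may terminate the element that is still open.
        if self.stack and tag in CLOSED_BY_START.get(self.stack[-1], ()):
            self._close_top()
        self.stack.append(tag)
        self.out.append(f"<{tag}>")  # attributes dropped for brevity

    def handle_endtag(self, tag):
        # A parent's end tag may terminate an open child first.
        if self.stack and tag in CLOSED_BY_END.get(self.stack[-1], ()):
            self._close_top()
        if self.stack and self.stack[-1] == tag:
            self._close_top()
        # Any other end tag is silently ignored in this sketch.

    def handle_data(self, data):
        self.out.append(data)


def make_explicit(html):
    parser = ExplicitCloser()
    parser.feed(html)
    parser.close()
    while parser.stack:  # close whatever is still open at end of input
        parser._close_top()
    return "".join(parser.out)
```

    With these rules, make_explicit("<dl><dt>a<dd>b<dd>c</dl>") returns
    "<dl><dt>a</dt><dd>b</dd><dd>c</dd></dl>". Every additional element
    with optional terminators needs its own table entries, and empty
    elements such as <br> would need special handling that is omitted
    here, which is where the complexity starts to climb.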

    When you see <p>...<h3>, for example, you could simply assume that the
    <h3> starts a new header. Right? But what do you do when </p> *is*
    present, as in <p>...<h3>...</h3>...</p>? Do you simply ignore the </p>?
    Do you throw an error that says the document is not well constructed? Or
    do you convert the h3 to a font tag? Ignoring the </p> seems like the
    simplest course, but then where do you put the text, and how does it
    relate to what comes after?
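
    The choices for the <p>...<h3>...</h3>...</p> case can be sketched as
    policies on a small parser. This is only an illustration: the class
    ParagraphRepair, the exception StrayEndTag, the function repair, and
    the policy names "ignore", "wrap", and "strict" are all hypothetical,
    and the font-tag conversion is not attempted. The sketch is in Python
    and uses the standard html.parser module.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class StrayEndTag(ValueError):
    """Raised in strict mode when an end tag matches no open element."""


class ParagraphRepair(HTMLParser):
    """Hypothetical repair policies for <p>...<h3>...</h3>...</p>.

    "ignore": drop the stray </p>; the trailing text stays bare.
    "wrap":   put the trailing text in a new <p>, which </p> then closes.
    "strict": report the document as not well constructed.
    """

    def __init__(self, policy="ignore"):
        super().__init__()
        self.policy = policy
        self.stack = []  # names of currently open elements
        self.out = []    # output fragments

    def _open(self, tag):
        self.stack.append(tag)
        self.out.append(f"<{tag}>")  # attributes dropped for brevity

    def _close(self):
        self.out.append(f"</{self.stack.pop()}>")

    def handle_starttag(self, tag, attrs):
        # Assume a heading or a new <p> terminates an open <p>.
        if tag in HEADINGS | {"p"} and self.stack and self.stack[-1] == "p":
            self._close()
        self._open(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self._close()
        elif self.policy == "strict":
            raise StrayEndTag(f"unexpected </{tag}>")
        # Otherwise the stray end tag is dropped.

    def handle_data(self, data):
        # In "wrap" mode, loose text outside any element gets its own <p>.
        if self.policy == "wrap" and not self.stack and data.strip():
            self._open("p")
        self.out.append(data)


def repair(html, policy="ignore"):
    parser = ParagraphRepair(policy)
    parser.feed(html)
    parser.close()
    while parser.stack:  # close whatever is still open at end of input
        parser._close()
    return "".join(parser.out)
```

    For the input "<p>intro<h3>Title</h3>tail</p>", the "ignore" policy
    yields "<p>intro</p><h3>Title</h3>tail", the "wrap" policy yields
    "<p>intro</p><h3>Title</h3><p>tail</p>", and "strict" raises
    StrayEndTag. None of the three is obviously right, which is the point:
    each answers "where do you put the text?" differently.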

    The problems are solvable -- all problems are. As a matter of
    engineering, though, does it make sense to pour a lot of time and energy
    into solving them, or does it make more sense to skip forward one
    generation and take advantage of the structuring that XML provides?

    If XML does *not* become the lingua franca of the Web, it would seem
    that solving the HTML problems would be worthwhile. But if it *does*
    become the standard, then solving HTML's problems is so much wasted
    energy.

    My crystal ball is dusty. How's yours?
    :_)


    This archive was generated by hypermail 2b29 : Tue Mar 14 2000 - 16:48:36 PST