
Re: [ba-ohs-talk] bootstrap list message content & purple numbers

On Mon, 10 Dec 2001, Peter Jones wrote:    (01)

> I've just hacked a desperate perl script (yep, I need the practice) that
> accesses the HTML archives for
> ba-unrev-talk, in the hopes of being able to add some interesting metadata
> to the backlink db... eventually.    (02)

Now _this_ is the kind of message I like to see! :-)    (03)

Let me save you some trouble, Peter (and anybody else who wants to hack on
this).  The code I wrote to do the purple numbers and backlink extraction
is a filter for MHonArc (http://www.mhonarc.org/).  I've been meaning to
release the code; I've just been lazy.    (04)

If you want to do this kind of hacking, it's better to start with the
MHonArc filter.  It'll give you nice, programmatic access to the e-mail
metadata; no need to parse it back out of ugly, serialized HTML.  I'll be
happy to step you through the code.  MHonArc is nice and powerful, but its
internals leave something to be desired.    (05)

> And then I noticed something incidentally potentially irksome about purple
> numbering in this message
> http://www.bootstrap.org/lists/ba-unrev-talk/0111/msg00014.html
> Lots of sentences and paragraphs, but only 1 purple number because the '>'s
> cloud the issue.    (06)

That is correct.  A consequence of my least-effort algorithm. :-(    (07)
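
To make the failure mode concrete, here is a minimal Python sketch of a
numbering pass that splits only on blank lines.  It is an assumption about
what a least-effort approach might look like, not the actual MHonArc filter;
the function name `number_paragraphs` and the sample text are made up for
illustration.

```python
import re


def number_paragraphs(body: str) -> list[str]:
    """Split a message body on blank lines and tag each chunk with a number.

    Assumption: a "paragraph" is any run of lines separated by a truly
    blank line. The "(0N)" tag format imitates the tags seen in the archive.
    """
    chunks = [c for c in re.split(r"\n[ \t]*\n", body.strip()) if c.strip()]
    return [f"{chunk}    (0{i})" for i, chunk in enumerate(chunks, start=1)]


# In quoted text, the blank lines between paragraphs usually become lines
# holding just ">", so the splitter never sees a paragraph boundary.
quoted = "> First quoted paragraph.\n>\n> Second quoted paragraph.\n>\n> Third."
plain = "First paragraph.\n\nSecond paragraph."

print(len(number_paragraphs(quoted)))  # one chunk for the whole quote
print(len(number_paragraphs(plain)))   # one chunk per paragraph
```

Under that assumption, a quoted block of any length collapses into a single
numbered chunk, which matches what the message linked above shows.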

> Would it be better to replace >s with indents in the HTML prior to adding
> purple numbering?    (08)

Not sure I understand the suggestion.  Are you suggesting not purple
numbering these quotes at all?    (09)
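
One possible reading of the suggestion, sketched in Python: strip the
leading '>' markers, remember the quote depth, and treat a bare '>' line as a
paragraph break, so that each quoted paragraph could get its own number and
be rendered as an indent.  This is only a guess at the intent; the names
`QUOTE_PREFIX` and `split_quoted_paragraphs` are hypothetical and none of this
comes from the MHonArc filter.

```python
import re

# One or more ">" markers at the start of a line, each optionally
# followed by a single space or tab.
QUOTE_PREFIX = re.compile(r"^(?:>[ \t]?)+")


def split_quoted_paragraphs(body: str) -> list[tuple[int, str]]:
    """Return (quote_depth, paragraph_text) pairs for a message body.

    A paragraph ends at a line that is empty once its quote markers are
    removed, or where the quote depth changes.
    """
    paragraphs: list[tuple[int, str]] = []
    current: list[str] = []
    current_depth = 0
    for line in body.splitlines():
        match = QUOTE_PREFIX.match(line)
        prefix = match.group(0) if match else ""
        depth = prefix.count(">")
        text = line[len(prefix):].strip()
        # Flush the paragraph in progress on a break or a depth change.
        if current and (not text or depth != current_depth):
            paragraphs.append((current_depth, " ".join(current)))
            current = []
        if text:
            current.append(text)
            current_depth = depth
    if current:
        paragraphs.append((current_depth, " ".join(current)))
    return paragraphs


sample = "> First.\n>\n> Second line one\n> line two\n\nReply."
for depth, text in split_quoted_paragraphs(sample):
    # Render quote depth as indentation instead of ">" markers.
    print("    " * depth + text)
```

Each pair returned could then be numbered on its own, with the depth used
for indentation in the HTML rather than literal '>' characters.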

-Eugene    (010)

+=== Eugene Eric Kim ===== eekim@eekim.com ===== http://www.eekim.com/ ===+
|       "Writer's block is a fancy term made up by whiners so they        |
+=====  can have an excuse to drink alcohol."  --Steve Martin  ===========+    (011)