
[ba-ohs-talk] Spam filtering tools


http://razor.sourceforge.net/    (01)

"Vipul's Razor is a distributed, collaborative, spam detection and 
filtering network. Through user contribution, Razor establishes a 
distributed and constantly updating catalogue of spam in propagation that 
is consulted by email clients to filter out known spam. Detection is done 
with statistical and randomized signatures that efficiently spot mutating 
spam content. User input is validated through reputation assignments based 
on consensus on report and revoke assertions which in turn is used for 
computing confidence values associated with individual signatures."    (02)

There is an enormous amount of technology in this project that, I think, 
has broader implications and applicability in an OHS environment. For instance:    (03)

"The Razor v2 protocol has been completely redesigned. The new protocol is 
based on exchange of Structured Information Strings, that are similar to 
URIs and can be parsed with URI decoding libraries. v2 protocol supports 
Pipelining, which means Razor Agents can keep a connection open with server 
to eliminate the latency introduced by TCP 3-way handshake and 4-way 
breakdown for every connection. The new protocol semantics allow seamless 
introduction of new signature schemes. Razor v2 protocol specification will 
be available shortly"    (04)
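
To make the "parsed with URI decoding libraries" point concrete, here is a rough Python sketch. The v2 specification is not quoted above, so the field names (action, sig), the key=value&key=value layout, and the one-message-per-line framing are all my own assumptions for illustration, not the real Razor wire format.

```python
from urllib.parse import parse_qsl, urlencode


def encode_sis(fields):
    """Serialize a dict as a URI-style query string (assumed layout)."""
    return urlencode(fields)


def decode_sis(line):
    """Parse one URI-style string back into a dict with a stock URI decoder."""
    return dict(parse_qsl(line, keep_blank_values=True))


def decode_pipelined(buffer):
    """Split several messages read from one open connection.

    Assumes one message per line; the real framing may differ.
    """
    return [decode_sis(line) for line in buffer.splitlines() if line]


if __name__ == "__main__":
    # Hypothetical field names, used only for illustration.
    first = encode_sis({"action": "check", "sig": "abc 123"})
    second = encode_sis({"action": "report", "sig": "def 456"})
    for message in decode_pipelined(first + "\r\n" + second + "\r\n"):
        print(message)
```

The appeal for an OHS setting is that no custom parser is needed: any language with a URI library can read and write such strings, and several requests can share one connection instead of paying the TCP setup and teardown cost each time.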

and    (05)

"Nilsimsa is a fuzzy signature algorithm based on statistical models of 
n-gram occurrence in a piece of text. Nilsimsa disregards small changes 
(mutations) in text that are statistically irrelevant. Nilsimsa signatures 
can be compared to determine the similarity (between 0 - 100%) in source 
texts. Razor v2 includes support for Nilsimsa signatures."    (06)
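
The general idea of a fuzzy n-gram signature can be sketched in a few lines of Python. This is NOT the actual Nilsimsa algorithm (which uses its own hash table and windowing scheme); it is a simplified stand-in I wrote to show the principle: count trigrams into 256 buckets, set one bit per bucket that is above average, and compare two signatures by counting matching bits. The function names and the use of MD5 as the bucket hash are my own choices.

```python
import hashlib

BUCKETS = 256  # one bit per bucket, so the signature is 256 bits wide


def trigram_signature(text):
    """Return a 256-bit integer signature built from trigram counts."""
    data = text.lower().encode("utf-8")
    counts = [0] * BUCKETS
    for i in range(len(data) - 2):
        # The first byte of the MD5 digest picks a bucket in 0..255.
        bucket = hashlib.md5(data[i:i + 3]).digest()[0]
        counts[bucket] += 1
    total = max(len(data) - 2, 0)
    threshold = total / BUCKETS  # average count per bucket
    signature = 0
    for index, count in enumerate(counts):
        if count > threshold:
            signature |= 1 << index
    return signature


def similarity(sig_a, sig_b):
    """Percentage (0 to 100) of signature bits on which the two agree."""
    differing = bin(sig_a ^ sig_b).count("1")
    return 100.0 * (BUCKETS - differing) / BUCKETS


if __name__ == "__main__":
    original = "Buy cheap watches now, limited offer, click here today"
    mutated = "Buy ch3ap watches now, limited offer, click here today"
    print(similarity(trigram_signature(original), trigram_signature(mutated)))
```

A one-character mutation only disturbs the three trigrams that contain it, so at most a handful of bits flip and the two signatures stay close. That is the property that lets a filter recognize spam that has been slightly altered between mailings.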