Substring URLFilter using Boyer-Moore

Paul Dhaliwal
I am looking for an easier-to-use URLFilter implementation, preferably a
substring URLFilter that uses something as good as the Boyer-Moore algorithm.

I know RegexURLFilter is there, but I want to make things easier for myself
and for people who don't know much about regular expressions.
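
To make the idea concrete, here is a rough sketch of the kind of filter I
have in mind. It uses the simplified Horspool variant of the Boyer-Moore
string search, not the full algorithm, and it does not implement the real
URLFilter plugin interface. The class name SubstringUrlFilterSketch, the
list of blocked substrings, and the convention of returning null for a
rejected URL are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class, not part of any existing project:
// rejects a URL if it contains any of the configured substrings.
public class SubstringUrlFilterSketch {

    /** One blocked substring plus its Horspool bad-character shift table. */
    private static final class Needle {
        final char[] chars;
        final Map<Character, Integer> shift = new HashMap<>();

        Needle(String s) {
            if (s.isEmpty()) {
                throw new IllegalArgumentException("empty substring");
            }
            chars = s.toCharArray();
            // For every character except the last one, store its distance
            // to the end of the pattern; later occurrences overwrite
            // earlier ones, which keeps the smallest (safe) shift.
            for (int i = 0; i < chars.length - 1; i++) {
                shift.put(chars[i], chars.length - 1 - i);
            }
        }

        boolean occursIn(String text) {
            int m = chars.length;
            int pos = 0;
            while (pos + m <= text.length()) {
                // Compare right to left at the current alignment.
                int j = m - 1;
                while (j >= 0 && text.charAt(pos + j) == chars[j]) {
                    j--;
                }
                if (j < 0) {
                    return true;
                }
                // Shift by the table entry for the text character aligned
                // with the last pattern position (full length if absent).
                pos += shift.getOrDefault(text.charAt(pos + m - 1), m);
            }
            return false;
        }
    }

    private final List<Needle> needles = new ArrayList<>();

    public SubstringUrlFilterSketch(List<String> blockedSubstrings) {
        for (String s : blockedSubstrings) {
            needles.add(new Needle(s));
        }
    }

    /** Returns the URL unchanged if accepted, or null if rejected (assumed convention). */
    public String filter(String url) {
        for (Needle n : needles) {
            if (n.occursIn(url)) {
                return null;
            }
        }
        return url;
    }

    public static void main(String[] args) {
        SubstringUrlFilterSketch f = new SubstringUrlFilterSketch(
                Arrays.asList("/cgi-bin/", "sessionid="));
        System.out.println(f.filter("http://example.com/index.html"));
        System.out.println(f.filter("http://example.com/cgi-bin/search"));
    }
}
```

Each substring is searched separately here, so the cost grows with the
number of substrings; whether this actually beats RegexURLFilter would have
to be measured.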

I read this paper:
http://www-128.ibm.com/developerworks/java/library/j-text-searching.html. It
seems like a good technique, but it also looks like the technique using
Collators may be patented?

The Freenet project seems to have a Boyer-Moore implementation:
http://www.koders.com/java/fid129780C8224463CFCC5167D9010794BAD50894EB.aspx.
I don't know the implications of using this code, as it doesn't seem to be
the Freenet project code itself.

Can anyone suggest a good direction I should follow for this? I wouldn't
mind contributing it back to the project, especially if this substring
URLFilter is faster than RegexURLFilter in some cases.

Paul