Log Newly Found Urls - Patch

Rod Taylor-2
Being able to see which new URLs are being added to the database is
more useful than seeing which ones are being fetched.

When changing the regex-urlfilter or regex-normalize files, logging the
newly added URLs gives instantaneous feedback, whereas the URLs being
fetched may still include old junk for some time after the edits took
place.
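
For anyone reading without the attachment, the idea can be sketched
roughly as below. This is only an illustration, not the contents of the
attached patch, and it does not use the crawler's real classes: the
names NewUrlLogSketch and logNewUrls, and the in-memory set standing in
for the database, are all made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class NewUrlLogSketch {

    /**
     * Adds each discovered URL to the set of known URLs and logs only
     * those that were not already present. Returns the newly added URLs
     * in the order they were first seen. The known set is modified.
     */
    public static List<String> logNewUrls(Set<String> known, List<String> discovered) {
        List<String> added = new ArrayList<>();
        for (String url : discovered) {
            // Set.add returns true only when the URL was not already known.
            if (known.add(url)) {
                System.out.println("new url: " + url);
                added.add(url);
            }
        }
        return added;
    }
}
```

With logging like this, a URL that the edited filter should reject
either shows up in the log immediately or it does not, without waiting
for the fetch queue to drain.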

Rod Taylor <[hidden email]>

Attachment: CrawlDbReducer.java.patch (1K)