DomainUrlFilter with 10K domains?

Otis Gospodnetic-2-2

How well does Nutch work for vertical, yet relatively wide crawls?  By
"vertical, yet relatively wide" I mean a crawl that must be limited to a
specific set of domains, where that set is fairly large: about 10K domains.

Is the DomainUrlFilter capable of dealing with a set of 10K domains?
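
To make the scale concrete, here is a minimal sketch of a set-based domain whitelist. It is not Nutch's code, and it assumes the filter can be modeled as a hash-set lookup on the URL's host and its parent domains. The names build_allowed and is_allowed and the siteN.example domains are hypothetical, for illustration only.

```python
from urllib.parse import urlsplit


def build_allowed(domains):
    """Normalize the allowed domains and store them in a set for fast lookups."""
    return {d.strip().lower().lstrip(".") for d in domains if d.strip()}


def is_allowed(url, allowed):
    """Return True if the URL's host, or any parent domain of it, is in the set."""
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        return False
    labels = host.split(".")
    # Try the full host first, then each shorter suffix:
    # www.site42.example -> site42.example -> example
    return any(".".join(labels[i:]) in allowed for i in range(len(labels)))


# Hypothetical whitelist of 10,000 placeholder domains.
allowed = build_allowed(f"site{i}.example" for i in range(10_000))
```

Under that assumption, the work per URL depends on the number of labels in the host name rather than on the size of the list, so 10K entries should mostly cost memory for the set.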

Sematext :: Solr - Lucene - Nutch
Lucene ecosystem search