Is it possible to add new urls while nutch crawler is still running?

Is it possible to add new urls while nutch crawler is still running?

riyal

Hi,

Is there any way to add new URLs for crawling while the Nutch crawler is still running?

Thanks in advance.

--monirul



     

Re: Is it possible to add new urls while nutch crawler is still running?

Dennis Kubes-2
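No, not into a segment that is already being fetched. URLs are pulled
from the crawldb to create the segments to crawl, so a segment's URL
list is fixed once it has been generated. You can, however, add URLs to
the crawldb while the fetcher is running, and you can generate new
segments to crawl while the fetcher is running.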

The generate tool has options that control whether URLs already placed
in a generated segment can be selected again from the crawldb before
the updatedb tool has updated those URLs in the crawldb.
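
As a rough sketch of the workflow described above, the Python helper below
only builds the command lines; it does not run anything. It assumes the
standard bin/nutch sub-commands (inject, generate, fetch, updatedb) and the
-topN option of generate. The paths (crawl/crawldb, crawl/segments, new_urls),
the function names, and the default top_n value are hypothetical and only
there for illustration.

```python
from typing import List


def add_urls_commands(crawldb: str, segments_dir: str, new_urls_dir: str,
                      top_n: int = 1000,
                      nutch: str = "bin/nutch") -> List[List[str]]:
    """Build the commands that add URLs and cut a new segment.

    These can be issued while a fetch of an older segment is still running,
    because they work on the crawldb, not on the segment being fetched.
    """
    return [
        # Add the new seed URLs (a directory of URL list files) to the crawldb.
        [nutch, "inject", crawldb, new_urls_dir],
        # Generate a fresh segment from the crawldb for a later fetch.
        [nutch, "generate", crawldb, segments_dir, "-topN", str(top_n)],
    ]


def fetch_and_update_commands(crawldb: str, segment: str,
                              nutch: str = "bin/nutch") -> List[List[str]]:
    """Build the commands that fetch one segment and fold it back into the crawldb.

    If the fetcher is configured not to parse, a separate parse step is
    needed between these two commands.
    """
    return [
        [nutch, "fetch", segment],
        [nutch, "updatedb", crawldb, segment],
    ]


if __name__ == "__main__":
    # Print the command lines instead of running them (hypothetical paths).
    for cmd in add_urls_commands("crawl/crawldb", "crawl/segments", "new_urls"):
        print(" ".join(cmd))
```

Whether that second generate can pick URLs that are already sitting in the
segment being fetched depends on the generate options mentioned above. In
Nutch 1.x I believe the relevant property is generate.update.crawldb, but
check nutch-default.xml for your version.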

Dennis


Mohammad Monirul Hoque wrote:

> Hi,
>
> Is there any way to add new URLs for crawling while the Nutch crawler is still running?
>
> Thanks in advance.
>
> --monirul