No urls to fetch

Volkan Ebil
Hi,

I have set up Nutch and Hadoop successfully. start.sh and stop.sh run without problems.

I created a directory named urls containing a txt file as the seed list. Then I ran the command

bin/hadoop dfs -put urls urls

which worked. I checked the listing with the command

bin/hadoop dfs -ls

After that I edited crawl-urlfilter.txt, nutch-site.xml, hadoop-site.xml and other configuration files.

Finally I ran the bin/nutch crawl command, but it fails with this error:

No urls to fetch check your filter and seed list

I have inspected the contents of the webdb with the command readdb -stats. There is no problem at the generate or inject steps. I am sure there is no problem in crawl-urlfilter or the other configuration XML files.

Does anyone know what the problem might be?

Thanks in advance.

Re: No urls to fetch

Dennis Kubes-2
The most common cause is not setting the agent name in the
nutch-site.xml file. First, check the log files for the task and see
whether any errors are occurring. It would also help to see more of your
configuration for crawl-urlfilter and nutch-site.
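
For reference, setting the agent name might look roughly like the fragment below. This is only a minimal sketch: it assumes the property in question is http.agent.name, the value MyTestCrawler is a made-up placeholder, and your Nutch version may expect other http.agent.* properties to be filled in as well.

```xml
<?xml version="1.0"?>
<!-- Sketch of conf/nutch-site.xml; merge into your existing file rather than replacing it. -->
<configuration>
  <property>
    <name>http.agent.name</name>
    <!-- Placeholder value (assumption): use a name that identifies your own crawler. -->
    <value>MyTestCrawler</value>
  </property>
</configuration>
```

If the value is left empty, the fetcher may refuse to fetch anything, so it is worth ruling this out before looking at the seed list and URL filters.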

Dennis
