Behavior of fetcher.follow.outlinks


Behavior of fetcher.follow.outlinks

jjmendes
If you use fetcher.follow.outlinks.depth,
fetcher.follow.outlinks.num.links and
fetcher.follow.outlinks.depth.divisor to fetch an additional link from
every page you crawl, how exactly does Nutch choose that link?
Randomly? Or is there any specific behavior behind the choice?

Thanks,

JJAM

 

RE: Behavior of fetcher.follow.outlinks

Markus Jelsma-2
Hello - it can depend on the HTML parser you use, but the outlinks are usually taken in the same order as they appear on the web page.

Regards,
Markus
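
To make "same order as they appear on the web page" concrete, here is a minimal sketch in Python. It is not Nutch code and does not reproduce how Nutch applies fetcher.follow.outlinks.depth or fetcher.follow.outlinks.depth.divisor; the names OutlinkCollector and first_outlinks are hypothetical, and the num_links argument only stands in for the role of fetcher.follow.outlinks.num.links. It simply assumes the parser reports links in document order and keeps the first few.

```python
from html.parser import HTMLParser


class OutlinkCollector(HTMLParser):
    """Hypothetical helper: records href values of <a> tags in parse order."""

    def __init__(self):
        super().__init__()
        self.outlinks = []

    def handle_starttag(self, tag, attrs):
        # html.parser calls this once per start tag, in document order.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.outlinks.append(href)


def first_outlinks(html, num_links):
    """Return the first num_links outlinks as they appear in the page."""
    collector = OutlinkCollector()
    collector.feed(html)
    collector.close()
    # No shuffling or scoring: the choice is purely positional.
    return collector.outlinks[:num_links]


page = '<p><a href="/a">A</a> <a href="/b">B</a> <a href="/c">C</a></p>'
print(first_outlinks(page, 2))  # prints ['/a', '/b']
```

With a different parser the order of the collected links could differ, which is why the selection can depend on the HTML parser in use.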

 
 
-----Original message-----

> From:jjmendes <[hidden email]>
> Sent: Saturday 11th March 2017 15:24
> To: [hidden email]
> Subject: Behavior of fetcher.follow.outlinks
>
> If you choose to use fetcher.follow.outlinks.depth,
> fetcher.follow.outlinks.num.links and
> fetcher.follow.outlinks.depth.divisor to fetch an additional link from
> every page you crawl, how exactly is that link chosen by Nutch?
> Randomly? Or is there any specific behavior behind the choice?
>
> Thanks,
>
> JJAM
>
>  