Bottleneck of my crawls: NativeCodeLoader

James Ford
Hello,

I am trying to optimize my crawls as much as possible. The current bottleneck is the step after adding segments to the linkdb, where Nutch tries to load the native-hadoop library:

2012-03-26 13:20:59,089 WARN  util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

This step takes about 15 minutes, while all the other steps together take about 25 minutes. How can I make this step faster?

Thanks,
James Ford

Re: Bottleneck of my crawls: NativeCodeLoader

Sebastian Nagel
Hi James,

there is a description of how to install the native libraries in
   lib/native/README.txt
Once they are installed correctly, the native libs are loaded and
the warning disappears.

But are you sure that it is really the library loading that
takes the time, and not the step that runs afterwards without
writing an initial log message? The warning only marks the last
thing logged before the wait. LinkDb may take some time because
it reads all (or all newly created) segments.
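
One way to check this is to look at the gaps between consecutive
log timestamps rather than at the message that happens to precede
the wait. The following is only a rough sketch, not part of Nutch:
the function names are made up for illustration, and it assumes
every relevant log line starts with a timestamp in the same
"yyyy-MM-dd HH:mm:ss,SSS" form as the warning quoted above.

```python
import re
from datetime import datetime

# Timestamp prefix as seen in the quoted warning, e.g. "2012-03-26 13:20:59,089".
# Assumption: all relevant log lines begin with this format.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})")


def parse_timestamp(line):
    """Return the datetime at the start of a log line, or None if absent."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    base = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return base.replace(microsecond=int(match.group(2)) * 1000)


def largest_gaps(lines, top=5):
    """Return up to `top` tuples (seconds, line_before, line_after),
    sorted so that the longest silent period comes first."""
    stamped = []
    for line in lines:
        ts = parse_timestamp(line)
        if ts is not None:  # skip continuation lines such as stack traces
            stamped.append((ts, line.rstrip("\n")))

    gaps = []
    for (t0, before), (t1, after) in zip(stamped, stamped[1:]):
        gaps.append(((t1 - t0).total_seconds(), before, after))

    gaps.sort(key=lambda gap: gap[0], reverse=True)
    return gaps[:top]


def report(path, top=5):
    """Print the longest gaps found in the log file at `path`."""
    with open(path, encoding="utf-8", errors="replace") as log:
        for seconds, before, after in largest_gaps(log, top):
            print(f"{seconds:10.1f}s between:\n    {before}\n    {after}")
```

Calling report() with the path to your crawl log prints the longest
silent periods. If the 15 minutes show up right after the
NativeCodeLoader warning, the line logged after the gap tells you
which job finished there, and that job is the more likely place
the time went.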

Sebastian

On 03/26/2012 01:35 PM, James Ford wrote:

> Hello,
>
> I am trying to optimize my crawls as much as possible. The current
> bottleneck is the step after adding segments to the linkdb, where Nutch
> tries to load the native-hadoop library:
>
> 2012-03-26 13:20:59,089 WARN  util.NativeCodeLoader - Unable to load
> native-hadoop library for your platform... using builtin-java classes where
> applicable
>
> This step takes about 15 minutes, while all the other steps together
> take about 25 minutes. How can I make this step faster?
>
> Thanks,
> James Ford