nutch 0.9, fetch2, fetcher.parse conf value not used

John Mendenhall
I tried to run fetch without parsing by setting the
fetcher.parse property to false.  When I then ran parse,
it said the segment had already been parsed by the
fetch process.

It appears NUTCH-337 only fixed the unused
fetcher.parse configuration value in the Fetcher.java
class.  I have tried using fetch2 (Fetcher2.java) and
it appears the fetcher.parse configuration value is
not being used.

I will rerun the same setup with the fetch class
(Fetcher.java) to see whether it honors the setting
and does not parse after fetching.

Is the Fetcher2 class not recommended?

Or, is it possible I have some other problem?

Thanks in advance for any assistance you can provide.

JohnM

--
john mendenhall
[hidden email]
surf utopia
internet services

Re: nutch 0.9, fetch2, fetcher.parse conf value not used

John Mendenhall
> I tried to run fetch without parsing by setting the
> fetcher.parse property to false.  When I then ran parse,
> it said the segment had already been parsed by the
> fetch process.
>
> It appears NUTCH-337 only fixed the unused
> fetcher.parse configuration value in the Fetcher.java
> class.  I have tried using fetch2 (Fetcher2.java) and
> it appears the fetcher.parse configuration value is
> not being used.
>
> I will rerun the same setup with the fetch class
> (Fetcher.java) to see whether it honors the setting
> and does not parse after fetching.
>
> Is the Fetcher2 class not recommended?
> Or, is it possible I have some other problem?
>
> Thanks in advance for any assistance you can provide.

I changed my script to call `nutch fetch` instead
of `nutch fetch2`, using Fetcher.java rather than
Fetcher2.java.  Now the fetcher.parse configuration
value is being used.
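
For reference, a fetcher.parse override of this kind would
typically look something like the following.  This is only a
sketch: it assumes the override is placed in conf/nutch-site.xml
inside the top-level <configuration> element, and the description
text is illustrative rather than copied from any shipped file.

```xml
<!-- Sketch only: assumes this goes in conf/nutch-site.xml, inside
     the <configuration> element. Description text is illustrative. -->
<property>
 <name>fetcher.parse</name>
 <value>false</value>
 <description>Fetch only; parse the segment in a separate step.</description>
</property>
```

With the value set to false, the segment is expected to be left
unparsed by the fetch step, so a separate parse run is still needed
afterwards.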

I recommend we modify Fetcher2.java to use this
value instead of requiring the option to be given
on the command line.

JohnM

--
john mendenhall
[hidden email]
surf utopia
internet services

JDK 1.5 & Tomcat 5.5

Duan, Nick
Does the latest Nutch work with JDK 1.5 or 1.6, and Tomcat 5.5 or 6.0?

Thanks!

Nick

RE: JDK 1.5 & Tomcat 5.5

Christopher Bader-2
I'm using Nutch 0.9 (the latest stable release) with JDK 1.5 and Tomcat
6.0.  I had a problem with JDK 1.6.

CB


-----Original Message-----
From: Duan, Nick [mailto:[hidden email]]
Sent: Wednesday, January 30, 2008 4:50 PM
To: [hidden email]
Subject: JDK 1.5 & Tomcat 5.5

Does the latest Nutch work with JDK 1.5 or 1.6, and Tomcat 5.5 or 6.0?

Thanks!

Nick

running out of space in /tmp

Christopher Bader-2
All,

Nutch is crashing on a crawl when it runs out of disk space in /tmp.

There is lots of space in other partitions.  Aside from re-installing Linux,
is there a way of getting Nutch and/or the Java VM to use a different
directory for temporary storage?

CB



Re: running out of space in /tmp

Susam Pal
In the file 'conf/hadoop-site.xml', please add this:

<property>
 <name>hadoop.tmp.dir</name>
 <value>/path/to/your/new/tmp/directory</value>
 <description>Base for Nutch Temporary Directories</description>
</property>

Regards,
Susam Pal

On Jan 31, 2008 9:12 PM, Christopher Bader <[hidden email]> wrote:

> All,
>
> Nutch is crashing on a crawl when it runs out of disk space in /tmp.
>
> There is lots of space in other partitions.  Aside from re-installing Linux,
> is there a way of getting Nutch and/or the Java VM to use a different
> directory for temporary storage?
>
> CB