latest build throws error - critical

latest build throws error - critical

Raghavendra Prabhu
This is an error which I did not see yesterday. When I ran the crawl with
the updated build today, I got this error:

060406 204147 parsing file:/G:/nutch-april6/conf/hadoop-site.xml
java.lang.NullPointerException
        at org.apache.nutch.crawl.CrawlDatum.set(CrawlDatum.java:206)
        at org.apache.nutch.crawl.CrawlDbReducer.reduce(CrawlDbReducer.java:86)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:283)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:121)
060406 204148  map 100%  reduce 0%
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:322)
        at org.apache.nutch.crawl.CrawlDb.update(CrawlDb.java:54)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)

Rgds
PRabhu
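
A note on reading the log above: the NullPointerException is printed first, from inside the reduce task, and only afterwards does the main thread report the generic "Job failed!" IOException. The first trace is therefore the one that points at the actual bug. The sketch below shows that general pattern in plain Java. It is only an illustration under stated assumptions, not Hadoop's implementation: the class JobFailureSketch, its nested Job class, and this runJob method are hypothetical names made up for the example.

```java
import java.io.IOException;

public class JobFailureSketch {
    /** Hypothetical job wrapper: runs a task and remembers whether it failed. */
    static final class Job implements Runnable {
        private final Runnable task;
        private volatile boolean failed;

        Job(Runnable task) { this.task = task; }

        @Override
        public void run() {
            try {
                task.run();
            } catch (Throwable t) {
                failed = true;
                t.printStackTrace();   // the root cause is logged here, on the worker thread
            }
        }

        boolean isFailed() { return failed; }
    }

    /** Hypothetical client call: waits for the job and reports any failure generically. */
    static void runJob(Runnable task) throws IOException {
        Job job = new Job(task);
        Thread worker = new Thread(job);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while waiting for job", e);
        }
        if (job.isFailed()) {
            // The caller only learns that something went wrong, not what.
            throw new IOException("Job failed!");
        }
    }

    public static void main(String[] args) {
        try {
            runJob(() -> { throw new NullPointerException(); });
        } catch (IOException e) {
            System.out.println("caller sees: " + e.getMessage());
        }
    }
}
```

Running main prints the NullPointerException trace from the worker thread on stderr, and the caller only ever sees the IOException.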

Re: latest build throws error - critical

Raghavendra Prabhu
I ran and tested with the same site on which it is crashing.

With revision 391003, it works fine.

The problem was introduced in a later revision, I guess. Does anyone have an idea?

Rgds
Prabhu




Re: latest build throws error - critical

Andrzej Białecki-2
Raghavendra Prabhu wrote:
> I ran and tested with the same site on which it is crashing.
>
> With revision 391003, it works fine.
>
> The problem was introduced in a later revision, I guess. Does anyone have an idea?
>  

Yes, sorry for that - I introduced a bug in CrawlDbReducer, I'll fix it
shortly.

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: latest build throws error - critical

Raghavendra Prabhu
Hi,

It is not a problem.
I just wanted to make sure that it works.
As users, we are supposed to test, report, and be of help.

Thanks
Rgds
Prabhu



Re: latest build throws error - critical

Thomas Delnoij-3
I am seeing the same problem. Now I have to ask: what is the procedure
for these kinds of bugs? Do you always create Jira issues so that we (the
users) know in which revision a bug was fixed?

I mean, how do others keep up to date with the main codeline? Do you
advise updating every day?

Rgrds, Thomas


Re: latest build throws error - critical

Raghavendra Prabhu
Check out the code with svn and then update your working copy,
so that you stay in sync with the main codeline.

These bugs are usually fixed by the group in the blink of an eye, and
the fixes are stable.

Update using an svn client.
Hope that helps.

Rgds
Prabhu



Re: latest build throws error - critical

Andrzej Białecki-2
In reply to this post by Raghavendra Prabhu
Raghavendra Prabhu wrote:
> This is an error which I did not see yesterday. When I ran the crawl with
> the updated build today, I got this error:
>
>  


Apply the attached patch (go to the trunk/ dir and execute 'patch -p0 <
patch.txt'), and please report if it fixes the problem.


Index: src/java/org/apache/nutch/crawl/CrawlDbReducer.java
===================================================================
--- src/java/org/apache/nutch/crawl/CrawlDbReducer.java (revision 391269)
+++ src/java/org/apache/nutch/crawl/CrawlDbReducer.java (working copy)
@@ -52,6 +52,7 @@
       switch (datum.getStatus()) {                // find old entry, if any
       case CrawlDatum.STATUS_DB_UNFETCHED:
       case CrawlDatum.STATUS_DB_FETCHED:
+      case CrawlDatum.STATUS_DB_GONE:
         old = datum;
         break;
       case CrawlDatum.STATUS_LINKED:
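
One plausible reading of the patch, not confirmed anywhere in the thread: before the fix, an entry whose status was STATUS_DB_GONE matched none of the cases that assign `old`, so `old` stayed null and a later call that used it (the trace points at CrawlDatum.set) failed. The sketch below models only that switch logic. Everything in it is a simplified stand-in and an assumption: the Datum class, the numeric status values, the findOld helper, and the handleGone flag are hypothetical and are not Nutch code.

```java
import java.util.List;

public class MissingCaseSketch {
    // Hypothetical status codes; the numeric values are illustrative only.
    static final int STATUS_DB_UNFETCHED = 1;
    static final int STATUS_DB_FETCHED = 2;
    static final int STATUS_DB_GONE = 3;
    static final int STATUS_LINKED = 4;

    /** Minimal stand-in for a datum: just a status code. */
    static final class Datum {
        final int status;
        Datum(int status) { this.status = status; }
    }

    /** Returns the existing db entry among the values, or null if no case claims one. */
    static Datum findOld(List<Datum> values, boolean handleGone) {
        Datum old = null;
        for (Datum datum : values) {
            switch (datum.status) {
                case STATUS_DB_UNFETCHED:
                case STATUS_DB_FETCHED:
                    old = datum;
                    break;
                case STATUS_DB_GONE:
                    if (handleGone) {   // models the case label added by the patch
                        old = datum;
                    }
                    break;
                default:
                    break;              // other statuses are not treated as db entries here
            }
        }
        return old;
    }

    public static void main(String[] args) {
        List<Datum> values = List.of(new Datum(STATUS_DB_GONE));
        // Without the extra case, nothing assigns `old`, so it is still null afterwards.
        System.out.println("before patch: old == null? " + (findOld(values, false) == null));
        // With the extra case, the gone entry is picked up as the old entry.
        System.out.println("after patch:  old == null? " + (findOld(values, true) == null));
    }
}
```

With handleGone set to false the helper returns null for a gone entry, and with it set to true the entry is found, which is the difference the one-line patch makes in this simplified model.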

Re: latest build throws error - critical

Raghavendra Prabhu
It works perfectly
Crawl is as smooth as it ever was.

Rgds
Prabhu



RE: latest build throws error - critical

Dennis Kubes
In reply to this post by Andrzej Białecki-2
The crawl completed successfully, and I was able to search the index successfully.

Dennis

-----Original Message-----
From: Andrzej Bialecki [mailto:[hidden email]]
Sent: Thursday, April 06, 2006 2:03 PM
To: [hidden email]
Subject: Re: latest build throws error - critical

Raghavendra Prabhu wrote:
> This is an error which I did not see yesterday. When I ran the crawl
> with the updated build today, I got this error:
>
>  


Apply the attached patch (go to the trunk/ dir and execute 'patch -p0 <
patch.txt'), and please report if it fixes the problem.


Re: latest build throws error - critical

Andrzej Białecki-2
Raghavendra Prabhu wrote:
> It works perfectly
> Crawl is as smooth as it ever was.
>
>  

Dennis Kubes wrote:
> The crawl completed successfully, and I was able to search the index successfully.
>  

Thanks for testing! Patch applied.
