[jira] [Commented] (NUTCH-2597) NPE in updatehostdb

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/NUTCH-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511429#comment-16511429 ]

ASF GitHub Bot commented on NUTCH-2597:

sebastian-nagel commented on a change in pull request #349: NUTCH-2597: fixed cleanup()
URL: https://github.com/apache/nutch/pull/349#discussion_r195162047

 File path: src/java/org/apache/nutch/indexer/CleaningJob.java
 @@ -64,10 +64,12 @@ public void setConf(Configuration conf) {
       Mapper<Text, CrawlDatum, ByteWritable, Text> {
     private ByteWritable OUT = new ByteWritable(CrawlDatum.STATUS_DB_GONE);
+    @Override
     public void setup(Mapper<Text, CrawlDatum, ByteWritable, Text>.Context context) {
-    public void cleanup() throws IOException {
+    @Override
+    public void cleanup(Context context) throws IOException {
 Review comment:
   Could also remove the method implementation. The superclass Reducer already implements a do-nothing cleanup(context).
   If you have time: there are a couple of other cleanup() methods without the context argument. Probably the same mistake, but harmless as they "do nothing". This needs to be checked in detail, but `git grep -A2 'cleanup()'` finds a couple of them. Thanks, @sju!
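
   To illustrate the signature issue discussed in this comment, here is a minimal, self-contained sketch. It does not use the Hadoop classes: `Task`, `Task.Context`, `BrokenTask`, `FixedTask` and `CleanupOverrideDemo` are hypothetical stand-ins, assuming only that the framework calls the one-argument `cleanup(Context)` hook at the end of a task. It shows that a `cleanup()` without the context argument is an overload that is never invoked, while the version with the matching signature and `@Override` is.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a framework base class (e.g. a mapper or reducer):
// the framework only ever calls the one-argument cleanup(Context) hook.
class Task {
    static class Context {
        final List<String> log = new ArrayList<>();
    }

    // Do-nothing default, like the superclass implementation mentioned in the review.
    protected void cleanup(Context context) {
    }

    // Simplified task lifecycle: only the one-argument hook is invoked.
    final void run(Context context) {
        cleanup(context);
    }
}

// Mistake: cleanup() takes no Context, so it is a new overload, not an override.
// Adding @Override here would be a compile error, which is how the slip gets caught.
class BrokenTask extends Task {
    public void cleanup() {
        throw new IllegalStateException("never reached via run()");
    }
}

// Fix: matching signature plus @Override, so the compiler verifies the override.
class FixedTask extends Task {
    @Override
    protected void cleanup(Context context) {
        context.log.add("cleanup ran");
    }
}

public class CleanupOverrideDemo {
    public static void main(String[] args) {
        Task.Context broken = new Task.Context();
        new BrokenTask().run(broken);
        System.out.println("broken: " + broken.log); // prints "broken: []"

        Task.Context fixed = new Task.Context();
        new FixedTask().run(fixed);
        System.out.println("fixed: " + fixed.log);   // prints "fixed: [cleanup ran]"
    }
}
```

   If the overriding method would be empty anyway, it can simply be dropped, since the inherited default already does nothing.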

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[hidden email]

> NPE in updatehostdb
> -------------------
>                 Key: NUTCH-2597
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2597
>             Project: Nutch
>          Issue Type: Bug
>          Components: hostdb
>    Affects Versions: 1.15
>            Reporter: Jurian Broertjes
>            Priority: Critical
> I get an NPE on updatehostdb. I start with a clean crawlDB & hostDB. After an inject, I do an updatehostdb with -checkAll and get the following stacktrace:
> {code}
> 2018-06-13 10:45:21,958 WARN hostdb.ResolverThread - java.lang.NullPointerException
>  at org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1359)
>  at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1400)
>  at org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat$1.write(SequenceFileOutputFormat.java:83)
>  at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:558)
>  at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>  at org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.write(WrappedReducer.java:105)
>  at org.apache.nutch.hostdb.ResolverThread.run(ResolverThread.java:82)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> {code}
> Is this related to NUTCH-2375?
> If further testing is needed, please let me know!

This message was sent by Atlassian JIRA