Does Nutch still index to Elasticsearch

D Atherton
When I first started using Nutch, it was version 1.10 with ES 1.x, and
everything worked fine.

Now I'm trying to update my system a bit and am having trouble getting
Nutch 1.13/1.14 to index the crawl data to Elasticsearch.

I've tried indexing to ES 2.3.3 (with Nutch 1.13) and to ES 5.3.0 (with
Nutch 1.14), and I get the same error either way:

ElasticIndexWriter
elastic.cluster : elastic prefix cluster
elastic.host : hostname
elastic.port : port
elastic.index : elastic index command
elastic.max.bulk.docs : elastic bulk index doc counts. (default 250)
elastic.max.bulk.size : elastic bulk index length in bytes. (default 2500500)
elastic.exponential.backoff.millis : elastic bulk exponential backoff initial delay in milliseconds. (default 100)
elastic.exponential.backoff.retries : elastic bulk exponential backoff max retries. (default 10)
elastic.bulk.close.timeout : elastic timeout for the last bulk in seconds. (default 600)


Indexer: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:865)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:147)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:230)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:239)

Error running:
  /home/david/tutorials/nutch/nutch-1.14/runtime/local/bin/nutch index -Delastic.server.url=http://localhost:9300/search-index/ searchcrawl//crawldb -linkdb searchcrawl//linkdb searchcrawl//segments/20180824164559
Failed with exit value 255.

I'm not even sure what that error means! Is there a patch for this? I
noticed that Nutch 1.15 just came out as well.
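
For reference, here is a minimal sketch of how the elastic.* properties that
ElasticIndexWriter prints above would typically be supplied in
conf/nutch-site.xml on Nutch 1.13/1.14. Every value is an assumption for
illustration, not a tested configuration: the host and port are taken from the
URL in the failing command, the index name is guessed from that URL's path,
and the cluster name is only the Elasticsearch default.

```xml
<!-- conf/nutch-site.xml: illustrative sketch only, all values are assumptions -->
<configuration>
  <property>
    <name>elastic.host</name>
    <value>localhost</value>      <!-- assumed: host from the command's URL -->
  </property>
  <property>
    <name>elastic.port</name>
    <value>9300</value>           <!-- assumed: port from the command's URL -->
  </property>
  <property>
    <name>elastic.cluster</name>
    <value>elasticsearch</value>  <!-- assumed: Elasticsearch's default cluster name -->
  </property>
  <property>
    <name>elastic.index</name>
    <value>search-index</value>   <!-- assumed: path segment from the command's URL -->
  </property>
</configuration>
```

Note that the command above passes elastic.server.url, which is not among the
properties the writer lists, so it may simply be ignored.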

--
Thanks,
David