[ANNOUNCE] Apache Nutch 1.15 Release

Sebastian Nagel-2
The Apache Nutch [0] Project Management Committee are pleased to announce
the immediate release of Apache Nutch v1.15. We advise all current users
and developers of the 1.X series to upgrade to this release.

Nutch is a mature, production-ready web crawler. Nutch 1.x enables
fine-grained configuration and relies on Apache Hadoop™ [1] data
structures, which are well suited to batch processing.

As usual in the 1.X series, release artifacts are available as both
source and binary packages, and as a Maven dependency from Maven
Central [2]. The release is available from our downloads page [3].
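
As a rough illustration of the Maven coordinates behind [2], the sketch
below maps a group ID, artifact ID, and version to the path used by the
standard Maven repository layout. The group and artifact IDs are taken from
the search link [2]. The function name and the assumption that a plain jar
is published under this path are illustrative and not part of the release
notes.

```python
def maven_artifact_path(group_id: str, artifact_id: str, version: str,
                        packaging: str = "jar") -> str:
    """Return the repository-relative path of an artifact (standard Maven layout)."""
    # Dots in the group ID become directory separators.
    group_path = group_id.replace(".", "/")
    file_name = f"{artifact_id}-{version}.{packaging}"
    return f"{group_path}/{artifact_id}/{version}/{file_name}"


# Coordinates assumed from the Maven Central search link [2] and this
# release's version number.
nutch_path = maven_artifact_path("org.apache.nutch", "nutch", "1.15")
print(nutch_path)
```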

This release includes more than 100 bug fixes and improvements. The full
list of changes can be seen in the release report [4]. The most notable
ones are:

 NUTCH-1480 Multiple index writer instances with different configurations
  It's now possible to index documents into multiple Solr or Elasticsearch
  instances.
  Please note that this feature changed the way indexers are configured,
  see https://wiki.apache.org/nutch/IndexWriters for more information.

 NUTCH-2412 Exchange component for indexing job
  Configurable routing of documents to indexes

 NUTCH-2375 Use the new MapReduce API

 NUTCH-2583 Overall upgrade of library dependencies
  which also makes Nutch run and compile on Java 9 and 10

 NUTCH-2549 Multiple fixes and improvements to the protocol-http plugin

 NUTCH-2576 A new HTTP protocol implementation based on the okhttp library
  Supports HTTP/2 if used with Java 9 or higher.

 NUTCH-1129 A new plugin to extract linked data based on the Any23 project
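
To make the idea of routing documents to indexes (NUTCH-2412) more
concrete, here is a minimal conceptual sketch. It is not Nutch's exchange
API or its configuration format: the rule structure, the writer IDs, the
document fields, and the fallback to default writers are all hypothetical
and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Document = Dict[str, str]
# A rule pairs a predicate on the document with the writer IDs to route to.
Rule = Tuple[Callable[[Document], bool], List[str]]


def route(doc: Document, rules: List[Rule], default_writers: List[str]) -> List[str]:
    """Return the IDs of the index writers that should receive the document."""
    targets: List[str] = []
    for matches, writer_ids in rules:
        if matches(doc):
            for writer_id in writer_ids:
                if writer_id not in targets:  # do not list a writer twice
                    targets.append(writer_id)
    # Documents that match no rule fall back to the default writers.
    return targets or list(default_writers)


# Hypothetical rules: English pages go to one writer, PDFs to another.
rules: List[Rule] = [
    (lambda d: d.get("lang") == "en", ["solr_en"]),
    (lambda d: d.get("type") == "application/pdf", ["elastic_docs"]),
]

print(route({"lang": "en", "type": "application/pdf"}, rules, ["solr_default"]))
print(route({"lang": "de"}, rules, ["solr_default"]))
```

In this sketch a document may match several rules and is then sent to
every listed writer. How the actual exchange component resolves matches is
described in the Nutch documentation.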

Thanks to all Nutch contributors who made this release possible,
Sebastian (on behalf of the Nutch PMC)

[0] http://nutch.apache.org/
[1] http://hadoop.apache.org/
[2] http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22org.apache.nutch%22%20AND%20a%3A%22nutch%22
[3] http://nutch.apache.org/downloads.html
[4] https://s.apache.org/nczS