Nutch - Dev

This forum is an archive for the mailing list dev@nutch.apache.org. Messages posted here will be sent to this mailing list.
If you'd like to contribute to Nutch, please subscribe to the Nutch developer mailing list.
Topics (21031)
Replies Last Post Views
[jira] [Resolved] (NUTCH-2603) Bring back legacy pre-Tika parsers and use them as back up parsers by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Updated] (NUTCH-2608) Reduce size of Nutch job file and package by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Updated] (NUTCH-2634) Some links marked as "nofollow" are followed anyway. by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Updated] (NUTCH-2662) index-jexl-filter plugin throws a RuntimeException if it's enabled but not configured by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Updated] (NUTCH-2720) ROBOTS metatag ignored when capitalized by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Resolved] (NUTCH-2750) Improve CrawlDbReader & LinkDbReader reader handling by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Updated] (NUTCH-2750) Improve CrawlDbReader & LinkDbReader reader handling by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2750) Improve CrawlDbReader & LinkDbReader reader handling by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Updated] (NUTCH-2750) Improve CrawlDbReader & LinkDbReader reader handling by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2750) Improve CrawlDbReader & LinkDbReader reader handling by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Created] (NUTCH-2753) Add -listen option to command-line help of CrawlDbReader and LinkDbReader by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2748) Fetch status gone (redirect exceeded) not to overwrite existing items in CrawlDb by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
Static source code analysis via sonarcloud.io by lewis john mcgibbney...
1
by BlackIce
[jira] [Commented] (NUTCH-2748) Fetch status gone (redirect exceeded) not to overwrite existing items in CrawlDb by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2748) Fetch status gone (redirect exceeded) not to overwrite existing items in CrawlDb by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2748) Fetch status gone (redirect exceeded) not to overwrite existing items in CrawlDb by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Updated] (NUTCH-2748) Fetch status gone (redirect exceeded) not to overwrite existing items in CrawlDb by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2739) indexer-elastic: Upgrade ES and migrate to REST client by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2739) indexer-elastic: Upgrade ES and migrate to REST client by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2739) indexer-elastic: Upgrade ES and migrate to REST client by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2739) indexer-elastic: Upgrade ES and migrate to REST client by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2739) indexer-elastic: Upgrade ES and migrate to REST client by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2739) indexer-elastic: Upgrade ES and migrate to REST client by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2739) indexer-elastic: Upgrade ES and migrate to REST client by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Updated] (NUTCH-1337) WebGraph to follow redirects by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Updated] (NUTCH-1337) WebGraph to follow redirects by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-1559) parse-metatags duplicates extracted metatags by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2747) Replace remaining o.a.commons.logging by org.slf4j by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Resolved] (NUTCH-1559) parse-metatags duplicates extracted metatags by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Updated] (NUTCH-1559) parse-metatags duplicates extracted metatags by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-1559) parse-metatags duplicates extracted metatags by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Updated] (NUTCH-2746) Basic URL normalizer to normalize Unicode domain names by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Resolved] (NUTCH-2747) Replace remaining o.a.commons.logging by org.slf4j by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2747) Replace remaining o.a.commons.logging by org.slf4j by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...
[jira] [Commented] (NUTCH-2751) nutch clean does not work with secured solr cloud by David Eric Pugh (Jir...
0
by David Eric Pugh (Jir...