Nutch - Dev

This forum is an archive for the mailing list dev@nutch.apache.org. Messages posted here will be sent to this mailing list.
If you'd like to contribute to Nutch, please subscribe to the Nutch developer mailing list.
Topics (20365)
Topic, Replies, Last Post
Compilation errors at revision 638548 by Andrew York
0
by Andrew York
Current OPIC implementation by Siddhartha Reddy
1
by Andrzej Białecki-2
[jira] Commented: (NUTCH-220) PDF Box can't parse document: java.lang.NullPointerException by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Created: (NUTCH-615) Redirected URL are fetched wihtout setting any FetchInterval by JIRA jira@apache.org
7
by JIRA jira@apache.org
[jira] Commented: (NUTCH-223) Crawl.java uses Integer.MAX_VALUE for -topN where Generator.java uses Long.MAX_VALUE for -topN by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Created: (NUTCH-616) Reset Fetch Retry counter when fetch is successful by JIRA jira@apache.org
6
by JIRA jira@apache.org
[jira] Created: (NUTCH-610) Can't Update or modify an index while web gui is running by JIRA jira@apache.org
6
by JIRA jira@apache.org
[jira] Closed: (NUTCH-243) Some meta-refresh urls get ignored due to matching regular expression by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Commented: (NUTCH-243) Some meta-refresh urls get ignored due to matching regular expression by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Closed: (NUTCH-223) Crawl.java uses Integer.MAX_VALUE for -topN where Generator.java uses Long.MAX_VALUE for -topN by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Commented: (NUTCH-223) Crawl.java uses Integer.MAX_VALUE for -topN where Generator.java uses Long.MAX_VALUE for -topN by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Commented: (NUTCH-220) PDF Box can't parse document: java.lang.NullPointerException by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Closed: (NUTCH-220) PDF Box can't parse document: java.lang.NullPointerException by JIRA jira@apache.org
0
by JIRA jira@apache.org
Retire the original Fetcher before the release? by Andrzej Białecki-2
4
by Andrzej Białecki-2
(nutch 1.0) Query processing problem: NutchBeans and webapps search fail, but Luke sucess by Vinci
0
by Vinci
Cached page - can it be changed? by Vinci
0
by Vinci
Chnage the Analyzer by plugin - how to dealing with the query? by Vinci
1
by Vinci
Write back to the segment? by Vinci
0
by Vinci
How can I change the analyzer of nutch query by plugin? by Vinci
0
by Vinci
zh.ngp by Vinci
0
by Vinci
[jira] Created: (NUTCH-619) Another Language Identifier Plugin using Unicode code point range by JIRA jira@apache.org
0
by JIRA jira@apache.org
Thread behaviour in Nutch Crawl by naveen.goswami
0
by naveen.goswami
Problem in running Nutch where proxy authentication is required. by naveen.goswami
2
by naveen.goswami
[jira] Created: (NUTCH-612) URL filtering is always disabled in Generator when invoked by Crawl by JIRA jira@apache.org
4
by JIRA jira@apache.org
[jira] Commented: (NUTCH-126) Fetching via https does not work with a proxy (patch) by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Created: (NUTCH-613) Empty Summaries and Cached Pages by JIRA jira@apache.org
6
by JIRA jira@apache.org
[jira] Created: (NUTCH-601) Recrawling on existing crawl directory using force option by JIRA jira@apache.org
12
by JIRA jira@apache.org
[jira] Created: (NUTCH-575) NPE in OpenSearchServlet when summary is null by JIRA jira@apache.org
9
by JIRA jira@apache.org
[jira] Closed: (NUTCH-189) Injection infinite loop by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Commented: (NUTCH-189) Injection infinite loop by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Commented: (NUTCH-168) setting http.content.limit to -1 seems to break text parsing on some files by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Closed: (NUTCH-168) setting http.content.limit to -1 seems to break text parsing on some files by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Commented: (NUTCH-157) Problem during parsing msword document . It fetching properly but parsing is not working. Please show me the way how can i parse it by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Closed: (NUTCH-157) Problem during parsing msword document . It fetching properly but parsing is not working. Please show me the way how can i parse it by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Commented: (NUTCH-126) Fetching via https does not work with a proxy (patch) by JIRA jira@apache.org
0
by JIRA jira@apache.org