Nutch - Dev

This forum is an archive for the mailing list dev@nutch.apache.org. Messages posted here will be sent to this mailing list.
If you'd like to contribute to Nutch, please subscribe to the Nutch developer mailing list.
Topics (20798)
Replies Last Post Views
[jira] Created: (NUTCH-351) Protocol forward proxy by Sebastian Nagel (Jir...
3
by Sebastian Nagel (Jir...
Searching on fields with uppercase letters by Enrico Triolo-2
2
by Enrico Triolo-2
[jira] Created: (NUTCH-373) Fetcher halting and throttling by Sebastian Nagel (Jir...
1
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-372) Fetcher halting and throttling by Sebastian Nagel (Jir...
1
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-368) Message queueing system by Sebastian Nagel (Jir...
7
by Sebastian Nagel (Jir...
Modifications necessary to upgrade to Hadoop 0.6.2 by Marcel Petrisor
0
by Marcel Petrisor
[jira] Created: (NUTCH-370) Generator loosed urls when run with LocalJobRunner by Sebastian Nagel (Jir...
2
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-344) Fetcher threads blocked on synchronized block in cleanExpiredServerBlocks by Sebastian Nagel (Jir...
7
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-338) Remove the text parser as an option for parsing PDF files in parse-plugins.xml by Sebastian Nagel (Jir...
7
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-318) log4j not proper configured, readdb doesnt give any information by Sebastian Nagel (Jir...
12
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-266) hadoop bug when doing updatedb by Sebastian Nagel (Jir...
23
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-105) Network error during robots.txt fetch causes file to be ignored by Sebastian Nagel (Jir...
6
by Sebastian Nagel (Jir...
0.8.1 by Sami Siren-2
4
by Sami Siren-2
[jira] Created: (NUTCH-205) Wrong 'fetch date' for non available pages by Sebastian Nagel (Jir...
4
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-276) db.score.link.internal problem by Sebastian Nagel (Jir...
1
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-350) urls blocked db.fetch.retry.max * http.max.delays times during fetching are marked as STATUS_DB_GONE by Sebastian Nagel (Jir...
2
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-253) Normalize Host during Generate by Sebastian Nagel (Jir...
2
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-337) Fetcher ignores the fetcher.parse value configured in config file by Sebastian Nagel (Jir...
7
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-336) Harvested links shouldn't get db.score.injected in addition to inbound contributions by Sebastian Nagel (Jir...
3
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-332) doubling score causes by page internal anchors. by Sebastian Nagel (Jir...
4
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-365) Flexible URL normalization by Sebastian Nagel (Jir...
9
by Sebastian Nagel (Jir...
Ant tasks/build.xml file for running Nutch in debug mode? by Jp Mutch
0
by Jp Mutch
Empty "incoming anchor text" by Zhen Zhen
3
by Richard Braman-2
CrawlDatum.modifiedTime ? by Kim, Greg
0
by Kim, Greg
[jira] Created: (NUTCH-367) DistributedSearch thown ClassCastException by Sebastian Nagel (Jir...
1
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-364) Javascript parser creates some fairly bogus URLs by Sebastian Nagel (Jir...
1
by Sebastian Nagel (Jir...
I cann't fetch wml page by yin chunhui
0
by yin chunhui
Speed of reading local files by Zhen Zhen
0
by Zhen Zhen
Time of Reading Local Files by Jane Zhen
0
by Jane Zhen
[jira] Created: (NUTCH-369) StringUtil.resolveEncodingAlias is unuseful. by Sebastian Nagel (Jir...
0
by Sebastian Nagel (Jir...
A Problem about Nutch Plugin by yin chunhui
0
by yin chunhui
ask a problem about nutch (from China) by yin chunhui
2
by Howie Wang
Re: Any plans to move to build Nutchusing Maven? by sshingler
8
by Otis Gospodnetic-2-2
Patch Available status? by chrismattmann
11
by Otis Gospodnetic-2-2
File system watching for intranets by Ben Ogle
2
by Ben Ogle