Nutch - Dev

This forum is an archive for the mailing list dev@nutch.apache.org. Messages posted here will be sent to this mailing list.
If you'd like to contribute to Nutch, please subscribe to the Nutch developer mailing list.
Topics (19496)
Replies, Last Post
[jira] Created: (NUTCH-344) Fetcher threads blocked on synchronized block in cleanExpiredServerBlocks by JIRA jira@apache.org
7
by JIRA jira@apache.org
[jira] Created: (NUTCH-338) Remove the text parser as an option for parsing PDF files in parse-plugins.xml by JIRA jira@apache.org
7
by JIRA jira@apache.org
[jira] Created: (NUTCH-318) log4j not proper configured, readdb doesnt give any information by JIRA jira@apache.org
12
by JIRA jira@apache.org
[jira] Created: (NUTCH-266) hadoop bug when doing updatedb by JIRA jira@apache.org
23
by JIRA jira@apache.org
[jira] Created: (NUTCH-105) Network error during robots.txt fetch causes file to be ignored by JIRA jira@apache.org
6
by JIRA jira@apache.org
0.8.1 by Sami Siren-2
4
by Sami Siren-2
[jira] Created: (NUTCH-205) Wrong 'fetch date' for non available pages by JIRA jira@apache.org
4
by JIRA jira@apache.org
[jira] Created: (NUTCH-276) db.score.link.internal problem by JIRA jira@apache.org
1
by JIRA jira@apache.org
[jira] Created: (NUTCH-350) urls blocked db.fetch.retry.max * http.max.delays times during fetching are marked as STATUS_DB_GONE by JIRA jira@apache.org
2
by JIRA jira@apache.org
[jira] Created: (NUTCH-253) Normalize Host during Generate by JIRA jira@apache.org
2
by JIRA jira@apache.org
[jira] Created: (NUTCH-337) Fetcher ignores the fetcher.parse value configured in config file by JIRA jira@apache.org
7
by JIRA jira@apache.org
[jira] Created: (NUTCH-336) Harvested links shouldn't get db.score.injected in addition to inbound contributions by JIRA jira@apache.org
3
by JIRA jira@apache.org
[jira] Created: (NUTCH-332) doubling score causes by page internal anchors. by JIRA jira@apache.org
4
by JIRA jira@apache.org
[jira] Created: (NUTCH-365) Flexible URL normalization by JIRA jira@apache.org
9
by JIRA jira@apache.org
Ant tasks/build.xml file for running Nutch in debug mode? by Jp Mutch
0
by Jp Mutch
Empty "incoming anchor text" by Zhen Zhen
3
by Richard Braman-2
CrawlDatum.modifiedTime ? by Kim, Greg
0
by Kim, Greg
[jira] Created: (NUTCH-367) DistributedSearch thown ClassCastException by JIRA jira@apache.org
1
by JIRA jira@apache.org
[jira] Created: (NUTCH-364) Javascript parser creates some fairly bogus URLs by JIRA jira@apache.org
1
by JIRA jira@apache.org
I cann't fetch wml page by yin chunhui
0
by yin chunhui
Speed of reading local files by Zhen Zhen
0
by Zhen Zhen
Time of Reading Local Files by Jane Zhen
0
by Jane Zhen
[jira] Created: (NUTCH-369) StringUtil.resolveEncodingAlias is unuseful. by JIRA jira@apache.org
0
by JIRA jira@apache.org
A Problem about Nutch Plugin by yin chunhui
0
by yin chunhui
ask a problem about nutch (from China) by yin chunhui
2
by Howie Wang
Re: Any plans to move to build Nutch using Maven? by sshingler
8
by Otis Gospodnetic-2-2
Patch Available status? by chrismattmann
11
by Otis Gospodnetic-2-2
File system watching for intranets by Ben Ogle
2
by Ben Ogle
[jira] Created: (NUTCH-366) Merge URLFilters and URLNormalizers by JIRA jira@apache.org
1
by Federico Dal Maso
I use eclipse to run NutchAnalysis.java, but it meet QueryFilter RunTime error by heack
0
by heack
Help: DistributedSearch thown ClassCastException by emanihc
0
by emanihc
How could I test my modify to NutchAnalysis.jj? by heack
2
by heack
[jira] Created: (NUTCH-363) Fetcher normalizes everything at least twice by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Created: (NUTCH-339) Refactor nutch to allow fetcher improvements by JIRA jira@apache.org
18
by JIRA jira@apache.org
Ontology compile bug by Michael Wechner
3
by Michael Wechner