Nutch - Dev

This forum is an archive for the mailing list dev@nutch.apache.org. Messages posted here will be sent to this mailing list.
If you'd like to contribute to Nutch, please subscribe to the Nutch developer mailing list.
Topics (20290)
Replies Last Post Views
takes the URI info, Content, headers, etc. into a MySQL database during crawl. by xingjian
2
by xingjian
[jira] Created: (NUTCH-548) Move URLNormalizer from Outlink to ParseOutputFormat by JIRA jira@apache.org
12
by JIRA jira@apache.org
[jira] Created: (NUTCH-538) Delete unused classes under o.a.n.util by JIRA jira@apache.org
5
by JIRA jira@apache.org
Hudson build is back to normal: Nutch-Nightly #262 by hudson-6
0
by hudson-6
wiki faq by misc
0
by misc
Generator speed by misc
0
by misc
Auto complete by misc
0
by misc
Can we add this to nutch? by misc
1
by Dennis Kubes-2
EOF exception while fetching by Ned Rockson-3
0
by Ned Rockson-3
Build failed in Hudson: Nutch-Nightly #261 by hudson-6
1
by Doğacan Güney-3
[jira] Created: (NUTCH-547) Redirection handling: YahooSlurp's algorithm by JIRA jira@apache.org
15
by JIRA jira@apache.org
[jira] Created: (NUTCH-494) FindBugs: CrawlDbReader and DeleteDuplicates by JIRA jira@apache.org
4
by JIRA jira@apache.org
Usage of mapred-default.xml is deprecated in hadoop0.15.0 by Ned Rockson-3
0
by Ned Rockson-3
[jira] Created: (NUTCH-465) I download nutch 0.9 used tar zxvf nutch-0.9.tar.gz at last A lone zero block by JIRA jira@apache.org
2
by JIRA jira@apache.org
[jira] Created: (NUTCH-411) Parse ignores meta refresh redirection by JIRA jira@apache.org
3
by JIRA jira@apache.org
db.ignore.internal.links and ranking algorithms by Rajasekar Karthik
4
by Rajasekar Karthik
NullPointerException in FetchedSegments.getSummary() by John Doe-37
0
by John Doe-37
[jira] Commented: (NUTCH-572) Scoring and redirected Urls by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Commented: (NUTCH-356) Plugin repository cache can lead to memory leak by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Commented: (NUTCH-572) Scoring and redirected Urls by JIRA jira@apache.org
0
by JIRA jira@apache.org
Tika API by Ned Rockson-3
5
by Ned Rockson-3
JIRA emails and Nutch by Dennis Kubes-2
4
by Dennis Kubes-2
adding dmoz meta data to index. by ned@bcit
1
by Sebastian Steinmetz
MD5 vs TextProfile Signature by Rajasekar Karthik
0
by Rajasekar Karthik
[jira] Issue Comment Edited: (NUTCH-356) Plugin repository cache can lead to memory leak by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Issue Comment Edited: (NUTCH-356) Plugin repository cache can lead to memory leak by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Issue Comment Edited: (NUTCH-356) Plugin repository cache can lead to memory leak by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Issue Comment Edited: (NUTCH-356) Plugin repository cache can lead to memory leak by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Issue Comment Edited: (NUTCH-356) Plugin repository cache can lead to memory leak by JIRA jira@apache.org
0
by JIRA jira@apache.org
How does the Nutch-0.9 read the configuration file? by Xin Zhang-2
1
by Tranquil
How to extract specified information from html? by jqq
4
by jqq
Nutch automatically deleting sites from search results by Rajasekar Karthik
0
by Rajasekar Karthik
plugin analyzer by Robert Benea
4
by Rajasekar Karthik
When is the Clause.getQuery().getBoost == 0? by Ned Rockson-3
1
by Andrzej Białecki-2
Next move with JIRA ticket by Ned Rockson-3
2
by Ned Rockson