Nutch - Dev

This forum is an archive for the mailing list dev@nutch.apache.org. Messages posted here will be sent to this mailing list.
If you'd like to contribute to Nutch, please subscribe to the Nutch developer mailing list.
Topics (20211)
Replies Last Post Views
[jira] Created: (NUTCH-176) Using -dir: creates an error, when the directory already exists by JIRA jira@apache.org
1
by JIRA jira@apache.org
[jira] Created: (NUTCH-177) Default installation seems to produce working entity of nutch by JIRA jira@apache.org
3
by JIRA jira@apache.org
[jira] Created: (NUTCH-179) Proposition: Enable Nutch to use a parser plugin not just based on content type by JIRA jira@apache.org
4
by JIRA jira@apache.org
[jira] Created: (NUTCH-102) jobtracker does not start when webapps is in src by JIRA jira@apache.org
3
by JIRA jira@apache.org
Pagination for the Web App by Tyrell Perera-2
0
by Tyrell Perera-2
Class MultiProperties by Rida Benjelloun
0
by Rida Benjelloun
Re: Per-page crawling policy by Ken Krugler-3
2
by kkrugler
question/suggestion on nutch file format by Tom-28-2
0
by Tom-28-2
[jira] Created: (NUTCH-181) mapred.local.dir temp dir. space allocation limited by smallest area by JIRA jira@apache.org
0
by JIRA jira@apache.org
Seperating mapred/ndfs and nutch search engine by Dominik Friedrich
0
by Dominik Friedrich
Suggestions on plugin repository by Thomas Jaeger
2
by Thomas Jaeger
[jira] Created: (NUTCH-174) Problem encountered with ant during compilation by JIRA jira@apache.org
1
by JIRA jira@apache.org
Problem with latest SVN during reduce phase by Byron Miller-2
8
by Byron Miller-2
java.io.EOFException ... at org.apache.nutch.ndfs.DataNode$DataXceiver.run... by Rafi Iz
0
by Rafi Iz
Nutch/Lucene Document Model by Chih How Bong
0
by Chih How Bong
NutchQuery adding non required Terms by Stefan Groschupf-2
3
by Doug Cutting-2
MapReduce and segment merging by Mike Alulin
5
by Byron Miller-2
Where is org.apache.nutch.protocol.http.api.HttpBase? by Jack.Tang
1
by Stefan Groschupf-2
quit the maillist by Su Yan
0
by Su Yan
Normalizing URLs with anchors by kkrugler
3
by luti
Does the data size in 0.8 vesion should be much smaller than in version 0.7? by Rafi Iz
0
by Rafi Iz
Bug - Freezes if the last line in the url file does not finish with EOL symbol by Mike Alulin
0
by Mike Alulin
weird fetcher behavior by Florent Gluck
2
by Florent Gluck
Crawl and parse exceptions by Matt Zytaruk
3
by Matt Zytaruk
[jira] Created: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop by JIRA jira@apache.org
8
by JIRA jira@apache.org
PluginManifestParser should be NutchConfigurable by Jack.Tang
0
by Jack.Tang
[jira] Created: (NUTCH-159) Specify temp/working directory for crawl by JIRA jira@apache.org
3
by JIRA jira@apache.org
OpenOffice and Excel parsers by Rida Benjelloun
1
by Andrzej Białecki-2
Reporter interface by Andrew McNabb
14
by Gal Nitzan
XmlInputFortmat ? by Jack.Tang
1
by Doug Cutting-2
[jira] Created: (NUTCH-162) country code "jp" is used instead of language code "ja" for Japanese by JIRA jira@apache.org
3
by JIRA jira@apache.org
ParserFactory test fail by Stefan Groschupf-2
2
by Stefan Groschupf-2
NDFS / map tasks by Byron Miller-2
0
by Byron Miller-2
[jira] Created: (NUTCH-153) TextParser is only supposed to parse plain text, but if given postscript, it can take hours and then fail by JIRA jira@apache.org
8
by JIRA jira@apache.org
[jira] Created: (NUTCH-168) setting http.content.limit to -1 seems to break text parsing on some files by JIRA jira@apache.org
0
by JIRA jira@apache.org