Nutch - Dev

This forum is an archive for the mailing list dev@nutch.apache.org. Messages posted here will be sent to this mailing list.
If you'd like to contribute to Nutch, please subscribe to the Nutch developer mailing list.
Topics (18768)
Topic / Replies / Last Post
[jira] Created: (NUTCH-181) mapred.local.dir temp dir. space allocation limited by smallest area by JIRA jira@apache.org
0
by JIRA jira@apache.org
Seperating mapred/ndfs and nutch search engine by Dominik Friedrich
0
by Dominik Friedrich
Suggestions on plugin repository by Thomas Jaeger
2
by Thomas Jaeger
[jira] Created: (NUTCH-174) Problem encountered with ant during compilation by JIRA jira@apache.org
1
by JIRA jira@apache.org
Problem with latest SVN during reduce phase by Byron Miller-2
8
by Byron Miller-2
java.io.EOFException ... at org.apache.nutch.ndfs.DataNode$DataXceiver.run... by Rafi Iz
0
by Rafi Iz
Nutch/Lucene Document Model by Chih How Bong
0
by Chih How Bong
NutchQuery adding non required Terms by Stefan Groschupf-2
3
by Doug Cutting-2
MapReduce and segment merging by Mike Alulin
5
by Byron Miller-2
Where is org.apache.nutch.protocol.http.api.HttpBase? by Jack.Tang
1
by Stefan Groschupf-2
quit the maillist by Su Yan
0
by Su Yan
Normalizing URLs with anchors by kkrugler
3
by luti
Does the data size in 0.8 vesion should be much smaller than in version 0.7? by Rafi Iz
0
by Rafi Iz
Bug - Freezes if the last line in the url file does not finish with EOL symbol by Mike Alulin
0
by Mike Alulin
weird fetcher behavior by Florent Gluck
2
by Florent Gluck
Crawl and parse exceptions by Matt Zytaruk
3
by Matt Zytaruk
[jira] Created: (NUTCH-151) CommandRunner can hang after the main thread exec is finished and has inefficient busy loop by JIRA jira@apache.org
8
by JIRA jira@apache.org
PluginManifestParser should be NutchConfigurable by Jack.Tang
0
by Jack.Tang
[jira] Created: (NUTCH-159) Specify temp/working directory for crawl by JIRA jira@apache.org
3
by JIRA jira@apache.org
OpenOffice and Excel parsers by Rida Benjelloun
1
by Andrzej Białecki-2
Reporter interface by Andrew McNabb
14
by Gal Nitzan
XmlInputFortmat ? by Jack.Tang
1
by Doug Cutting-2
[jira] Created: (NUTCH-162) country code "jp" is used instead of language code "ja" for Japanese by JIRA jira@apache.org
3
by JIRA jira@apache.org
ParserFactory test fail by Stefan Groschupf-2
2
by Stefan Groschupf-2
NDFS / map tasks by Byron Miller-2
0
by Byron Miller-2
[jira] Created: (NUTCH-153) TextParser is only supposed to parse plain text, but if given postscript, it can take hours and then fail by JIRA jira@apache.org
8
by JIRA jira@apache.org
[jira] Created: (NUTCH-168) setting http.content.limit to -1 seems to break text parsing on some files by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Created: (NUTCH-160) Use standard Java Regex library rather than org.apache.oro.text.regex by JIRA jira@apache.org
4
by JIRA jira@apache.org
Re: svn commit: r367137 - in /lucene/nutch/trunk/src: java/org/apache/nutch/net/protocols/ plugin/ plugin/lib-http/ plugin/lib-http/src/ plugin/lib-http/src/java/ plugin/lib-http/src/java/org/ plugin/lib-http/src/java/org/apache/ plugin/lib-http/src/ by Jérôme Charron
0
by Jérôme Charron
wiki:commandline options classpaths by Jerry Russell
1
by Otis Gospodnetic-2-2
why index not in segment anymore by Stefan Groschupf-2
1
by Doug Cutting-2
Re: svn commit: r367137 - in /lucene/nutch/trunk/src: java/org/apache/nutch/net/protocols/ plugin/ plugin/lib-http/ plugin/lib-http/src/ plugin/lib-http/src/java/ plugin/lib-http/src/java/org/ plugin/lib-http/src/java/org/apache/ plugin/lib-http/src/java/ by Doug Cutting-2
0
by Doug Cutting-2
test suite fails? by Stefan Groschupf-2
2
by Jérôme Charron
no static NutchConf by Stefan Groschupf-2
26
by Stefan Groschupf-2
injection infinite loop by Andy Liu-3
1
by Stefan Groschupf-2