Nutch - Dev

This forum is an archive for the mailing list dev@nutch.apache.org. Messages posted here will be sent to this mailing list.
If you'd like to contribute to Nutch, please subscribe to the Nutch developer mailing list.
Topics (21667)
Topic, Replies, Last Post
[jira] Created: (NUTCH-186) mapred-default.xml is over ridden by nutch-site.xml by Sebastian Nagel (Jir...
8
by Sebastian Nagel (Jir...
Searchable mailing lists on nutch.org? by Andy Liu-3
3
by Doug Cutting-2
Optimizing which links to fetch by kkrugler
1
by Doug Cutting-2
[jira] Created: (NUTCH-136) mapreduce segment generator generates 50 % less than excepted urls by Sebastian Nagel (Jir...
7
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-183) MapReduce has a series of problems concerning task-allocation to worker nodes by Sebastian Nagel (Jir...
7
by Sebastian Nagel (Jir...
Two possible extensions by Guenter, Matthias
2
by Stefan Groschupf-2
xml-parser plugin contribution by Rida Benjelloun
3
by Stefan Groschupf-2
lang identifier and nutch analyzer in trunk by Jack.Tang
15
by Andrzej Białecki-2
Nutch merge problem after fetch is aborted with hung threads. by Lukáš Vlček
0
by Lukáš Vlček
patch for nutch and nutch-daemon.sh by Zaheed Haque
0
by Zaheed Haque
Patch for NDFS's df.java by Dominik Friedrich
2
by Stefan Groschupf-2
protocol-httpclient; maximum total connections by orkunt.sabuncu
1
by Stefan Groschupf-2
[jira] Created: (NUTCH-127) uncorrect values using -du, or ls does not return items by Sebastian Nagel (Jir...
2
by Sebastian Nagel (Jir...
Using org.apache.nutch.indexer.IndexMerger (Nutch 0.7) by Chun Wei Ho
0
by Chun Wei Ho
[jira] Closed: (NUTCH-45) Log corrupt segments in SegmentMergeTool by Sebastian Nagel (Jir...
0
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-68) A tool to generate arbitrary fetchlists by Sebastian Nagel (Jir...
2
by Sebastian Nagel (Jir...
number of block duplicated by Stefan Groschupf-2
5
by Pashabhai
[jira] Created: (NUTCH-182) Log when db.max configuration limits reached by Sebastian Nagel (Jir...
1
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-87) Efficient site-specific crawling for a large number of sites by Sebastian Nagel (Jir...
14
by Sebastian Nagel (Jir...
Authentication / Content-type by Thushara Wijeratna
0
by Thushara Wijeratna
Generating multiple fetchlists between updates by Andrzej Białecki-2
1
by Doug Cutting-2
[jira] Created: (NUTCH-176) Using -dir: creates an error, when the directory already exists by Sebastian Nagel (Jir...
1
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-177) Default installation seems to produce working entity of nutch by Sebastian Nagel (Jir...
3
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-179) Proposition: Enable Nutch to use a parser plugin not just based on content type by Sebastian Nagel (Jir...
4
by Sebastian Nagel (Jir...
[jira] Created: (NUTCH-102) jobtracker does not start when webapps is in src by Sebastian Nagel (Jir...
3
by Sebastian Nagel (Jir...
Pagination for the Web App by Tyrell Perera-2
0
by Tyrell Perera-2
Class MultiProperties by Rida Benjelloun
0
by Rida Benjelloun
Re: Per-page crawling policy by Ken Krugler-3
2
by kkrugler
question/suggestion on nutch file format by Tom-28-2
0
by Tom-28-2
[jira] Created: (NUTCH-181) mapred.local.dir temp dir. space allocation limited by smallest area by Sebastian Nagel (Jir...
0
by Sebastian Nagel (Jir...
Seperating mapred/ndfs and nutch search engine by Dominik Friedrich
0
by Dominik Friedrich
Suggestions on plugin repository by Thomas Jaeger
2
by Thomas Jaeger
[jira] Created: (NUTCH-174) Problem encountered with ant during compilation by Sebastian Nagel (Jir...
1
by Sebastian Nagel (Jir...
Problem with latest SVN during reduce phase by Byron Miller-2
8
by Byron Miller-2
java.io.EOFException ... at org.apache.nutch.ndfs.DataNode$DataXceiver.run... by Rafi Iz
0
by Rafi Iz