Nutch - Dev

This forum is an archive for the mailing list dev@nutch.apache.org. Messages posted here will be sent to this mailing list.
If you'd like to contribute to Nutch, please subscribe to the Nutch developer mailing list.
Topics (21405)
Topic / Replies / Last Post
[jira] Created: (NUTCH-309) Uses commons logging Code Guards by Parth (Jira)
6
by Parth (Jira)
[jira] Created: (NUTCH-331) Fetcher incorrectly reports task progress to tasktracker resulting in skipped URLs by Parth (Jira)
1
by Parth (Jira)
[jira] Created: (NUTCH-333) SegmentMerger and SegmentReader should use NutchJob by Parth (Jira)
1
by Parth (Jira)
0.8 release by Sami Siren-2
11
by Piotr Kosiorowski
How can i get a page content or parse data by the page's url by cookman
3
by Lourival Júnior
[jira] Created: (NUTCH-315) CrawlDbReader usage text - implementation mismatch by Parth (Jira)
1
by Parth (Jira)
[jira] Created: (NUTCH-325) UrlFilters.java throws NPE in case urlfilter.order contains Filters that are not in plugin.includes by Parth (Jira)
2
by Parth (Jira)
[jira] Created: (NUTCH-247) robot parser to restrict. by Parth (Jira)
1
by Parth (Jira)
[jira] Created: (NUTCH-310) Review Log Levels by Parth (Jira)
2
by Parth (Jira)
[jira] Created: (NUTCH-262) Summary excerpts and highlights problems by Parth (Jira)
1
by Parth (Jira)
[jira] Created: (NUTCH-251) Administration GUI by Parth (Jira)
10
by Parth (Jira)
[jira] Created: (NUTCH-74) French Analyzer Plugin by Parth (Jira)
7
by Parth (Jira)
[jira] Created: (NUTCH-86) LanguageIdentifier API enhancements by Parth (Jira)
1
by Parth (Jira)
[jira] Created: (NUTCH-246) segment size is never as big as topN or crawlDB size in a distributed deployement by Parth (Jira)
8
by Parth (Jira)
Limiting Results By Domain by Robert Sanford
0
by Robert Sanford
Scanning the database by Robert Sanford
1
by Stefan Neufeind
Indexing href attribute in links. by Robert Sanford
0
by Robert Sanford
Library for extracting text content from binaries by Jukka Zitting
4
by Jukka Zitting
Why was "prune" removed in 0.8? by Stefan Neufeind
1
by Andrzej Białecki-2
segread vs. readseg by Stefan Groschupf-2
4
by Stefan Groschupf-2
[jira] Created: (NUTCH-324) db.score.link.internal and db.score.link.external are ignored by Parth (Jira)
2
by Parth (Jira)
[jira] Created: (NUTCH-167) Observation of <META NAME="ROBOTS" CONTENT="NOARCHIVE"> directive by Parth (Jira)
2
by Parth (Jira)
[jira] Created: (NUTCH-329) CrawlDbReader processTopNJob does not set jobNames by Parth (Jira)
1
by Parth (Jira)
result comparison tool? by Stefan Groschupf-2
1
by kkrugler
tests failing by Sami Siren-2
1
by Stefan Groschupf-2
[Fwd: Re: [jira] Commented: (NUTCH-271) Meta-data per URL/site/section] by Stefan Neufeind
0
by Stefan Neufeind
[jira] Created: (NUTCH-328) commons-cli-2.0-SNAPSHOT.jar provided with nutch is not compatible with jdk 1.4 by Parth (Jira)
1
by Parth (Jira)
[jira] Created: (NUTCH-327) bin/nutch setting of log path problems on cygwin by Parth (Jira)
1
by Parth (Jira)
Changing javac.version to 1.5? by Greg Kim
1
by Andrzej Białecki-2
[jira] Created: (NUTCH-326) WordExtractor throws java.util.NoSuchElementException on some documents by Parth (Jira)
0
by Parth (Jira)
multiple query filters by Chris Stephens-3
0
by Chris Stephens-3
Distributed Matrix Computering on Hadoop by Jack.Tang
0
by Jack.Tang
log when blocked by robots.txt by Stefan Groschupf-2
1
by Piotr Kosiorowski
[jira] Created: (NUTCH-271) Meta-data per URL/site/section by Parth (Jira)
7
by Stefan Neufeind
nutch-extensionpoints not in plugin.includes by Stefan Groschupf-2
2
by Stefan Groschupf-2