Nutch - Dev

This forum is an archive for the mailing list dev@nutch.apache.org. Messages posted here will be sent to this mailing list.
If you'd like to contribute to Nutch, please subscribe to the Nutch developer mailing list.
Topics (21135)
Topic, Replies, Last Post
[jira] Closed: (NUTCH-118) FAQ link points to invalid URL by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Closed: (NUTCH-116) TestNDFS a JUnit test specifically for NDFS by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Closed: (NUTCH-114) getting number of urls and links from crawldb by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Closed: (NUTCH-110) OpenSearchServlet outputs illegal xml characters by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Closed: (NUTCH-108) tasktracker crashs when reconnecting to a new jobtracker. by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Closed: (NUTCH-102) jobtracker does not start when webapps is in src by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Closed: (NUTCH-88) Enhance ParserFactory plugin selection policy by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Closed: (NUTCH-81) Webapp only works when deployed in root by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Closed: (NUTCH-53) Parser plugin for Zip files by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Closed: (NUTCH-52) Parser plugin for MS Excel files by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Commented: (NUTCH-48) "Did you mean" query enhancement/refignment feature request by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Closed: (NUTCH-150) OutlinkExtractor extremely slow on some non-plain text by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Created: (NUTCH-379) ParseUtil does not pass through the content's URL to the ParserFactory by Tim Allison (Jira)
5
by Tim Allison (Jira)
[jira] Created: (NUTCH-391) ParseUtil logs file contents to log file when it cannot find parser by Tim Allison (Jira)
1
by Tim Allison (Jira)
[jira] Updated: (NUTCH-185) XMLParser is configurable xml parser plugin. by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Commented: (NUTCH-185) XMLParser is configurable xml parser plugin. by Tim Allison (Jira)
0
by Tim Allison (Jira)
outlink extractor finds lots of junk by AJ Chen-2
0
by AJ Chen-2
[jira] Updated: (NUTCH-185) XMLParser is configurable xml parser plugin. by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Updated: (NUTCH-185) XMLParser is configurable xml parser plugin. by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Updated: (NUTCH-162) country code "jp" is used instead of language code "ja" for Japanese by Tim Allison (Jira)
0
by Tim Allison (Jira)
Problem parsing some MS Excel & other formats (Office 2003) by tryma
8
by Aisha-2
What javacc options should I use to compile NutchAnalysis.jj? by T. Kuro Kurosaka
1
by T. Kuro Kurosaka
I modify NutchAnalysis.jj and NutchDocumentTokenizer.java to let nutch support chinese word. by heack
1
by T. Kuro Kurosaka
Issue with Boosting Fields by Paul Ramirez
3
by ian.mcnaney
[jira] Commented: (NUTCH-339) Refactor nutch to allow fetcher improvements by Tim Allison (Jira)
3
by Piotr Kosiorowski
[jira] Created: (NUTCH-383) Upgrade Nutch to Hadoop 0.7 by Tim Allison (Jira)
7
by Tim Allison (Jira)
[jira] Closed: (NUTCH-371) DeleteDuplicates should remove documents with duplicate URLs by Tim Allison (Jira)
0
by Tim Allison (Jira)
HEADS UP: rev. 464654 - upgrade to Hadoop 0.7.1 breaks data compatibility by Andrzej Białecki-2
0
by Andrzej Białecki-2
[jira] Commented: (NUTCH-185) XMLParser is configurable xml parser plugin. by Tim Allison (Jira)
1
by Rida Benjelloun
[jira] Commented: (NUTCH-224) Nutch doesn't handle Korean text at all by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Commented: (NUTCH-357) crawling simulation by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Commented: (NUTCH-357) crawling simulation by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Commented: (NUTCH-357) crawling simulation by Tim Allison (Jira)
0
by Tim Allison (Jira)
[Nutch-dev] Re: Which extension point should I extend? by xu nutch
0
by xu nutch
Nutch nightly build failure by Nutch - Dev mailing ...
0
by Nutch - Dev mailing ...