Nutch - Dev

This forum is an archive for the mailing list dev@nutch.apache.org. Messages posted here will be sent to this mailing list.
If you'd like to contribute to Nutch, please subscribe to the Nutch developer mailing list.
Topics (18300)
Topic, Replies, Last Post
Welcome Chris Mattmann as Nutch committer by Andrzej Białecki-2
1
by chrismattmann
[jira] Updated: (NUTCH-251) Administration GUI by JIRA jira@apache.org
1
by Zaheed Haque
[jira] Closed: (NUTCH-331) Fetcher incorrectly reports task progress to tasktracker resulting in skipped URLs by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Commented: (NUTCH-331) Fetcher incorrectly reports task progress to tasktracker resulting in skipped URLs by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Commented: (NUTCH-331) Fetcher incorrectly reports task progress to tasktracker resulting in skipped URLs by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Created: (NUTCH-395) Increase fetching speed by JIRA jira@apache.org
16
by AJ Chen-2
Nutch - Hadoop error by Armel T. Nene-2
0
by Armel T. Nene-2
Nutch folder configuration by Armel T. Nene-2
1
by Armel T. Nene-2
[jira] Commented: (NUTCH-251) Administration GUI by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Resolved: (NUTCH-305) Update crawl and url filter lists to exclude jpeg|JPEG|bmp|BMP by JIRA jira@apache.org
0
by JIRA jira@apache.org
Nutch sessions cookies https by g.marras
0
by g.marras
[jira] Resolved: (NUTCH-362) Remove parse-text from unsupported filetypes in parse-plugins.xml by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Closed: (NUTCH-349) Port Nutch to use Hadoop Text instead of UTF8 by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Created: (NUTCH-380) Nutch does not run/build against Hadoop 0.6 by JIRA jira@apache.org
2
by JIRA jira@apache.org
[jira] Created: (NUTCH-405) Content object is not properly initialized in map method of ParseSegment by JIRA jira@apache.org
1
by JIRA jira@apache.org
Nutch crawl a Application Server Authentication by g.marras
0
by g.marras
Nutch HTTPS & Sessions by g.marras
0
by g.marras
[jira] Commented: (NUTCH-251) Administration GUI by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Commented: (NUTCH-185) XMLParser is configurable xml parser plugin. by JIRA jira@apache.org
1
by Armel T. Nene-2
[jira] Commented: (NUTCH-251) Administration GUI by JIRA jira@apache.org
0
by JIRA jira@apache.org
Errors in RegexURLFilter by scott green
2
by scott green
[jira] Updated: (NUTCH-92) DistributedSearch incorrectly scores results by JIRA jira@apache.org
0
by JIRA jira@apache.org
Can I rewrite org.apache.nutch.parse.msword.extractText(InputStream input) like this by TKDD
0
by TKDD
[jira] Created: (NUTCH-403) Make URL filtering optional in Generator by JIRA jira@apache.org
4
by JIRA jira@apache.org
[jira] Commented: (NUTCH-273) When a page is redirected, the original url is NOT updated. by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Created: (NUTCH-404) Fix LinkDB Usage - implementation mismatch by JIRA jira@apache.org
1
by JIRA jira@apache.org
[jira] Created: (NUTCH-388) nutch-default.xml has outdated example for urlfilter.order by JIRA jira@apache.org
1
by JIRA jira@apache.org
[jira] Commented: (NUTCH-261) Multi Language Support by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Commented: (NUTCH-289) CrawlDatum should store IP address by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Updated: (NUTCH-289) CrawlDatum should store IP address by JIRA jira@apache.org
0
by JIRA jira@apache.org
File Protocol by Armel T. Nene-2
0
by Armel T. Nene-2
[jira] Closed: (NUTCH-378) MetaWrapper decorator by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Created: (NUTCH-401) Hardcoded /tmp directory in SegmentReader by JIRA jira@apache.org
2
by JIRA jira@apache.org
implement thai language analyzer in nutch by sanjeev-5
18
by sanjeev-5
Nutch requires now Java 1.5 by Andrzej Białecki-2
0
by Andrzej Białecki-2