Apache Tika - Development

This forum is an archive for the mailing list tika-dev@lucene.apache.org. Messages posted here will be sent to this mailing list.
Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.
Topics (21631)
Topic, Replies, Last Post
[jira] Created: (TIKA-271) secure-processing not supported by some JAXP implementations by JIRA jira@apache.org
3
by JIRA jira@apache.org
[VOTE] Apache Tika 0.5 release candidate #1 by Mattmann, Chris A (3...
10
by Jukka Zitting
[jira] Created: (TIKA-325) tika-parent/pom.xml missing <inceptionYear>2007</inceptionYear> by JIRA jira@apache.org
3
by JIRA jira@apache.org
[jira] Created: (TIKA-326) Map javax.imageio.IIOException to TikaException by JIRA jira@apache.org
1
by JIRA jira@apache.org
[jira] Created: (TIKA-320) Allow disabling language detection in AutoDetectParser by JIRA jira@apache.org
2
by JIRA jira@apache.org
[jira] Created: (TIKA-322) Improve encoding detection speed and accuracy by JIRA jira@apache.org
1
by JIRA jira@apache.org
[jira] Created: (TIKA-209) Language detection is weak. by JIRA jira@apache.org
12
by JIRA jira@apache.org
Build failed in Hudson: Tika-trunk #217 by Apache Hudson Server
1
by Apache Hudson Server
Build failed in Hudson: Tika-trunk » Apache Tika parent #217 by Apache Hudson Server
1
by Apache Hudson Server
Hudson build became unstable: Tika-trunk #213 by Apache Hudson Server
3
by Apache Hudson Server
Hudson build became unstable: Tika-trunk » Apache Tika parsers #213 by Apache Hudson Server
3
by Apache Hudson Server
Build Unstable by Mattmann, Chris A (3...
0
by Mattmann, Chris A (3...
Parse context - class or map? by Jukka Zitting
5
by Jukka Zitting
Tika facade - static or not by Jukka Zitting
8
by Mattmann, Chris A (3...
[jira] Created: (TIKA-313) patch: ODF improvements for svg:desc, presentation notes by JIRA jira@apache.org
2
by JIRA jira@apache.org
[jira] Created: (TIKA-315) Tika appears to skip over an entire section of a Microsoft Word Document by JIRA jira@apache.org
4
by JIRA jira@apache.org
[jira] Created: (TIKA-319) HtmlParser - use encoding hint only if charset is supported by JIRA jira@apache.org
1
by JIRA jira@apache.org
[jira] Commented: (TIKA-94) Speech recognition by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Created: (TIKA-275) Parse context by JIRA jira@apache.org
1
by JIRA jira@apache.org
0.5 release by Mattmann, Chris A (3...
0
by Mattmann, Chris A (3...
[jira] Created: (TIKA-314) Initial support for JPEG EXIF metadata extraction by JIRA jira@apache.org
8
by JIRA jira@apache.org
Free live video streaming of ApacheCon US 2009 by Michael McCandless-2
1
by Israel Ekpo
Re: MarkUnsupportedException by Jukka Zitting
0
by Jukka Zitting
[jira] Created: (TIKA-187) Extract the summary.getCategory() from MSOffice documents by JIRA jira@apache.org
2
by JIRA jira@apache.org
[jira] Created: (TIKA-300) rename openoffice.. parser classes to odf.. by JIRA jira@apache.org
1
by JIRA jira@apache.org
[jira] Created: (TIKA-312) TikaCLI can't print metadata by JIRA jira@apache.org
2
by JIRA jira@apache.org
[jira] Created: (TIKA-301) patch: embedded ODF and office:annotation by JIRA jira@apache.org
2
by JIRA jira@apache.org
[jira] Created: (TIKA-302) patch: initial support for ePUB by JIRA jira@apache.org
4
by JIRA jira@apache.org
[jira] Created: (TIKA-304) HtmlParser could be easier to subclass by JIRA jira@apache.org
5
by JIRA jira@apache.org
[jira] Created: (TIKA-305) XHTML href attributes end up in the wrong namespace by JIRA jira@apache.org
2
by JIRA jira@apache.org
[jira] Created: (TIKA-303) XHTMLContentHandler mishandles headers by JIRA jira@apache.org
6
by JIRA jira@apache.org
[jira] Created: (TIKA-306) patch: OOXMLParserTest uses OpenOfficeParser by JIRA jira@apache.org
2
by JIRA jira@apache.org
[jira] Created: (TIKA-287) HtmlParser should resolve relative paths in <a href="xxx"> elements by JIRA jira@apache.org
8
by JIRA jira@apache.org
[jira] Created: (TIKA-311) Broken handling of <a name="..."/> tags by JIRA jira@apache.org
1
by JIRA jira@apache.org
FYI: NekoHTML/Xerces dependency replaced with TagSoup by Jukka Zitting
1
by kkrugler