Apache Tika - Development

This forum is an archive for the mailing list tika-dev@lucene.apache.org. Messages posted here will be sent to this mailing list.
Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.
Topics (24095)
Replies Last Post Views
[jira] [Commented] (TIKA-241) Rar archive support by Mihir Sharma (Jira)
0
by Mihir Sharma (Jira)
[jira] [Commented] (TIKA-241) Rar archive support by Mihir Sharma (Jira)
0
by Mihir Sharma (Jira)
Re: svn commit: r1173743 - /tika/trunk/tika-bundle/pom.xml by Jukka Zitting
0
by Jukka Zitting
Build failed in Jenkins: Tika-trunk #635 by Apache Jenkins Serve...
1
by Apache Jenkins Serve...
Build failed in Jenkins: Tika-trunk » Apache Tika core #635 by Apache Jenkins Serve...
1
by Apache Jenkins Serve...
[jira] [Created] (TIKA-726) Provide a way to distinguish generic parse error and parse error due to unknown/wrong decryption key by Mihir Sharma (Jira)
1
by Mihir Sharma (Jira)
[jira] [Created] (TIKA-716) Upgrade apache-Mime4J to Version 0.7 by Mihir Sharma (Jira)
1
by Mihir Sharma (Jira)
[jira] [Created] (TIKA-640) RFC822Parser should configure Mime4j not to fail reading mails containing more than 1000 chars in one headers text (even if folded) by Mihir Sharma (Jira)
7
by Mihir Sharma (Jira)
[jira] [Created] (TIKA-719) Concurrent usage of HtmlParser causes infinite loop in HashMap by Mihir Sharma (Jira)
3
by Mihir Sharma (Jira)
[jira] [Created] (TIKA-708) NPE Parsing MS Word 12.0.0 by Mihir Sharma (Jira)
7
by Mihir Sharma (Jira)
[jira] [Created] (TIKA-706) NPE Parsing MS PowerPoint 97-2003 by Mihir Sharma (Jira)
5
by Mihir Sharma (Jira)
[jira] [Created] (TIKA-707) IllegalArgumentException Parsing MS Word 97 - 2003 by Mihir Sharma (Jira)
6
by Mihir Sharma (Jira)
Media container formats? by Nick Burch-4
0
by Nick Burch-4
Build failed in Jenkins: Tika-trunk #629 by Apache Jenkins Serve...
1
by Apache Jenkins Serve...
Build failed in Jenkins: Tika-trunk » Apache Tika core #629 by Apache Jenkins Serve...
2
by Apache Jenkins Serve...
[jira] Created: (TIKA-546) Add ability to create language profiles to tika-app by Mihir Sharma (Jira)
15
by Mihir Sharma (Jira)
[jira] [Commented] (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. by Mihir Sharma (Jira)
0
by Mihir Sharma (Jira)
[jira] [Commented] (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. by Mihir Sharma (Jira)
0
by Mihir Sharma (Jira)
[jira] [Commented] (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. by Mihir Sharma (Jira)
0
by Mihir Sharma (Jira)
[jira] [Commented] (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. by Mihir Sharma (Jira)
0
by Mihir Sharma (Jira)
[jira] [Commented] (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. by Mihir Sharma (Jira)
0
by Mihir Sharma (Jira)
[jira] [Commented] (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. by Mihir Sharma (Jira)
0
by Mihir Sharma (Jira)
[jira] [Created] (TIKA-692) TikaCLI -x or -h on a Word doc sometimes adds newline after </b> tag by Mihir Sharma (Jira)
19
by Mihir Sharma (Jira)
[jira] [Created] (TIKA-688) Enhance content-type detector to recognize almost plain text by Mihir Sharma (Jira)
2
by Mihir Sharma (Jira)
[jira] Created: (TIKA-603) Tika 0.9 compiles fine but failed a unit test by Mihir Sharma (Jira)
11
by Mihir Sharma (Jira)
[jira] Created: (TIKA-598) Update HDF parser and NetCDF parser to emit minimal XHTML by Mihir Sharma (Jira)
1
by Mihir Sharma (Jira)
[jira] [Created] (TIKA-691) java.lang.ArrayIndexOutOfBoundsException by MS Word CDF V2 Document by Mihir Sharma (Jira)
6
by Mihir Sharma (Jira)
[jira] [Commented] (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. by Mihir Sharma (Jira)
0
by Mihir Sharma (Jira)
Request for patch review - TIKA-431 by kkrugler
0
by kkrugler
[jira] [Updated] (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. by Mihir Sharma (Jira)
0
by Mihir Sharma (Jira)
1.0 RC in next 2 weeks by Mattmann, Chris A (3...
8
by Michael McCandless-2
[jira] Created: (TIKA-594) Upgrade Tika to pdfbox 1.4.0 by Mihir Sharma (Jira)
6
by Mihir Sharma (Jira)
[jira] [Created] (TIKA-683) RTF Parser issues with non european characters by Mihir Sharma (Jira)
28
by Mihir Sharma (Jira)
[jira] [Created] (TIKA-666) Unable to extract content from RTF files by Mihir Sharma (Jira)
5
by Mihir Sharma (Jira)
[jira] [Commented] (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. by Mihir Sharma (Jira)
0
by Mihir Sharma (Jira)