Apache Tika - Development

This forum is an archive for the mailing list tika-dev@lucene.apache.org.
Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.
Topics (17378)
Replies Last Post Views
[jira] [Created] (TIKA-797) MimeType.getExtension for application/vnd.ms-powerpoint returns ppz. I'd expect ppt. by JIRA jira@apache.org
3
by JIRA jira@apache.org
tika's beta dependency by ankush chadha
1
by Jukka Zitting
[jira] [Created] (TIKA-623) Add support for Outlook PST by JIRA jira@apache.org
31
by JIRA jira@apache.org
Tesseract OCR engine by Mattmann, Chris A (3...
4
by Alex Ott
[jira] [Created] (TIKA-724) PDF text sometimes has extra space between letters by JIRA jira@apache.org
9
by JIRA jira@apache.org
review board? by Alex Ott
1
by Mattmann, Chris A (3...
[jira] [Created] (TIKA-790) Reduce duplication between POIFSDocumentType (in OfficeParser) and POIFSContainerDetector by JIRA jira@apache.org
3
by JIRA jira@apache.org
[jira] [Created] (TIKA-794) Mime magic logic for Little16 is incorrect by JIRA jira@apache.org
2
by JIRA jira@apache.org
[jira] [Created] (TIKA-697) Tika reports the content type of AR archives as "text/plain" by JIRA jira@apache.org
13
by JIRA jira@apache.org
Possible re-opening of resolved issue TIKA-738? by john m-2
4
by Mattmann, Chris A (3...
[jira] [Created] (TIKA-723) Rotated text isn't extracted correctly from PDFs by JIRA jira@apache.org
4
by JIRA jira@apache.org
[jira] [Created] (TIKA-778) NullPointerException in tika-app, parsing PDF content by JIRA jira@apache.org
4
by JIRA jira@apache.org
[jira] [Created] (TIKA-738) Tika fails to extract text from PDF annotations by JIRA jira@apache.org
8
by JIRA jira@apache.org
[jira] [Commented] (TIKA-513) Support of Deja Vu (DjVu) format by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Created] (TIKA-789) Microsoft Project (MPP) basic support by JIRA jira@apache.org
4
by JIRA jira@apache.org
[jira] [Created] (TIKA-787) CharsetDetector text buffer is too small to correctly detect UTF-8 in HTML page by JIRA jira@apache.org
1
by JIRA jira@apache.org
Ogg Vorbis support by Nick Burch-4
1
by Nick Burch-4
[jira] [Created] (TIKA-786) Tika CLI --detect returns incorrect content-type for files with altered extensions by JIRA jira@apache.org
8
by JIRA jira@apache.org
[jira] [Created] (TIKA-784) Mimetype entry for DITA by JIRA jira@apache.org
5
by JIRA jira@apache.org
[jira] [Created] (TIKA-785) TikaCLI should include a --list-detectors option similar to --list-parsers by JIRA jira@apache.org
2
by JIRA jira@apache.org
[jira] [Created] (TIKA-734) Out of memory exception with Xlsx file less than 5 MB by JIRA jira@apache.org
12
by JIRA jira@apache.org
[jira] [Created] (TIKA-782) Add support for parsing binary data in RTF files by JIRA jira@apache.org
14
by JIRA jira@apache.org
Rich document indexing by kumar8anuj
0
by kumar8anuj
[jira] [Created] (TIKA-773) .NET version of Tika by JIRA jira@apache.org
3
by JIRA jira@apache.org
[jira] [Created] (TIKA-783) MD5 and SHA1 values posted on the download page for the .jar do not match actual computed values by JIRA jira@apache.org
1
by JIRA jira@apache.org
[jira] [Created] (TIKA-779) Detection of Microsoft Works 2000 Word Processor files by JIRA jira@apache.org
5
by JIRA jira@apache.org
Tika-605 GDAL Parser by Ramirez, Paul M (398...
1
by Nick Burch-4
[jira] [Created] (TIKA-663) JSP files data extraction failed by JIRA jira@apache.org
5
by JIRA jira@apache.org
Build failed in Jenkins: Tika-trunk #719 by Apache Jenkins Serve...
4
by Apache Jenkins Serve...
[jira] [Created] (TIKA-781) RTF parser should ignore most control words in ignore groups by JIRA jira@apache.org
4
by JIRA jira@apache.org
Build failed in Jenkins: Tika-trunk » Apache Tika core #719 by Apache Jenkins Serve...
1
by Apache Jenkins Serve...
[jira] [Created] (TIKA-780) Optimize loading of the media type registry by JIRA jira@apache.org
1
by JIRA jira@apache.org
buildbot success in ASF Buildbot on tika-trunk by buildbot
0
by buildbot
buildbot failure in ASF Buildbot on tika-trunk by buildbot
0
by buildbot
Build failed in Jenkins: Tika-trunk #717 by Apache Jenkins Serve...
1
by Apache Jenkins Serve...