Apache Tika - Development

This forum is an archive for the mailing list tika-dev@lucene.apache.org. Messages posted here will be sent to this mailing list.
Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.
Topics (24406)
Replies Last Post Views
[jira] [Created] (TIKA-794) Mime magic logic for Little16 is incorrect by Tim Allison (Jira)
2
by Tim Allison (Jira)
[jira] [Created] (TIKA-697) Tika reports the content type of AR archives as "text/plain" by Tim Allison (Jira)
13
by Tim Allison (Jira)
Possible re-opening of resolved issue TIKA-738? by john m-2
4
by Mattmann, Chris A (3...
[jira] [Created] (TIKA-723) Rotated text isn't extracted correctly from PDFs by Tim Allison (Jira)
4
by Tim Allison (Jira)
[jira] [Created] (TIKA-778) NullPointerException in tika-app, parsing PDF content by Tim Allison (Jira)
4
by Tim Allison (Jira)
[jira] [Created] (TIKA-738) Tika fails to extract text from PDF annotations by Tim Allison (Jira)
8
by Tim Allison (Jira)
[jira] [Commented] (TIKA-513) Support of Deja Vu (DjVu) format by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] [Created] (TIKA-789) Microsoft Project (MPP) basic support by Tim Allison (Jira)
4
by Tim Allison (Jira)
[jira] [Created] (TIKA-787) CharsetDetector text buffer is too small to correctly detect UTF-8 in HTML page by Tim Allison (Jira)
1
by Tim Allison (Jira)
Ogg Vorbis support by Nick Burch-4
1
by Nick Burch-4
[jira] [Created] (TIKA-786) Tika CLI --detect returns incorrect content-type for files with altered extensions by Tim Allison (Jira)
8
by Tim Allison (Jira)
[jira] [Created] (TIKA-784) Mimetype entry for DITA by Tim Allison (Jira)
5
by Tim Allison (Jira)
[jira] [Created] (TIKA-785) TikaCLI should include a --list-detectors option similar to --list-parsers by Tim Allison (Jira)
2
by Tim Allison (Jira)
[jira] [Created] (TIKA-734) Out of memory exception with Xlsx file less than 5 MB by Tim Allison (Jira)
12
by Tim Allison (Jira)
[jira] [Created] (TIKA-782) Add support for parsing binary data in RTF files by Tim Allison (Jira)
14
by Tim Allison (Jira)
Rich document indexing by kumar8anuj
0
by kumar8anuj
[jira] [Created] (TIKA-773) .NET version of Tika by Tim Allison (Jira)
3
by Tim Allison (Jira)
[jira] [Created] (TIKA-783) MD5 and SHA1 values posted on the download page for the .jar do not match actual computed values by Tim Allison (Jira)
1
by Tim Allison (Jira)
[jira] [Created] (TIKA-779) Detection of Microsoft Works 2000 Word Processor files by Tim Allison (Jira)
5
by Tim Allison (Jira)
Tika-605 GDAL Parser by Ramirez, Paul M (398...
1
by Nick Burch-4
[jira] [Created] (TIKA-663) JSP files data extraction failed by Tim Allison (Jira)
5
by Tim Allison (Jira)
Build failed in Jenkins: Tika-trunk #719 by Apache Jenkins Serve...
4
by Apache Jenkins Serve...
[jira] [Created] (TIKA-781) RTF parser should ignore most control words in ignore groups by Tim Allison (Jira)
4
by Tim Allison (Jira)
Build failed in Jenkins: Tika-trunk » Apache Tika core #719 by Apache Jenkins Serve...
1
by Apache Jenkins Serve...
[jira] [Created] (TIKA-780) Optimize loading of the media type registry by Tim Allison (Jira)
1
by Tim Allison (Jira)
buildbot success in ASF Buildbot on tika-trunk by buildbot
0
by buildbot
buildbot failure in ASF Buildbot on tika-trunk by buildbot
0
by buildbot
Build failed in Jenkins: Tika-trunk #717 by Apache Jenkins Serve...
1
by Apache Jenkins Serve...
Updating CHANGES.txt? by Nick Burch-4
13
by Michael McCandless-2
[Shameless Self Promotion] Tika in Action permanent discount code by Chris Mattmann-3
0
by Chris Mattmann-3
[jira] [Created] (TIKA-777) RTF parser incorrectly applies fonts to complete group by Tim Allison (Jira)
3
by Tim Allison (Jira)
[jira] [Resolved] (TIKA-529) IBM420 charset detection's isLamAlef is allocation-happy by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] [Assigned] (TIKA-529) IBM420 charset detection's isLamAlef is allocation-happy by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] [Created] (TIKA-679) Proposal for PRT Parser by Tim Allison (Jira)
31
by Tim Allison (Jira)
[ANNOUNCE] Apache Tika 1.0 released by Mattmann, Chris A (3...
1
by Zabrane Mickael