Apache Tika - Development

This forum is an archive for the mailing list tika-dev@lucene.apache.org. Messages posted here will be sent to this mailing list.
Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.
Topics (21184)
Replies Last Post Views
Possible re-opening of resolved issue TIKA-738? by john m-2
4
by Mattmann, Chris A (3...
[jira] [Created] (TIKA-723) Rotated text isn't extracted correctly from PDFs by JIRA jira@apache.org
4
by JIRA jira@apache.org
[jira] [Created] (TIKA-778) NullPointerException in tika-app, parsing PDF content by JIRA jira@apache.org
4
by JIRA jira@apache.org
[jira] [Created] (TIKA-738) Tika fails to extract text from PDF annotations by JIRA jira@apache.org
8
by JIRA jira@apache.org
[jira] [Commented] (TIKA-513) Support of Deja Vu (DjVu) format by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Created] (TIKA-789) Microsoft Project (MPP) basic support by JIRA jira@apache.org
4
by JIRA jira@apache.org
[jira] [Created] (TIKA-787) CharsetDetector text buffer is too small to correctly detect UTF-8 in HTML page by JIRA jira@apache.org
1
by JIRA jira@apache.org
Ogg Vorbis support by Nick Burch-4
1
by Nick Burch-4
[jira] [Created] (TIKA-786) Tika CLI --detect returns incorrect content-type for files with altered extensions by JIRA jira@apache.org
8
by JIRA jira@apache.org
[jira] [Created] (TIKA-784) Mimetype entry for DITA by JIRA jira@apache.org
5
by JIRA jira@apache.org
[jira] [Created] (TIKA-785) TikaCLI should include a --list-detectors option similar to --list-parsers by JIRA jira@apache.org
2
by JIRA jira@apache.org
[jira] [Created] (TIKA-734) Out of memory exception with Xlsx file less than 5 MB by JIRA jira@apache.org
12
by JIRA jira@apache.org
[jira] [Created] (TIKA-782) Add support for parsing binary data in RTF files by JIRA jira@apache.org
14
by JIRA jira@apache.org
Rich document indexing by kumar8anuj
0
by kumar8anuj
[jira] [Created] (TIKA-773) .NET version of Tika by JIRA jira@apache.org
3
by JIRA jira@apache.org
[jira] [Created] (TIKA-783) MD5 and SHA1 values posted on the download page for the .jar do not match actual computed values by JIRA jira@apache.org
1
by JIRA jira@apache.org
[jira] [Created] (TIKA-779) Detection of Microsoft Works 2000 Word Processor files by JIRA jira@apache.org
5
by JIRA jira@apache.org
Tika-605 GDAL Parser by Ramirez, Paul M (398...
1
by Nick Burch-4
[jira] [Created] (TIKA-663) JSP files data extraction failed by JIRA jira@apache.org
5
by JIRA jira@apache.org
Build failed in Jenkins: Tika-trunk #719 by Apache Jenkins Serve...
4
by Apache Jenkins Serve...
[jira] [Created] (TIKA-781) RTF parser should ignore most control words in ignore groups by JIRA jira@apache.org
4
by JIRA jira@apache.org
Build failed in Jenkins: Tika-trunk » Apache Tika core #719 by Apache Jenkins Serve...
1
by Apache Jenkins Serve...
[jira] [Created] (TIKA-780) Optimize loading of the media type registry by JIRA jira@apache.org
1
by JIRA jira@apache.org
buildbot success in ASF Buildbot on tika-trunk by buildbot
0
by buildbot
buildbot failure in ASF Buildbot on tika-trunk by buildbot
0
by buildbot
Build failed in Jenkins: Tika-trunk #717 by Apache Jenkins Serve...
1
by Apache Jenkins Serve...
Updating CHANGES.txt? by Nick Burch-4
13
by Michael McCandless-2
[Shameless Self Promotion] Tika in Action permanent discount code by Chris Mattmann-3
0
by Chris Mattmann-3
[jira] [Created] (TIKA-777) RTF parser incorrectly applies fonts to complete group by JIRA jira@apache.org
3
by JIRA jira@apache.org
[jira] [Resolved] (TIKA-529) IBM420 charset detection's isLamAlef is allocation-happy by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Assigned] (TIKA-529) IBM420 charset detection's isLamAlef is allocation-happy by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Created] (TIKA-679) Proposal for PRT Parser by JIRA jira@apache.org
31
by JIRA jira@apache.org
[ANNOUNCE] Apache Tika 1.0 released by Mattmann, Chris A (3...
1
by Zabrane Mickael
[RESULT] [VOTE] Apache Tika 1.0 release rc #1 by Mattmann, Chris A (3...
0
by Mattmann, Chris A (3...
A problem in the right-to-left languages by ahmad ajiloo
11
by ahmad ajiloo