Apache Tika - Development

This forum is an archive for the mailing list tika-dev@lucene.apache.org. Messages posted here will be sent to this mailing list.
Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.
Topics (17686)
Replies Last Post Views
[jira] [Created] (TIKA-911) Converted PDF document contains question marks in place of spaces and inconsistent case by JIRA jira@apache.org
7
by JIRA jira@apache.org
[jira] [Created] (TIKA-758) Address TODOs when we upgrade to next PDFBox release by JIRA jira@apache.org
6
by JIRA jira@apache.org
[jira] [Created] (TIKA-958) MIME magic for HDF4 and HDF5 by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Created] (TIKA-957) Mimetype magic entry for NITF images by JIRA jira@apache.org
1
by JIRA jira@apache.org
Can't build javadocs for 1.2 API site docs by Mattmann, Chris A (3...
6
by Mattmann, Chris A (3...
[ANNOUNCE] Apache Tika 1.2 released by Mattmann, Chris A (3...
0
by Mattmann, Chris A (3...
[jira] [Created] (TIKA-955) Unable to extract "Track Changes" metadata from a microsoft word document by JIRA jira@apache.org
0
by JIRA jira@apache.org
Fixing the <title/> problem of TIKA-895 and TIKA-914 by john m-2
4
by john m-2
[RESULT] [VOTE] Apache Tika 1.2 release rc #1 by Mattmann, Chris A (3...
0
by Mattmann, Chris A (3...
[VOTE] Apache Tika 1.2 release rc #1 by Mattmann, Chris A (3...
13
by Mattmann, Chris A (3...
Tika build error using Maven by 122jxgcn
1
by Nick Burch-4
[jira] [Created] (TIKA-945) Upgrade tika-server to CXF 2.6.1 by JIRA jira@apache.org
2
by JIRA jira@apache.org
Re: JAX-RS overhead in tika-server by Sergey Beryozkin
0
by Sergey Beryozkin
[jira] [Created] (TIKA-872) Tika --extract fails for RTF by JIRA jira@apache.org
6
by JIRA jira@apache.org
[jira] [Created] (TIKA-754) Automatic line break insertion (BR element) instead of '\n' in XHTMLContentHandler by JIRA jira@apache.org
7
by JIRA jira@apache.org
[jira] [Commented] (TIKA-456) Support timeouts for parsers by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Created] (TIKA-892) Tika does not use the HTML5 meta charset tag when determining charset by JIRA jira@apache.org
9
by JIRA jira@apache.org
FYI: text/plain and text/html media types now come with charset info by Jukka Zitting
0
by Jukka Zitting
[jira] [Resolved] (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Commented] (TIKA-456) Support timeouts for parsers by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Resolved] (TIKA-242) Incremental configuration AutoDetectParser by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Resolved] (TIKA-430) Automatically let all valid XHTML 1.0 attributes through from HTML documents by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Resolved] (TIKA-458) Specify HTMLHandler via Context by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Resolved] (TIKA-502) Add programming language mime-types by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Resolved] (TIKA-471) Avoid Charset name bottleneck when multiple threads are using HtmlParser by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Resolved] (TIKA-518) Attribute values are not indexed by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Resolved] (TIKA-530) InvalidFormatException on a PackagePart in OOXML by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Resolved] (TIKA-482) Refactor image and jpeg parsers for access to MetadataExtractor API by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] [Resolved] (TIKA-322) Improve encoding detection speed and accuracy by JIRA jira@apache.org
0
by JIRA jira@apache.org
[jira] Created: (TIKA-561) Support EMLX file detection by JIRA jira@apache.org
2
by JIRA jira@apache.org
Build failed in Jenkins: Tika-trunk #895 by Apache Jenkins Serve...
1
by Apache Jenkins Serve...
[jira] [Created] (TIKA-951) Bundle activation policy for Eclipse by JIRA jira@apache.org
1
by JIRA jira@apache.org
buildbot success in ASF Buildbot on tika-trunk by buildbot
0
by buildbot
[jira] [Created] (TIKA-949) Mimetype magic needed for mapping formats such as XMind Pro and MindMapper by JIRA jira@apache.org
1
by JIRA jira@apache.org
buildbot failure in ASF Buildbot on tika-trunk by buildbot
1
by Jukka Zitting