Apache Tika - Development

This forum is an archive for the mailing list tika-dev@lucene.apache.org. Messages posted here will be sent to this mailing list.
Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.
Topics (23129)
Replies Last Post Views
[jira] [Created] (TIKA-892) Tika does not use the HTML5 meta charset tag when determining charset by Tim Allison (Jira)
9
by Tim Allison (Jira)
FYI: text/plain and text/html media types now come with charset info by Jukka Zitting
0
by Jukka Zitting
[jira] [Resolved] (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] [Commented] (TIKA-456) Support timeouts for parsers by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] [Resolved] (TIKA-242) Incremental configuration AutoDetectParser by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] [Resolved] (TIKA-430) Automatically let all valid XHTML 1.0 attributes through from HTML documents by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] [Resolved] (TIKA-458) Specify HTMLHandler via Context by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] [Resolved] (TIKA-502) Add programming language mime-types by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] [Resolved] (TIKA-471) Avoid Charset name bottleneck when multiple threads are using HtmlParser by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] [Resolved] (TIKA-518) Attribute values are not indexed by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] [Resolved] (TIKA-530) InvalidFormatException on a PackagePart in OOXML by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] [Resolved] (TIKA-482) Refactor image and jpeg parsers for access to MetadataExtractor API by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] [Resolved] (TIKA-322) Improve encoding detection speed and accuracy by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] Created: (TIKA-561) Support EMLX file detection by Tim Allison (Jira)
2
by Tim Allison (Jira)
Build failed in Jenkins: Tika-trunk #895 by Apache Jenkins Serve...
1
by Apache Jenkins Serve...
[jira] [Created] (TIKA-951) Bundle activation policy for Eclipse by Tim Allison (Jira)
1
by Tim Allison (Jira)
buildbot success in ASF Buildbot on tika-trunk by buildbot
0
by buildbot
[jira] [Created] (TIKA-949) Mimetype magic needed for mapping formats such as XMind Pro and MindMapper by Tim Allison (Jira)
1
by Tim Allison (Jira)
buildbot failure in ASF Buildbot on tika-trunk by buildbot
1
by Jukka Zitting
Build failed in Jenkins: Tika-trunk #887 by Apache Jenkins Serve...
8
by Apache Jenkins Serve...
buildbot success in ASF Buildbot on tika-trunk by buildbot
0
by buildbot
Re: svn commit: r1355877 - in /tika/trunk: ./ tika-dll/ tika-dll/src/ tika-dll/src/main/ tika-dll/src/main/csharp/ tika-dll/src/main/csharp/Apache/ by Mattmann, Chris A (3...
1
by Jukka Zitting
[jira] [Created] (TIKA-930) Consolidation of Some Tika Core Properties by Tim Allison (Jira)
7
by Tim Allison (Jira)
[jira] [Created] (TIKA-947) AbstractMetadataHandler addMetadata Does not Check Property.isMultiValuePermitted by Tim Allison (Jira)
1
by Tim Allison (Jira)
JAX-RS overhead in tika-server by Jukka Zitting
11
by Joerg Ehrlich
[jira] [Resolved] (TIKA-513) Support of Deja Vu (DjVu) format by Tim Allison (Jira)
0
by Tim Allison (Jira)
[jira] [Created] (TIKA-757) Address TODOs when we upgrade to next POI release (3.8 beta 5) by Tim Allison (Jira)
6
by Tim Allison (Jira)
[jira] [Created] (TIKA-817) (PPT/PPTX) Missing date/time in text content. by Tim Allison (Jira)
5
by Tim Allison (Jira)
[jira] Created: (TIKA-605) Tika GDAL parser by Tim Allison (Jira)
10
by Tim Allison (Jira)
[jira] [Created] (TIKA-891) Use POST in addition to PUT on method calls in tika-server by Tim Allison (Jira)
2
by Tim Allison (Jira)
[jira] [Created] (TIKA-819) Make Option to Exclude Embedded Files' Text for Text Content by Tim Allison (Jira)
7
by Tim Allison (Jira)
[jira] [Created] (TIKA-776) ExifTool Embedder by Tim Allison (Jira)
6
by Tim Allison (Jira)
[jira] [Created] (TIKA-774) ExifTool Parser by Tim Allison (Jira)
9
by Tim Allison (Jira)
[jira] [Created] (TIKA-715) Some parsers produce non-well-formed XHTML SAX events by Tim Allison (Jira)
8
by Tim Allison (Jira)
Re: svn commit: r1355947 - /tika/trunk/tika-parent/pom.xml by Mattmann, Chris A (3...
0
by Mattmann, Chris A (3...