[jira] [Commented] (TIKA-3061) Streaming zip container detector stopping short

Chris Mattmann (Jira)

    [ https://issues.apache.org/jira/browse/TIKA-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062336#comment-17062336 ]

Stefan Tandecki commented on TIKA-3061:
---------------------------------------

Unfortunately, this introduced an annoying "e.printStackTrace();" on line 173. Shouldn't this exception be forwarded rather than printed (for all those files that aren't zip files anyway)?
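
To make the suggestion concrete, here is a minimal sketch of the two alternatives to printing the stack trace: forwarding the exception to the caller, or quietly falling back to a generic type. It is not Tika's code. The class ZipProbe, its methods, and the fallback type are hypothetical, and the probe only checks the zip local file header signature ("PK" followed by bytes 3 and 4).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for a zip-probing detector step; NOT Tika's actual code.
class ZipProbe {

    // Throws if the stream does not start with the zip local file header signature.
    static void requireZipHeader(InputStream in) throws IOException {
        byte[] sig = new byte[4];
        int read = 0;
        while (read < sig.length) {
            int n = in.read(sig, read, sig.length - read);
            if (n == -1) {
                break; // stream ended before four bytes were available
            }
            read += n;
        }
        boolean isZip = read == 4
                && sig[0] == 'P' && sig[1] == 'K' && sig[2] == 3 && sig[3] == 4;
        if (!isZip) {
            throw new IOException("no zip local file header signature");
        }
    }

    // Option A: forward the exception and let the caller decide what to do.
    static String detectForwarding(InputStream in) throws IOException {
        requireZipHeader(in);
        return "application/zip";
    }

    // Option B: treat the failure as "not a zip" without printing a stack trace.
    static String detectQuietly(InputStream in) {
        try {
            requireZipHeader(in);
            return "application/zip";
        } catch (IOException e) {
            return "application/octet-stream"; // assumed fallback type
        }
    }

    public static void main(String[] args) {
        byte[] notZip = "plain text".getBytes();
        System.out.println(detectQuietly(new ByteArrayInputStream(notZip)));
    }
}
```

Either option keeps stderr clean for non-zip input. Which one fits depends on whether callers are expected to handle an IOException from detection.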

> Streaming zip container detector stopping short
> -----------------------------------------------
>
>                 Key: TIKA-3061
>                 URL: https://issues.apache.org/jira/browse/TIKA-3061
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> In the recent regression runs in prep for 1.24, I found that in a few cases an OpenOffice document inside a zip was no longer identified as an OpenOffice document, but rather as another zip file.
> For an unknown reason, the new {{detectStarOfficeX}} is doing something to the ZipArchiveInputStream that causes it to silently fail to iterate through all of the entries in the zip file; in short, it stops short.  If we copy the bytes to a byte array and then process them, all is well.
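
The workaround mentioned in the description (copy the bytes to a byte array, then process them) can be sketched roughly as follows. This is an illustration under assumptions, not the actual fix: it uses the JDK's java.util.zip.ZipInputStream in place of Commons Compress's ZipArchiveInputStream, and the class BufferedZipScan and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration of the "buffer first, then scan" workaround.
class BufferedZipScan {

    // Copy the whole stream into memory so each pass can start from a fresh view.
    static byte[] buffer(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    // Iterate over every entry from an independent stream over the buffered bytes.
    static List<String> entryNames(byte[] zipBytes) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    // Build a small in-memory zip so the sketch can run without external files.
    static byte[] sampleZip(String... names) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            for (String name : names) {
                zos.putNextEntry(new ZipEntry(name));
                zos.write(name.getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = sampleZip("mimetype", "content.xml", "styles.xml");
        byte[] buffered = buffer(new ByteArrayInputStream(raw));
        // Two independent passes over the same bytes see the same entries.
        System.out.println(entryNames(buffered));
        System.out.println(entryNames(buffered));
    }
}
```

Because every pass wraps the same byte array in a new stream, one detector step cannot leave the stream in a state that cuts a later step's iteration short. The cost is holding the whole container in memory.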



--
This message was sent by Atlassian Jira
(v8.3.4#803005)