[jira] [Comment Edited] (TIKA-2666) Document last printed in the year 27321


JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/TIKA-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511471#comment-16511471 ]

Isabelle Giguere edited comment on TIKA-2666 at 6/13/18 6:11 PM:
-----------------------------------------------------------------

Thanks for investigating, [~[hidden email]].

I have opened an issue for POI: https://bz.apache.org/bugzilla/show_bug.cgi?id=62451


was (Author: igiguere):
Thanks for investigating, [~[hidden email]].

I will open an issue for POI.

> Document last printed in the year 27321
> ---------------------------------------
>
>                 Key: TIKA-2666
>                 URL: https://issues.apache.org/jira/browse/TIKA-2666
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.17
>            Reporter: Isabelle Giguere
>            Priority: Minor
>         Attachments: Genetic_Factors_and_the_Directionality_of.ppt, PPT_lastPrinted_00.png, tika-app-1.17.metadata.txt
>
>
> Tika extracts a strange last-printed date for the attached PowerPoint (97-2003) file.
> In the attached screenshot PPT_lastPrinted_00.png, the last-printed date is set to 00:00.
> But when Tika extracts metadata from this document, the last-printed date is in the year 27321!
> Last-Printed: 27321-01-23T08:20:12Z
> meta:print-date: 27321-01-23T08:20:12Z
> The attached metadata was obtained using Tika 1.17.
> This weird date is causing issues further down in processing. We can probably filter it out for now, but I do wonder how 00:00 turns into 27321-01-23T08:20:12Z.
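
The stopgap filter mentioned in the description could look roughly like the sketch below. It is not part of Tika or POI. The class name PlausibleDateFilter, the method isPlausible, and the year bounds 1900 to 2100 are all assumptions made for illustration. The year is read with a regular expression rather than java.time, because the strict ISO parsers expect a leading sign on years with more than four digits and may reject a value such as 27321-01-23T08:20:12Z outright.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlausibleDateFilter {
    // Hypothetical bounds; pick whatever range makes sense for your corpus.
    private static final int MIN_YEAR = 1900;
    private static final int MAX_YEAR = 2100;

    // Leading year (optionally signed, 4 to 9 digits) of an ISO-8601-style timestamp.
    private static final Pattern LEADING_YEAR =
            Pattern.compile("^([+-]?\\d{4,9})-\\d{2}-\\d{2}T");

    /** Returns true only if the value starts with a year inside the assumed range. */
    public static boolean isPlausible(String isoDate) {
        if (isoDate == null) {
            return false;
        }
        Matcher m = LEADING_YEAR.matcher(isoDate.trim());
        if (!m.find()) {
            return false; // not in the expected shape, treat as implausible
        }
        int year = Integer.parseInt(m.group(1)); // at most 9 digits, fits in an int
        return year >= MIN_YEAR && year <= MAX_YEAR;
    }

    public static void main(String[] args) {
        // The value reported in this issue, followed by an ordinary date.
        System.out.println(isPlausible("27321-01-23T08:20:12Z"));
        System.out.println(isPlausible("2018-06-13T18:11:00Z"));
    }
}
```

A caller would drop or blank the Last-Printed and meta:print-date values when isPlausible returns false. This only hides the symptom; the underlying conversion still needs to be investigated in POI.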



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)