PDFBox 0.8.0

Phil Hagelberg-2

Hi.

I'm running into the "org.pdfbox.cos.COSArray cannot be cast to
org.pdfbox.cos.COSDictionary" exception quite often when parsing certain
PDFs with Tika. I noticed that it's been fixed in the trunk of PDFBox (0.8.0):

https://issues.apache.org/jira/browse/PDFBOX-343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638409#action_12638409

Unfortunately this version of PDFBox is not a drop-in replacement: the
classes have been moved and now live under the org.apache.pdfbox package
instead of org.pdfbox.

Is there a timeline for upgrading to PDFBox 0.8.0? Perhaps the upgrade
could be done in a branch that could be merged once 0.8.0 is released?
If it's a simple matter of replacing "org.pdfbox" with
"org.apache.pdfbox" I could volunteer for that, but if the upgrade is
more complicated it may very well be beyond my meager Java skills.
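
To make the "simple matter" case concrete, here is a rough sketch of the
textual rename, assuming the package prefix really is the only thing that
changed. The class name PdfBoxPackageRename and the rename helper are made
up for illustration; nothing here is part of Tika or PDFBox, and any API
changes beyond the package move would not be covered by it.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika or PDFBox.
final class PdfBoxPackageRename {

    // Match "org.pdfbox" only as a whole package prefix: not preceded by
    // another identifier character or dot (so "com.org.pdfbox" is left
    // alone) and ending on a word boundary (so "org.pdfboxes" is left alone).
    private static final Pattern OLD_PACKAGE =
            Pattern.compile("(?<![\\w.])org\\.pdfbox\\b");

    // Assumption: the only difference is the package prefix.
    static String rename(String source) {
        return OLD_PACKAGE.matcher(source).replaceAll("org.apache.pdfbox");
    }

    public static void main(String[] args) {
        System.out.println(rename("import org.pdfbox.cos.COSDictionary;"));
    }
}
```

Already-migrated references are left untouched, since "org.apache.pdfbox"
does not contain the old prefix, so running it twice over the same source
should be harmless.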

thanks,
Phil
http://technomancy.us