PDFBox 0.8.0

Phil Hagelberg-2


I quite often run into the "org.pdfbox.cos.COSArray cannot be cast to
org.pdfbox.cos.COSDictionary" exception when parsing certain PDFs with
Tika. I noticed that it's been fixed in the trunk of PDFBox (0.8.0):


Unfortunately, this version of PDFBox is not a drop-in replacement:
things have been shuffled around, and it now lives under the
org.apache.pdfbox package instead of org.pdfbox.

Is there a timeline for upgrading to PDFBox 0.8.0? Perhaps the upgrade
could be done in a branch that could be merged once 0.8.0 is released?
If it's a simple matter of replacing "org.pdfbox" with
"org.apache.pdfbox" I could volunteer for that, but if the upgrade is
more complicated it may very well be beyond my meager Java skills.
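
For what it's worth, here is a minimal sketch of the mechanical rename I
have in mind. PackageRenamer and rename are hypothetical names of my own,
not anything in Tika or PDFBox, and the sketch only rewrites package
references in source text. It assumes nothing about whether class or
method names also changed between the two versions.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Tika or PDFBox.
final class PackageRenamer {
    // Matches "org.pdfbox" only where it starts a qualified name, so
    // "org.apache.pdfbox" and names such as "org.pdfboxer" are left alone.
    private static final Pattern OLD_PACKAGE =
            Pattern.compile("(?<![\\w.])org\\.pdfbox\\b");

    private PackageRenamer() {}

    // Returns the given source text with the old package prefix replaced.
    static String rename(String source) {
        return OLD_PACKAGE.matcher(source).replaceAll("org.apache.pdfbox");
    }
}
```

Running rename over each import line and fully qualified reference would
cover the simple case; anything that still fails to compile afterwards
would point to the more complicated kind of upgrade.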