Permission to extract text/Embedded documents


Richard Braman
Permission to extract text:
 
I get the error
060302 034106 fetch okay, but can't parse
http://www.dor.state.nc.us/downloads/fillin/E585.pdf, reason:
failed(2,0): Can't be handled as pdf document. java.io.IOException: You
do not have permission to extract text
 
This is a cryptography exception in stripper.getText.
 
I can open the file with no problem in IE, but when I go to plain ole Acrobat,
it displays a message that says "[it] is a secure document that has
been embedded in this document", whatever that means:
http://www.dor.state.nc.us/downloads/fillin/E585.pdf
 
Nutch PDF parsing in CVS:
http://svn.apache.org/viewcvs.cgi/lucene/nutch/tags/release-0.7.1/src/plugin/parse-pdf/src/java/org/apache/nutch/parse/pdf/PdfParser.java?rev=293015&view=log
 

Richard Braman
mailto:[hidden email]
561.748.4002 (voice)

http://www.taxcodesoftware.org
Free Open Source Tax Software

 

Re: Permission to extract text/Embedded documents

Leonard Rosenthol
At 04:42 AM 3/2/2006, Richard Braman wrote:
>Permission to extract text:
>
>I get the error
>060302 034106 fetch okay, but can't parse
>http://www.dor.state.nc.us/downloads/fillin/E585.pdf,
>reason: failed(2,0): Can't be handled as pdf document.
>java.io.IOException: You do not have permission to extract text

         Correct.

         If you open the document in Acrobat, you will see the little
lock icon in the bottom left-hand corner signifying that the document
is encrypted. Clicking on it displays the specifics of the digital
rights that have been applied: in this case, text extraction
(copying) has been DISABLED.
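
         For anyone curious how that restriction is stored, here is a
minimal sketch. It assumes the PDF specification's standard security
handler, where the encryption dictionary's /P entry is a 32-bit flag
word and bit 5 (value 16) governs copying or extracting text. The class
name PdfPermissions and the sample values -44 and -60 are hypothetical
illustrations, not values read from E585.pdf, and this is not how Nutch
or its PDF library implements the check.

```java
// Hypothetical helper: decodes the /P permission flags from a PDF's
// standard encryption dictionary. Bit positions follow the PDF specification.
class PdfPermissions {
    static final int PRINT = 1 << 2;        // bit 3: print the document
    static final int MODIFY = 1 << 3;       // bit 4: modify the contents
    static final int EXTRACT_TEXT = 1 << 4; // bit 5: copy or extract text and graphics

    // True when the flag word grants text extraction.
    static boolean canExtractText(int p) {
        return (p & EXTRACT_TEXT) != 0;
    }

    // True when the flag word grants printing.
    static boolean canPrint(int p) {
        return (p & PRINT) != 0;
    }

    public static void main(String[] args) {
        // Illustrative /P values only; they are not taken from E585.pdf.
        int[] samples = { -44, -60 };
        for (int p : samples) {
            System.out.println("P=" + p
                    + " print=" + canPrint(p)
                    + " extractText=" + canExtractText(p));
        }
    }
}
```

With these sample values, -44 grants both printing and extraction,
while -60 grants printing but denies extraction, which is the kind of
setting that would produce the "You do not have permission to extract
text" error quoted above.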


Leonard

---------------------------------------------------------------------------
Leonard Rosenthol                            <mailto:[hidden email]>
Chief Technical Officer                      <http://www.pdfsages.com>
PDF Sages, Inc.                              215-938-7080 (voice)
                                              215-938-0880 (fax)