Re: svn commit: r594376 - in /incubator/tika/trunk: CHANGES.txt src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java src/main/java/org/apache/tika/parser/pdf/PDFParser.java


Jeremias Maerki-2
The constructor IOException(String, Throwable) has only existed since Java 6.
I don't think using it was intended, was it?

Jeremias Maerki



On 13.11.2007 02:04:31 jukka wrote:

> Author: jukka
> Date: Mon Nov 12 17:04:30 2007
> New Revision: 594376
>
> URL: http://svn.apache.org/viewvc?rev=594376&view=rev
> Log:
> TIKA-100 - Structured PDF parsing
>     - Customized the PdfTextStripper class to produce XHTML SAX events
>       (there's a somewhat similar PdfText2HTML class in PDFBox, but
>       that class produces a character stream instead of SAX events)
>
> Added:
>     incubator/tika/trunk/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java   (with props)
> Modified:
>     incubator/tika/trunk/CHANGES.txt
>     incubator/tika/trunk/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
>
<snip/>

> Added: incubator/tika/trunk/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java
> URL: http://svn.apache.org/viewvc/incubator/tika/trunk/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java?rev=594376&view=auto
> ==============================================================================
> --- incubator/tika/trunk/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java (added)
> +++ incubator/tika/trunk/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java Mon Nov 12 17:04:30 2007
> +    protected void endDocument(PDDocument pdf) throws IOException {
> +        try {
> +            handler.endDocument();
> +        } catch (SAXException e) {
> +            throw new IOException("Unable to end a document", e);
> +        }
> +    }
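
For reference, a Java 5-compatible way to attach a cause to an IOException is to construct it with the message only and then call initCause(), which Throwable has provided since Java 1.4. The sketch below is illustrative only: the class name IOExceptionCauseSketch and the helper ioExceptionWithCause are hypothetical, a plain Exception stands in for the SAXException, and this is not necessarily how the Tika code itself was changed.

```java
import java.io.IOException;

public class IOExceptionCauseSketch {

    /**
     * Hypothetical helper: builds an IOException that carries a cause
     * without using the two-argument constructor added in Java 6.
     */
    public static IOException ioExceptionWithCause(String message, Throwable cause) {
        IOException ioe = new IOException(message); // available in every Java version
        ioe.initCause(cause);                       // Throwable.initCause, since Java 1.4
        return ioe;
    }

    public static void main(String[] args) {
        // Stand-in for the SAXException caught in endDocument()
        Exception original = new Exception("simulated SAX failure");
        IOException wrapped = ioExceptionWithCause("Unable to end a document", original);

        System.out.println(wrapped.getMessage());          // prints "Unable to end a document"
        System.out.println(wrapped.getCause() == original); // prints "true"
    }
}
```

In a catch block this would read `throw ioExceptionWithCause("Unable to end a document", e);`, which keeps the original stack trace reachable through getCause() on both Java 5 and Java 6.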


chrismattmann
I've verified this behavior as well while trying to apply and commit the
patch for TIKA-101. I think that the trunk is broken. I'll go ahead and fix
it.

In the future, we should probably have nightly builds to catch stuff like
this. Also, please try to be more vigilant about making sure that your
environment is set to JDK 5 before committing an update.

Thanks!

Cheers,
  Chris




______________________________________________
Chris Mattmann, Ph.D.
[hidden email]
Cognizant Development Engineer
Early Detection Research Network Project
_________________________________________________
Jet Propulsion Laboratory            Pasadena, CA
Office: 171-266B                     Mailstop:  171-246
_______________________________________________________

Disclaimer:  The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.



chrismattmann
In reply to this post by Jeremias Maerki-2
Hi Guys,

 This has been fixed in r596143:

 http://svn.apache.org/viewvc?rev=596143&view=rev

Cheers,
  Chris
 





Jukka Zitting
In reply to this post by Jeremias Maerki-2
Hi,

On Nov 18, 2007 7:54 PM, Jeremias Maerki <[hidden email]> wrote:
> The constructor IOException(String, Throwable) has only existed since Java 6.
> I don't think using it was intended, was it?

Oh, bugger. Certainly not intended (see [1]); I just wrongly recalled
that the constructor was already available in Java 5.

Thanks, Chris, for fixing the problem.

PS. Does anyone have an idea on how Maven could be made to select
which JDK to use based on project metadata?

[1] http://jukkaz.wordpress.com/2007/05/17/the-cause-of-an-ioexception/

BR,

Jukka Zitting