[jira] [Commented] (TIKA-2992) java.lang.UnsupportedOperationException: This feature requires ASM7 in Tika 1.21


David Eric Pugh (Jira)

    [ https://issues.apache.org/jira/browse/TIKA-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986630#comment-16986630 ]

Arvind Jain commented on TIKA-2992:

Thanks for the reply [~nick].

I tried what you suggested. It looks like we only have ASM 7.1 on our classpath, because tika-parsers 1.21 requires it.

I debugged a bit more, and it looks like an issue with [https://github.com/apache/tika/blob/master/tika-parsers/src/main/java/org/apache/tika/parser/asm/XHTMLClassVisitor.java]. This class is initialized with the API level Opcodes.ASM5.

The exception is thrown here: [https://github.com/consulo/objectweb-asm/blob/master/asm/src/main/java/org/objectweb/asm/ClassVisitor.java#L150], so an ASM7 feature is being used somewhere, which is not unexpected since tika-parsers depends on ASM 7.1.
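To make the suspected mechanism concrete, here is a minimal, self-contained sketch of the kind of API-level guard described above. It does not use the real org.objectweb.asm classes: ApiLevels, ModelClassVisitor, ModelXhtmlVisitor and AsmApiGuardDemo are hypothetical stand-ins, and the numeric constants are illustrative (only their ordering matters). It assumes the guard simply compares the API level passed to the constructor against ASM7, which is what the message in the stack trace suggests.

```java
// Simplified stand-in for the ASM API-level guard. These are NOT the real
// org.objectweb.asm classes, only a model of the check described above.
final class ApiLevels {
    // Illustrative values; only the ordering ASM5 < ASM7 matters here.
    static final int ASM5 = 5 << 16;
    static final int ASM7 = 7 << 16;

    private ApiLevels() {}
}

// Models a visitor base class that remembers the API level it was built with.
abstract class ModelClassVisitor {
    protected final int api;

    ModelClassVisitor(int api) {
        this.api = api;
    }

    // Guarded callback: rejected when the visitor declared an older API level.
    void visitNestMember(String nestMember) {
        if (api < ApiLevels.ASM7) {
            throw new UnsupportedOperationException("This feature requires ASM7");
        }
        // A real visitor would forward or record the nest member here.
    }
}

// Hypothetical visitor standing in for XHTMLClassVisitor.
class ModelXhtmlVisitor extends ModelClassVisitor {
    ModelXhtmlVisitor(int api) {
        super(api);
    }
}

class AsmApiGuardDemo {
    // Returns "ok" if the callback is accepted, otherwise the exception message.
    static String tryVisit(int api) {
        try {
            new ModelXhtmlVisitor(api).visitNestMember("Outer$Inner");
            return "ok";
        } catch (UnsupportedOperationException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("ASM5 visitor: " + tryVisit(ApiLevels.ASM5));
        System.out.println("ASM7 visitor: " + tryVisit(ApiLevels.ASM7));
    }
}
```

If this model matches the real behavior, then having ASM 7.1 on the classpath is not enough on its own: the visitor would also need to declare the ASM7 API level in its constructor before a class file containing nest members could be visited.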

Does this make sense, or am I missing something?


>  java.lang.UnsupportedOperationException: This feature requires ASM7 in Tika 1.21
> ---------------------------------------------------------------------------------
>                 Key: TIKA-2992
>                 URL: https://issues.apache.org/jira/browse/TIKA-2992
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.21
>            Reporter: Arvind Jain
>            Priority: Major
> We are using the Tika Java library to parse a bunch of documents (various formats). We are seeing the exception below regularly in our logs on certain documents. Any suggestions on how to fix it would be really useful. On initial investigation, it looks like a bug caused by mismatched ASM versions between XHTMLClassVisitor and the tika-parsers pom.
> Failed to parse the document. org.apache.tika.exception.TikaException: Failed to parse a Java class
> at org.apache.tika.parser.asm.XHTMLClassVisitor.parse (XHTMLClassVisitor.java:66)
> at org.apache.tika.parser.asm.ClassParser.parse (ClassParser.java:51)
> at org.apache.tika.parser.CompositeParser.parse (CompositeParser.java:280)
> at org.apache.tika.parser.CompositeParser.parse (CompositeParser.java:280)
> at org.apache.tika.parser.AutoDetectParser.parse (AutoDetectParser.java:143)
> at com.askscio.beam.docbuilder.processor.parsers.GenericParser.parse (GenericParser.java:55)
> <snipped>
> Caused by: java.lang.UnsupportedOperationException: This feature requires ASM7
> at org.objectweb.asm.ClassVisitor.visitNestMember (ClassVisitor.java:236)
> at org.objectweb.asm.ClassReader.accept (ClassReader.java:660)
> at org.objectweb.asm.ClassReader.accept (ClassReader.java:400)
> at org.apache.tika.parser.asm.XHTMLClassVisitor.parse (XHTMLClassVisitor.java:61)
