[jira] [Commented] (TIKA-3096) detect image in any document


Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/TIKA-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096592#comment-17096592 ]

Kenneth William Krugler commented on TIKA-3096:

Hi [~suchendra] - please ask usage questions on the Tika user mailing list, thanks! You can sign up using the steps at http://tika.apache.org/mail-lists.html

> detect image in any document
> ----------------------------
>                 Key: TIKA-3096
>                 URL: https://issues.apache.org/jira/browse/TIKA-3096
>             Project: Tika
>          Issue Type: Bug
>          Components: documentation, example, parser
>    Affects Versions: 1.23
>            Reporter: suchendra
>            Priority: Minor
> How do I detect whether a document contains an image?
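> import org.apache.tika.metadata.Metadata
> import org.apache.tika.parser.{AutoDetectParser, ParseContext}
> import org.apache.tika.sax.ToXMLContentHandler
> // tikaIs is the java.io.InputStream of the document being parsed
> val parser = new AutoDetectParser()
> val handler = new ToXMLContentHandler()
> parser.parse(tikaIs, handler, new Metadata, new ParseContext)
> println("File Content:" + handler.toString)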
> I tried using an HTML content handler and treated the presence of an img tag as a sign that the file contains an image. Is there a better way to achieve this?
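
The img-tag approach described in the question can be sketched without building the full XML string. This is a minimal sketch, not an official Tika recipe: ImageCountingHandler is a hypothetical helper written for illustration, and it assumes, as the reporter observed, that the parser emits an img element for each embedded image. Not every format or parser configuration does so, so a count of zero is not proof that a document has no images.

```scala
import org.xml.sax.Attributes
import org.xml.sax.helpers.DefaultHandler

// Hypothetical helper (not part of Tika): counts img elements seen in a SAX event stream.
class ImageCountingHandler extends DefaultHandler {
  private var count = 0

  override def startElement(uri: String, localName: String,
                            qName: String, attributes: Attributes): Unit = {
    // Prefer the local name; fall back to the qualified name when the local name is empty.
    val name = Option(localName).filter(_.nonEmpty).orElse(Option(qName)).getOrElse("")
    if (name.equalsIgnoreCase("img")) count += 1
  }

  // Number of img start tags seen so far.
  def imageCount: Int = count

  // True once at least one img start tag has been seen.
  def containsImage: Boolean = count > 0
}
```

An instance of this handler could be passed to parser.parse in place of the ToXMLContentHandler from the question, and containsImage read after the call returns.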
