[jira] [Updated] (TIKA-3096) detect image in any document

Clark Perkins (Jira)

     [ https://issues.apache.org/jira/browse/TIKA-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

suchendra updated TIKA-3096:
----------------------------
    Description:
How do I detect whether a document contains an image or not?

val parser = new AutoDetectParser()
val handler = new ToXMLContentHandler()
parser.parse(tikaIs, handler, new Metadata, new ParseContext)
println("File Content:" + handler.toString)

I tried using HTMLHandler and, based on the presence of an img tag, treated the file as containing an image. Is there a better way to achieve this?

  was:
How do I detect whether a document contains a image or not ?

val parser = new AutoDetectParser()
val handler = new ToXMLContentHandler()
parser.parse(tikaIs, handler, new Metadata, new ParseContext)
println("File Content:" + handler.toString)


> detect image in any document
> ----------------------------
>
>                 Key: TIKA-3096
>                 URL: https://issues.apache.org/jira/browse/TIKA-3096
>             Project: Tika
>          Issue Type: Bug
>          Components: documentation, example, parser
>    Affects Versions: 1.23
>            Reporter: suchendra
>            Priority: Minor
>
> How do I detect whether a document contains an image or not?
> val parser = new AutoDetectParser()
> val handler = new ToXMLContentHandler()
> parser.parse(tikaIs, handler, new Metadata, new ParseContext)
> println("File Content:" + handler.toString)
>
> I tried using HTMLHandler and, based on the presence of an img tag, treated the file as containing an image. Is there a better way to achieve this?
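
One possible refinement of the img-tag approach described above is to inspect the SAX events directly instead of serializing the whole document to a string and searching it. The sketch below is only an illustration under assumptions: ImageDetectingHandler and ImageDetectionDemo are hypothetical names, not part of Tika, and it assumes the parser for a given format actually emits an img element for embedded images, which may not hold for every format or configuration. The demo feeds the handler from the JDK SAX parser so that it runs without Tika on the classpath.

```scala
import java.io.StringReader
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.{Attributes, InputSource}
import org.xml.sax.helpers.DefaultHandler

// Hypothetical helper (not part of Tika): records whether any <img> element was seen.
class ImageDetectingHandler extends DefaultHandler {
  private var seen = false

  def containsImage: Boolean = seen

  override def startElement(uri: String, localName: String,
                            qName: String, atts: Attributes): Unit = {
    // localName can be empty when the event source is not namespace-aware,
    // so fall back to the qualified name in that case.
    val name = if (localName != null && localName.nonEmpty) localName else qName
    if ("img".equalsIgnoreCase(name)) seen = true
  }
}

object ImageDetectionDemo {
  // Runs the handler over an XML string with the JDK SAX parser.
  // The string stands in for the XHTML events a document parser would produce.
  def detect(xhtml: String): Boolean = {
    val factory = SAXParserFactory.newInstance()
    factory.setNamespaceAware(true)
    val handler = new ImageDetectingHandler
    factory.newSAXParser().parse(new InputSource(new StringReader(xhtml)), handler)
    handler.containsImage
  }

  def main(args: Array[String]): Unit = {
    println(detect("<html><body><p>text</p><img src=\"embedded:image1.png\"/></body></html>"))
    println(detect("<html><body><p>text only</p></body></html>"))
  }
}
```

Because the handler is a plain org.xml.sax ContentHandler, it could presumably be passed in place of ToXMLContentHandler in the parser.parse(tikaIs, handler, new Metadata, new ParseContext) call from the description, with containsImage read afterwards.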



--
This message was sent by Atlassian Jira
(v8.3.4#803005)