[jira] [Created] (TIKA-3097) Out of memory while parsing docx

Mihir Sharma (Jira)
suchendra created TIKA-3097:
-------------------------------

             Summary: Out of memory while parsing docx
                 Key: TIKA-3097
                 URL: https://issues.apache.org/jira/browse/TIKA-3097
             Project: Tika
          Issue Type: Bug
          Components: core, parser
    Affects Versions: 1.24
            Reporter: suchendra
         Attachments: test.docx

I have written simple Scala code to extract the content of an uploaded docx file. The JVM runs out of memory (OOM) when Tika tries to parse the file. I configured the JVM heap to 1 GB and then tried 2 GB, and the same issue occurs. It happens both with the jar and in my own code.
I have attached the file for reference.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)