[jira] [Comment Edited] (TIKA-3144) Detecting hprof memory dump files exported from Android Studio

Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/TIKA-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169133#comment-17169133 ]

Parth edited comment on TIKA-3144 at 7/31/20, 8:22 PM:
-------------------------------------------------------

So all you had to do was add an XML element to `tika-mimetypes.xml`?

  <mime-type type="application/vnd.java.hprof ">
    <_comment>Java hprof text file</_comment>
    <magic priority="50">
      <match value="JAVA PROFILE \\d\\.\\d\\.\\d\\u0000" type="regex" offset="0"/>
    </magic>
    <glob pattern="*.hprof"/>
  </mime-type>
  <mime-type type="application/vnd.java.hprof.text">
    <_comment>Java hprof text file</_comment>
    <magic priority="50">
      <match value="JAVA PROFILE \\d\\.\\d\\.\\d," type="regex" offset="0"/>
    </magic>
    <glob pattern="*.hprof.txt"/>
    <sub-class-of type="text/plain"/>
  </mime-type>



The hprof parser is not needed?
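
For what it's worth, detection and parsing are separate concerns in Tika, so the magic entries only decide the MIME type. As a rough illustration of how the two patterns would tell the formats apart, here is a minimal stand-alone sketch using plain `java.util.regex`. It is not Tika's matching code. The class name `HprofMagicSketch`, the `detect` helper, and the sample headers are hypothetical, and it assumes the doubled backslashes in the XML values decode to single ones before the regex is compiled.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Tika.
class HprofMagicSketch {
    // Assumed decoded form of the first magic value: version digits, then a NUL byte.
    static final Pattern BINARY =
            Pattern.compile("JAVA PROFILE \\d\\.\\d\\.\\d\\u0000");
    // Assumed decoded form of the second magic value: version digits, then a comma.
    static final Pattern TEXT =
            Pattern.compile("JAVA PROFILE \\d\\.\\d\\.\\d,");

    // Returns a MIME type name based on the first bytes of the stream (offset 0).
    static String detect(byte[] header) {
        // ISO-8859-1 maps every byte to one char, so the NUL byte survives decoding.
        String s = new String(header, StandardCharsets.ISO_8859_1);
        if (BINARY.matcher(s).lookingAt()) {
            return "application/vnd.java.hprof";
        }
        if (TEXT.matcher(s).lookingAt()) {
            return "application/vnd.java.hprof.text";
        }
        // Fallback used when nothing matches.
        return "application/octet-stream";
    }

    public static void main(String[] args) {
        // Made-up sample headers for the two formats.
        byte[] binary = "JAVA PROFILE 1.0.3\0rest".getBytes(StandardCharsets.ISO_8859_1);
        byte[] text = "JAVA PROFILE 1.0.1, created ...".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(detect(binary));
        System.out.println(detect(text));
    }
}
```

The only difference between the two patterns is the character after the version number, which is why both entries can share the same priority and offset without colliding.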


was (Author: tamane):
So all you had to do was add an xml element to `tika-mimetypes.xml`?
```
  <mime-type type="application/vnd.java.hprof ">
    <_comment>Java hprof text file</_comment>
    <magic priority="50">
      <match value="JAVA PROFILE \\d\\.\\d\\.\\d\\u0000" type="regex" offset="0"/>
    </magic>
    <glob pattern="*.hprof"/>
  </mime-type>
  <mime-type type="application/vnd.java.hprof.text">
    <_comment>Java hprof text file</_comment>
    <magic priority="50">
      <match value="JAVA PROFILE \\d\\.\\d\\.\\d," type="regex" offset="0"/>
    </magic>
    <glob pattern="*.hprof.txt"/>
    <sub-class-of type="text/plain"/>
  </mime-type>
```

The hprof parser is not needed?

> Detecting hprof memory dump files exported from Android Studio
> --------------------------------------------------------------
>
>                 Key: TIKA-3144
>                 URL: https://issues.apache.org/jira/browse/TIKA-3144
>             Project: Tika
>          Issue Type: New Feature
>          Components: detector, parser
>            Reporter: Parth
>            Priority: Major
>              Labels: android-studio, hprof, memory-dump
>
> I was trying to detect an hprof file by downloading it and passing it to Tika as an input stream, but the MIME type is detected as `application/octet-stream`. Could more granular detection be added so that hprof files are recognized as such?
> Hprof files are Java memory dump files; you can read more about them here: https://dzone.com/articles/memory-analysis-how-to-obtain-java-heat-dump
> I found this Java parser for hprof, in case it helps: https://github.com/eaftan/hprof-parser



--
This message was sent by Atlassian Jira
(v8.3.4#803005)