[jira] [Created] (TIKA-3209) Different between PictureRunMapper in POI and PicturesSource in Tika


Nicholas DiPiazza (Jira)
Peter Lee created TIKA-3209:
-------------------------------

             Summary: Different between PictureRunMapper in POI and PicturesSource in Tika
                 Key: TIKA-3209
                 URL: https://issues.apache.org/jira/browse/TIKA-3209
             Project: Tika
          Issue Type: Bug
          Components: parser
            Reporter: Peter Lee


1. According to the POI git log, the class PictureRunMapper was copied from the class PicturesSource in Tika; see [1].

2. A TODO in Tika suggests replacing PicturesSource with PictureRunMapper once POI 3.18 is out; see [2].


So I tried to make that replacement, but a test failed.

I think the failure may be caused by a difference in the method nextUnclaimed between these two classes; see [3][4].

 

Can we remove this line in POI? See [4].
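
To illustrate the kind of difference meant here, below is a minimal, self-contained sketch. It is not the code from either project. It assumes, without confirmation in this message, that the extra line in POI marks the returned picture as claimed inside nextUnclaimed; the linked lines [3] and [4] should be checked for the actual code. The class name UnclaimedSketch, the nested Source class, the claimOnReturn flag, and the use of strings in place of Picture objects are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UnclaimedSketch {

    /** Hypothetical stand-in for PicturesSource / PictureRunMapper; pictures are plain strings. */
    static class Source {
        private final List<String> pictures;
        private final Set<String> claimed = new HashSet<>();
        // Assumed difference between the two classes (not confirmed by the report).
        private final boolean claimOnReturn;
        private int nextUnclaimed = 0;

        Source(List<String> pictures, boolean claimOnReturn) {
            this.pictures = new ArrayList<>(pictures);
            this.claimOnReturn = claimOnReturn;
        }

        void markAsClaimed(String p) {
            claimed.add(p);
        }

        boolean hasBeenClaimed(String p) {
            return claimed.contains(p);
        }

        /** Returns the next picture that has not been claimed yet, or null if none is left. */
        String nextUnclaimed() {
            while (nextUnclaimed < pictures.size()) {
                String p = pictures.get(nextUnclaimed++);
                if (!claimed.contains(p)) {
                    if (claimOnReturn) {
                        markAsClaimed(p); // the assumed extra line
                    }
                    return p;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Source withoutClaim = new Source(List.of("image1"), false);
        Source withClaim = new Source(List.of("image1"), true);

        String a = withoutClaim.nextUnclaimed();
        String b = withClaim.nextUnclaimed();

        // Same picture is returned, but its claimed state afterwards differs.
        System.out.println(a + " claimed: " + withoutClaim.hasBeenClaimed(a));
        System.out.println(b + " claimed: " + withClaim.hasBeenClaimed(b));
    }
}
```

Under that assumption both variants return the same picture, but only one of them leaves it marked as claimed, so any caller that later checks the claimed state would behave differently depending on which class it uses.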

 

[1] [https://github.com/apache/poi/commit/bdb0e8199bb6891b068e97da69d6410870e8066b]

[2] [https://github.com/apache/tika/blob/172d40322f5662e428850ad7a8fb4113e453a51c/tika-parser-modules/tika-parser-microsoft-module/src/main/java/org/apache/tika/parser/microsoft/WordExtractor.java#L641]

[3] [https://github.com/apache/tika/blob/172d40322f5662e428850ad7a8fb4113e453a51c/tika-parser-modules/tika-parser-microsoft-module/src/main/java/org/apache/tika/parser/microsoft/WordExtractor.java#L709]

[4] [https://github.com/apache/poi/blob/f509d1deae86866ed531f10f2eba7db17e098473/src/scratchpad/src/org/apache/poi/hwpf/usermodel/PictureRunMapper.java#L130]

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)