[VOTE] Apache Tika 1.14 Release Candidate #1

Re: [VOTE] Apache Tika 1.14 Release Candidate #1

Mattmann, Chris A (3010)
Tests passed for me, and I don’t have strings installed either.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Principal Data Scientist, Engineering Administrative Office (3010)
Manager, Open Source Projects Formulation and Development Office (8212)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 180-503E, Mailstop: 180-502
Email: [hidden email]
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 

On 11/2/16, 6:20 AM, "Allison, Timothy B." <[hidden email]> wrote:

    Or, in other words, we need to find another test file or a modification of the current test file for strings since we now have a dbf parser.  I don't think this is a blocker, do you?
   
    Given that this is a truncated file, I'd expect the exception from the DBFParser, but if we don't want that behavior, let's open a ticket and fix.
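
Since several people on this thread report not having strings installed, here is a minimal sketch of how a test could detect whether an external command is present before relying on it. The class name `ExternalToolCheck` and the use of a `--version` argument are assumptions for illustration only; this is not how Tika's own tests perform the check.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Tika.
final class ExternalToolCheck {

    // Returns true if a process for the given command could be started.
    // The exit code is ignored: a successful start is taken as proof
    // that the binary exists on the PATH.
    static boolean isAvailable(String command) {
        try {
            // Assumption: the tool tolerates a "--version" argument.
            Process p = new ProcessBuilder(command, "--version")
                    .redirectErrorStream(true)
                    .start();
            // Do not wait forever on a tool that ignores the argument.
            if (!p.waitFor(5, TimeUnit.SECONDS)) {
                p.destroyForcibly();
            }
            return true;
        } catch (IOException e) {
            // Thrown when the command cannot be found or executed.
            return false;
        } catch (InterruptedException e) {
            // Restore the interrupt flag and answer conservatively.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

In a JUnit 4 test, the result could be passed to `org.junit.Assume.assumeTrue(...)` so that the test is skipped rather than failed on machines without the tool.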
   
    -----Original Message-----
    From: Allison, Timothy B. [mailto:[hidden email]]
    Sent: Wednesday, November 2, 2016 9:17 AM
    To: [hidden email]; [hidden email]
    Subject: RE: [VOTE] Apache Tika 1.14 Release Candidate #1
   
    Ken,
      I don't have strings installed.  I suspect what's happening, though, is that this file is now being handled by the dbf parser, and I'm getting this exception with that parser.
   
   
    org.apache.tika.exception.TikaException: Expecting space or asterisk at beginning of record, not:10
   
    at org.apache.tika.parser.dbf.DBFReader.fillRow(DBFReader.java:165)
    at org.apache.tika.parser.dbf.DBFReader.next(DBFReader.java:138)
    at org.apache.tika.parser.dbf.DBFParser.parse(DBFParser.java:81)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.TikaTest.getXML(TikaTest.java:186)
    at org.apache.tika.TikaTest.getXML(TikaTest.java:171)
    at org.apache.tika.parser.strings.StringsParserTest.testParse2(StringsParserTest.java:42)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at ...
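
For context on the message above: in the dBASE file format, each record starts with a one-byte deletion flag, a space (0x20) for an active record or an asterisk (0x2A) for a deleted one. The value 10 in the exception is a line feed, which is plausible for a truncated file. The sketch below illustrates that kind of check; the class and method names are hypothetical, and this is not the actual DBFReader source.

```java
// Hypothetical illustration of a DBF record-flag check; not Tika code.
final class DbfRecordFlag {

    static final int ACTIVE = 0x20;   // ' ' marks an active record
    static final int DELETED = 0x2A;  // '*' marks a deleted record

    // Returns true if the record is flagged as deleted, false if active.
    // Any other first byte means the reader is not positioned at the
    // start of a well-formed record.
    static boolean isDeleted(int firstByte) {
        if (firstByte == ACTIVE) {
            return false;
        }
        if (firstByte == DELETED) {
            return true;
        }
        throw new IllegalArgumentException(
                "Expecting space or asterisk at beginning of record, not:" + firstByte);
    }
}
```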
   
    -----Original Message-----
    From: Ken Krugler [mailto:[hidden email]]
    Sent: Tuesday, November 1, 2016 11:47 PM
    To: [hidden email]
    Subject: Re: [VOTE] Apache Tika 1.14 Release Candidate #1
   
    [Resending - has anyone else run into this same issue, when building from the 1.14-rc1 tag?]
   
    Just for grins, I pulled from git and checked out the 1.14-rc1 tag, then ran “mvn clean package”.
   
    For me it fails with:
   
    Running org.apache.tika.parser.strings.StringsParserTest
    Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.685 sec <<< FAILURE! - in org.apache.tika.parser.strings.StringsParserTest
    testParse(org.apache.tika.parser.strings.StringsParserTest)  Time elapsed: 1.685 sec  <<< FAILURE!
    java.lang.AssertionError: null
    at org.junit.Assert.fail(Assert.java:86)
    at org.junit.Assert.assertTrue(Assert.java:41)
    at org.junit.Assert.assertTrue(Assert.java:52)
    at org.apache.tika.parser.strings.StringsParserTest.testParse(StringsParserTest.java:68)
   
    …
   
    Results :
   
    Failed tests:
     StringsParserTest.testParse:68 null
   
    Tests run: 755, Failures: 1, Errors: 0, Skipped: 18
   
    — Ken
   
    > On Oct 19, 2016, at 11:48am, Chris Mattmann <[hidden email]> wrote:
    >
    > Hi Folks,
    >
    > A first candidate for the Tika 1.14 release is available at:
    >
    > https://dist.apache.org/repos/dist/dev/tika/
    >
    > The release candidate is a zip archive of the sources in:
    >
    > https://git-wip-us.apache.org/repos/asf?p=tika.git;a=tree;hb=687d7706c9778e4f49f2834a07e5a9d99b23042b
    >
    > The SHA1 checksum of the archive is:
    > ad9152392ffe6b620c8102ab538df0579b36c520
    >
    > In addition, a staged maven repository is available here:
    >
    > https://repository.apache.org/content/repositories/orgapachetika-1020/
    >
    > Please vote on releasing this package as Apache Tika 1.14.
    > The vote is open for the next 72 hours and passes if a majority of at
    > least three +1 Tika PMC votes are cast.
    >
    > [ ] +1 Release this package as Apache Tika 1.14
    > [ ] -1 Do not release this package because..
    >
    > Cheers,
    > Chris
    >
    > P.S. Of course here is my +1.
    >
    >
    >
    >
    >
   
    --------------------------
    Ken Krugler
    +1 530-210-6378
    http://www.scaleunlimited.com
    custom big data solutions & training
    Hadoop, Cascading, Cassandra & Solr
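
For anyone verifying the release candidate, the SHA1 checksum quoted in the vote email can be compared against a locally computed digest. The following is a minimal sketch using only the JDK; the class name `Sha1Check` is hypothetical, and the archive file name is not given in this thread, so it is left to the reader.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for comparing a SHA-1 digest with a published value.
final class Sha1Check {

    // Returns the SHA-1 digest of the data as lowercase hex.
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required of every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // Compares the computed digest with a published checksum,
    // ignoring case and surrounding whitespace.
    static boolean matches(byte[] data, String expectedHex) {
        return sha1Hex(data).equalsIgnoreCase(expectedHex.trim());
    }
}
```

The bytes of the downloaded archive could be read with `java.nio.file.Files.readAllBytes` and passed to `matches` together with the checksum from the vote email.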
   
   
   
   
