[DISCUSS] Tika 1.8 or 1.7.1

Tyler Palsulich-2
Hi Folks,

Now that TIKA-1581 (JHighlight licensing issues) is resolved, we need to
release a new version of Tika. I'll volunteer to be the release manager
again.

Should we release this as 1.8 or 1.7.1?

Does anyone have any last minute issues they'd like to finish and see in
Tika 1.X? I'd like to get the example working with CORS (TIKA-1585 and
TIKA-1586). Any others?

Have a good weekend,
Tyler

RE: [DISCUSS] Tika 1.8 or 1.7.1

kkrugler
Given how recently we did a 1.7 release, my vote would be for 1.7.1

And to keep this release as simple as possible, just cherry-pick the fix for TIKA-1581 into the 1.7 code base.

-- Ken

> From: Tyler Palsulich
> Sent: March 28, 2015 8:01:03am PDT
> To: [hidden email]
> Subject: [DISCUSS] Tika 1.8 or 1.7.1
>
> Hi Folks,
>
> Now that TIKA-1581 (JHighlight licensing issues) is resolved, we need to
> release a new version of Tika. I'll volunteer to be the release manager
> again.
>
> Should we release this as 1.8 or 1.7.1?
>
> Does anyone have any last minute issues they'd like to finish and see in
> Tika 1.X? I'd like to get the example working with CORS (TIKA-1585 and
> TIKA-1586). Any others?
>
> Have a good weekend,
> Tyler

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr






Re: [DISCUSS] Tika 1.8 or 1.7.1

Mattmann, Chris A (3010)
In reply to this post by Tyler Palsulich-2
Hi Tyler - I would VOTE for 1.8. Given the stuff associated
with releasing (updating the website; sending emails; waiting
periods, etc.) let’s ship all the updates we have too along
with the jhighlight fix.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [hidden email]
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++







Re: [DISCUSS] Tika 1.8 or 1.7.1

Allison, Timothy B.
Once we fix TIKA-1584, I don't have a preference.  I defer to Chris's experience (so I guess, +1 for 1.8) given the amount of work required.

It'd be great if we could make sure we aren't bundling any pdfs in our tika-app jar, too.  Many apologies if that's been fixed!

________________________________________
From: Mattmann, Chris A (3980) <[hidden email]>
Sent: Saturday, March 28, 2015 11:41 AM
To: [hidden email]
Subject: Re: [DISCUSS] Tika 1.8 or 1.7.1

Hi Tyler - I would VOTE for 1.8. Given the stuff associated
with releasing (updating the website; sending emails; waiting
periods, etc.) let’s ship all the updates we have too along
with the jhighlight fix.

Cheers,
Chris


Re: [DISCUSS] Tika 1.8 or 1.7.1

Tyler Palsulich-2
In reply to this post by Mattmann, Chris A (3010)
I'm also leaning toward 1.8. Especially given the newly identified
regression in TIKA-1584.

Tyler

Re: [DISCUSS] Tika 1.8 or 1.7.1

Konstantin Gribov
+1 to releasing 1.8.

--
Best regards,
Konstantin Gribov

Sat, 28 March 2015, 22:25, Tyler Palsulich <[hidden email]>:

> I'm also leaning toward 1.8. Especially given the newly identified
> regression in TIKA-1584.
>
> Tyler

Re: [DISCUSS] Tika 1.8 or 1.7.1

Konstantin Gribov
Also, I think, we should resolve TIKA-1575 (upgrade to pdfbox 1.8.9) since
pdfbox 1.8.8 hangs on some pdf forms.

--
Best regards,
Konstantin Gribov

Sat, 28 March 2015 at 23:22, Konstantin Gribov <[hidden email]>:

> +1 to releasing 1.8.
>
> --
> Best regards,
> Konstantin Gribov
>

Re: [DISCUSS] Tika 1.8 or 1.7.1

Oleg Tikhonov-2
+1 for 1.8 release.
On 29 Mar 2015 02:04, "Konstantin Gribov" <[hidden email]> wrote:

> Also, I think, we should resolve TIKA-1575 (upgrade to pdfbox 1.8.9) since
> pdfbox 1.8.8 hangs on some pdf forms.
>
> --
> Best regards,
> Konstantin Gribov
>

Re: [DISCUSS] Tika 1.8 or 1.7.1

Hong-Thai Nguyen-3
In reply to this post by Tyler Palsulich-2
+1 for 1.8

Hong-Thai


Re: [DISCUSS] Tika 1.8 or 1.7.1

Tyler Palsulich
Once TIKA-1584 and TIKA-1575 are resolved, I'll work up an RC (unless
something else pops up).

Thank you everyone.

Tyler
On Mar 29, 2015 4:43 AM, "Hong-Thai Nguyen" <[hidden email]> wrote:

> +1 for 1.8
>
> Hong-Thai
>

Re: [DISCUSS] Tika 1.8 or 1.7.1

David Meikle
In reply to this post by Tyler Palsulich-2
+1 for 1.8

> On 28 Mar 2015, at 15:01, Tyler Palsulich <[hidden email]> wrote:
>
> Should we release this as 1.8 or 1.7.1?


RE: [DISCUSS] Tika 1.8 or 1.7.1

Allison, Timothy B.
In reply to this post by Tyler Palsulich
Unless there are objections, I'd like these to be resolved before 1.8:

TIKA-1584 -- I'll fix
TIKA-1575 -- Resolved by Konstantin Gribov (thank you!)
TIKA-1512 -- I'll put in a temporary fix so that we don't get IOOBEs, but I'll leave this open and do some more digging to see if we need to open a ticket at the POI level
TIKA-1511 -- I'll remove "provided" for xerial

TIKA-1549 -- We should thank Toke Eskildsen in CHANGES.txt, no?

I'll have these fixes completed by noon EDT.  Should I run against govdocs1 before or after the RC?

My last build of Tika app (a few days ago) ballooned to ~43MB, and that's before I add ~3MB for xerial.  Tika server is now ~48MB.  As of my last build, we are still including ~4MB of pdfs (README.NLDAS1.pdf and README.NLDAS2.pdf) from the GRIB(?) parser in the tika-app and tika-server jars.
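
For anyone wanting to check a local build for this, here is a minimal Java sketch that lists the `.pdf` entries in a jar along with their uncompressed sizes. It is only an illustration and not part of Tika: the class name `JarPdfScan`, the method `findPdfEntries`, and the way the jar path is passed in are all hypothetical, and it relies only on the standard `java.util.zip` API.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class JarPdfScan {

    /**
     * Returns entry name -> uncompressed size in bytes for every
     * non-directory entry whose name ends in ".pdf" (case-insensitive).
     * The size is -1 if the archive does not record it.
     */
    static Map<String, Long> findPdfEntries(String jarPath) throws IOException {
        Map<String, Long> hits = new LinkedHashMap<>();
        // A jar is a zip archive, so ZipFile can read its entry table directly.
        try (ZipFile jar = new ZipFile(jarPath)) {
            Enumeration<? extends ZipEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String name = entry.getName();
                if (!entry.isDirectory()
                        && name.toLowerCase(Locale.ROOT).endsWith(".pdf")) {
                    hits.put(name, entry.getSize());
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JarPdfScan <path-to-jar>");
            return;
        }
        long total = 0;
        for (Map.Entry<String, Long> hit : findPdfEntries(args[0]).entrySet()) {
            System.out.println(hit.getKey() + "\t" + hit.getValue() + " bytes");
            total += Math.max(hit.getValue(), 0); // ignore unknown (-1) sizes
        }
        System.out.println("total PDF bytes: " + total);
    }
}
```

After compiling, it can be run as `java JarPdfScan <path-to-jar>` against the tika-app or tika-server jar; any entries it prints would be candidates for exclusion in the build.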

Best,

              Tim



-----Original Message-----
From: Tyler Palsulich [mailto:[hidden email]]
Sent: Sunday, March 29, 2015 9:13 AM
To: [hidden email]
Subject: Re: [DISCUSS] Tika 1.8 or 1.7.1

Once TIKA-1584 and TIKA-1575 are resolved, I'll work up an RC (unless
something else pops up).

Thank you everyone.

Tyler

RE: [DISCUSS] Tika 1.8 or 1.7.1

Allison, Timothy B.
All,

I've made the changes that I had hoped to.  Grib pdf exclusion remains for any takers.

Let me know when I should initiate the run against govdocs1 to see if there are any surprises on that corpus with Tika 1.8.

Best,

            Tim

-----Original Message-----
From: Allison, Timothy B. [mailto:[hidden email]]
Sent: Monday, March 30, 2015 7:03 AM
To: [hidden email]
Subject: RE: [DISCUSS] Tika 1.8 or 1.7.1

Unless there are objections, I'd like these to be resolved before 1.8:

TIKA-1584 -- I'll fix
TIKA-1575 -- Resolved by Konstantin Gribov (thank you!)
TIKA-1512 -- I'll put in a temporary fix so that we don't get IOOBEs, but I'll leave this open and do some more digging to see if we need to open a ticket at the POI level
TIKA-1511 -- I'll remove "provided" for xerial

TIKA-1549 -- We should thank Toke Eskildsen in CHANGES.txt, no?

I'll have these fixes completed by noon EDT.  Should I run against govdocs1 before or after the RC?

My last build of Tika app (a few days ago) ballooned to ~43MB, and that's before I add ~3MB for xerial.  Tika server is now ~48MB.  As of my last build, we are still including ~4MB of pdfs (README.NLDAS1.pdf and README.NLDAS2.pdf) from the GRIB(?) parser in the tika-app and tika-server jars.

Best,

              Tim




Re: [DISCUSS] Tika 1.8 or 1.7.1

Tyler Palsulich
I just remembered TIKA-1509 and TIKA-1558 -- testing now for blacklist
functionality through TIKA-1509. If that works, I'll back out TIKA-1558.

Tim, I think you should run govdocs from the RC, in case something changes
between your run and the cut.

Tyler

On Mon, Mar 30, 2015 at 10:17 AM, Allison, Timothy B. <[hidden email]>
wrote:

> All,
>
> I've made the changes that I had hoped to.  Grib pdf exclusion remains for
> any takers.
>
> Let me know when I should initiate the run against govdocs1 to see if
> there are any surprises on that corpus with Tika 1.8.
>
> Best,
>
>             Tim
>

Re: [DISCUSS] Tika 1.8 or 1.7.1

Mattmann, Chris A (3010)
+1 to running tika-batch and govdocs. Woot.







-----Original Message-----
From: Tyler Palsulich <[hidden email]>
Reply-To: "[hidden email]" <[hidden email]>
Date: Monday, March 30, 2015 at 3:22 PM
To: "[hidden email]" <[hidden email]>
Subject: Re: [DISCUSS] Tika 1.8 or 1.7.1

>I just remembered TIKA-1509 and TIKA-1558 -- testing now for blacklist
>functionality through TIKA-1509. If that works, I'll back out TIKA-1558.
>
>Tim, I think you should run govdocs from the RC, in case something changes
>between your run and the cut.
>
>Tyler
>

Re: [DISCUSS] Tika 1.8 or 1.7.1

Mattmann, Chris A (3010)
In reply to this post by Tyler Palsulich
Also I can run the RC on a subset of ImageCat [1] to test the
new RC too when it’s ready.

Cheers,
Chris

[1] https://github.com/chrismattmann/imagecat/

