Tika 2.0?

Allison, Timothy B.
All,

  We're getting some increasing deltas btwn the 2.0 and trunk branches.  Many of these are my fault; I gave up making updates to 2.0 around April/May, I think.

  What would people think of punting on some of the desired goals of 2.0 (e.g. chaining parsers, more structured but still simple metadata) and releasing 2.0 soonish...say 2.0-BETA end of September?

  We've been able to make some major improvements to Tika without breaking backwards compatibility.  We _might_ be able to do that with the outstanding issues for 2.0 when someone has time.

  We could also do the upgrade to jdk 8 with Tika 2.0.

  If this sounds reasonable, I propose creating a 1.x branch from trunk for 1.x maintenance and then reworking trunk to the 2.x structure that Bob Paulin so elegantly worked out.  I figure we can either copy/paste from trunk to the current 2.x (and _hope_ we get all the updates) or use Bob's 2.0 as a model for restructuring trunk.  At this point, I'd prefer the second option.  The key here is to switch "trunk" to 2.0 so that we all have the mindset that 2.0 is what we're focused on.

   The main benefit of this proposal is that we'd have a more modular Tika soon.

   What do you think?

         Best,

               Tim

Re: Tika 2.0?

Sergey Beryozkin
Hi Tim

Having a new major 2.0 master is a good idea IMHO. It will take time to
make it final but it's better to finally make it 'mainstream' and start
having new ideas realized or finalized...

Sergey
On 28/08/17 14:32, Allison, Timothy B. wrote:

> All,
>
>    We're getting some increasing deltas btwn the 2.0 and trunk branches.  Many of these are my fault; I gave up making updates to 2.0 around April/May, I think.
>
>    What would people think of punting on some of the desired goals of 2.0 (e.g. chaining parsers, more structured but still simple metadata) and releasing 2.0 soonish...say 2.0-BETA end of September?
>
>    We've been able to make some major improvements to Tika without breaking backwards compatibility.  We _might_ be able to do that with the outstanding issues for 2.0 when someone has time.
>
>    We could also do the upgrade to jdk 8 with Tika 2.0.
>
>    If this sounds reasonable, I propose creating a 1.x branch from trunk for 1.x maintenance and then reworking trunk to the 2.x structure that Bob Paulin so elegantly worked out.  I figure we can either copy/paste from trunk to the current 2.x (and _hope_ we get all the updates) or use Bob's 2.0 as a model for restructuring trunk.  At this point, I'd prefer the second option.  The key here is to switch "trunk" to 2.0 so that we all have the mindset that 2.0 is what we're focused on.
>
>     The main benefit of this proposal is that we'd have a more modular Tika soon.
>
>     What do you think?
>
>           Best,
>
>                 Tim
>

Re: Tika 2.0?

Bob Paulin-2
In reply to this post by Allison, Timothy B.
Tim,

+1 You've done an admirable job of dual maintenance but it sounds like
it became a heavy tax on development.  Releasing would allow us to get
back to "trunk" based development again.  Then we could focus on porting
any missed patches and start looking for any regressions.  I also like
the idea of picking up Java 8 as many other projects are starting to do
this.

- Bob






Re: Tika 2.0?

Konstantin Gribov
Tim,

+1 to restructuring master into the 2.x shape. If we can at least migrate the
modularization patches and dependency changes and move to Java 8, it will
certainly be a good step forward and a big reduction of technical debt.

On Mon, 28 Aug 2017, 16:52 Bob Paulin <[hidden email]> wrote:

> Tim,
>
> +1 You've done an admirable job of dual maintenance but it sounds like
> it became a heavy tax on development.  Releasing would allow us to get
> back to "trunk" based development again.  Then we could focus on porting
> any missed patches and start looking for any regressions.  I also like
> the idea of picking up Java 8 as many other projects are starting to do
> this.
>
> - Bob

Best regards,
Konstantin Gribov

Re: Tika 2.0?

Mattmann, Chris A (3010)
I am cool to finally get on the 2.0 kool aid and execute the plan as described by Tim
below for our next release.

+1.

Cheers,
Chris


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Principal Data Scientist, Engineering Administrative Office (3010)
Manager, NSF & Open Source Projects Formulation and Development Offices (8212)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 180-503E, Mailstop: 180-503
Email: [hidden email]
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 

On 8/28/17, 8:12 AM, "Konstantin Gribov" <[hidden email]> wrote:

    Tim,
   
    +1 to making restructuring master to 2.x shape. If we can at least migrate
    modularization patches, dependency changes and move to java 8 it certainly
    will be a good step forward and big reduction of technical debt.
   
    Best regards,
    Konstantin Gribov
   


Re: Tika 2.0?

Mattmann, Chris A (3010)
In reply to this post by Konstantin Gribov
BTW, one *very* important thing to do *before* we make the steps below would be
to look very closely at the wiki and all the documentation that has been written with
1.x in mind and either update it or make sure it still applies to our new master after the
2.x merge, and 1.x maintenance branch creation.

Cheers,
Chris


 



RE: Tika 2.0?

Allison, Timothy B.
In reply to this post by Allison, Timothy B.
Y, well, I didn't say _which_ September...

Given my limited availability to work on this in Sept and POI's decision to move to Java 1.8, I propose releasing Tika 1.17 after the release of POI 3.17 and PDFBox 2.0.8.  This would be the last version of Tika at the Java 1.7 level, and then we bump the Java requirement to 1.8, switch master to the 2.0 layout and create a 1.x maintenance branch (with Java 1.8) for quick critical bug fixes/security vulnerabilities until we release 2.0.

What do you all think?

 

Re: Tika 2.0?

Chris Mattmann
+1000



On 9/11/17, 12:03 PM, "Allison, Timothy B." <[hidden email]> wrote:

    Y, well, I didn't say _which_ September...
   
    Given my limited availability to work on this in Sept and POI's decision to move to Java 1.8, I propose releasing Tika 1.17 after the release of POI 3.17 and PDFBox 2.0.8.  This would be the last version of Tika at the Java 1.7 level, and then we bump the Java requirement to 1.8, switch master to the 2.0 layout and create a 1.x maintenance branch (with Java 1.8) for quick critical bug fixes/security vulnerabilities until we release 2.0.
   
    What do you all think?
   
     
   



Re: Tika 2.0?

Bob Paulin-2
Just so it's clear, are we going to:

1) Rename the 2.0 branch over to master

or

2) Re-apply the changes on master. 

I recall Chris' preference was 1, which would be quicker.  However, there
are very likely missed patches.  2 will be more time-consuming, but it
would be more likely to include all the most recent code.  I'm open to
either.  I'm not sure how far out of date the 2.0 branch is, so I defer to
Tim on the risk of going with #1.


- Bob


On 9/11/2017 5:15 PM, Chris Mattmann wrote:

> +1000



RE: Tika 2.0?

Allison, Timothy B.
I'd strongly advocate for 2.  I _think_ the hard work was laying out the general structure and adding the ProxyParser workaround.  Copying and pasting/reworking into that structure:

A) will be far less dangerous than 1
and
B) will give us a cleaner history.

On A), I know that we didn't add some major components including: configurability of parsers, completely cleaned up logging, numerous bug fixes and even entire modules (tika-dl).

On B), there were a few times where I "caught a parser up" in 2.0 not by individual commits based on the original history but based on a copy/paste from the contemporaneous master.  This obliterated the history of some commits on the 2.0 branch and would force us to look back at master.

-----Original Message-----
From: Bob Paulin [mailto:[hidden email]]
Sent: Monday, September 11, 2017 9:48 PM
To: [hidden email]
Subject: Re: Tika 2.0?

Just so it's clear are we going to:

1) Rename the 2.0 branch over to master

or

2) Re-apply the changes on master. 

I recall Chris' preference was 1 which would be quicker.  However there is very likely missed patches.  2 will be more time consuming but it would be more likely to include all the most recent code.  I'm open to either.  Not sure how far out of date 2.0 branch is so I defer to Tim on the risk of going with #1.


- Bob





Re: Tika 2.0?

Chris Mattmann
B it is, proceed.



On 9/12/17, 5:10 AM, "Allison, Timothy B." <[hidden email]> wrote:

    I'd strongly advocate for 2.  I _think_ the hard work was laying out the general structure and adding the ProxyParser workaround.  Copying and pasting/reworking into that structure will be:
   
    A) far less dangerous than 1
    And
    B) we'll have a cleaner history.
   
    On A), I know that we didn't add some major components including: configurability of parsers, completely cleaned up logging, numerous bug fixes and even entire modules (tika-dl).
   
    On B), there were a few times where I "caught a parser up" in 2.0 not by individual commits based on the original history but based on a copy/paste from the contemporaneous master.  This obliterated the history of some commits on the 2.0 branch and would force us to look back at master.
   