Move oldest release archive from lucene/tika to tika?

Jan Høydahl / Cominvent
Hi,

The Lucene release archive still hosts Tika releases v0.2-0.7; see https://archive.apache.org/dist/lucene/tika/. These do not exist in https://archive.apache.org/dist/tika/

We’d prefer if you would copy the missing pieces over to Tika’s release area, and we’ll then remove them from Lucene.

For reference, see https://issues.apache.org/jira/browse/LUCENE-7696

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

Re: Move oldest release archive from lucene/tika to tika?

Chris Mattmann
Hi Jan,

Thanks for the message, but frankly I don’t think we should do this. The releases
have been made and should be canonical, even if they were made from Lucene since
Tika originated there. Many sites over many years have likely mirrored those URLs and
I think this could screw that up. Is there some driving motivation to do this? For example,
the releases are likely small-ish, and don’t take a lot of space.

Just trying to understand.

Cheers,
Chris




On 2/15/17, 3:28 PM, "Jan Høydahl" <[hidden email]> wrote:

    Hi,
   
    The Lucene release archive still hosts Tika releases v0.2-0.7; see https://archive.apache.org/dist/lucene/tika/. These do not exist in https://archive.apache.org/dist/tika/
   
    We’d prefer if you would copy the missing pieces over to Tika’s release area, and we’ll then remove them from Lucene.
   
    For reference, see https://issues.apache.org/jira/browse/LUCENE-7696
   
    --
    Jan Høydahl, search solution architect
    Cominvent AS - www.cominvent.com
   
   


Re: Move oldest release archive from lucene/tika to tika?

Jan Høydahl / Cominvent
Just trying to understand why we need to have Hadoop, Nutch, Tika and Mahout releases lying around when nobody will come to Lucene for them anymore, and I doubt anyone would prefer to use those versions.

So no pressing “need”, just general housekeeping.

Both the Hadoop and Mahout projects have copied all releases over to their respective release folders, and the lucene/hadoop folder is completely gone from the mirrors, although it still remains in the archive.

For Tika, I see that you link to the Lucene archives, so people will actually know that these releases exist :) It is worse for Nutch, which neither has a complete archive itself nor links to the Lucene archives.

I can see the argument that there may still be some automated scripts pulling down 0.7 somewhere. Do we have download stats per file available somewhere?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 16 Feb 2017, at 00:30, Chris Mattmann <[hidden email]> wrote:
>
> Hi Jan,
>
> Thanks for the message, but frankly I don’t think we should do this. The releases
> have been made and should be canonical, even if they were made from Lucene since
> Tika originated there. Many sites over many years have likely mirrored those URLs and
> I think this could screw that up. Is there some driving motivation to do this? For example,
> the releases are likely small-ish, and don’t take a lot of space.
>
> Just trying to understand.
>
> Cheers,
> Chris

Re: Move oldest release archive from lucene/tika to tika?

Chris Mattmann
Hey Jan,

Yeah, I guess the motivation for me is not just for people to know that they exist, but
in general to follow Apache release policy in terms of making releases immutable, in a
persistent location, and so forth.

What do others think? I don’t feel strongly either way. I’m happy to copy the releases
to Tika, but others should chime in here, especially the Tika PMC.

Thanks,
Chris




On 2/16/17, 6:03 AM, "Jan Høydahl" <[hidden email]> wrote:

    Just trying to understand why we need to have Hadoop, Nutch, Tika and Mahout releases lying around when nobody will come to Lucene for them anymore, and I doubt anyone would prefer to use those versions.
   
    So no pressing “need”, just general housekeeping.
   
    Both the Hadoop and Mahout projects have copied all releases over to their respective release folders, and the lucene/hadoop folder is completely gone from the mirrors, although it still remains in the archive.
   
    For Tika, I see that you link to the Lucene archives, so people will actually know that these releases exist :) It is worse for Nutch, which neither has a complete archive itself nor links to the Lucene archives.
   
    I can see the argument that there may still be some automated scripts pulling down 0.7 somewhere. Do we have download stats per file available somewhere?
   
    --
    Jan Høydahl, search solution architect
    Cominvent AS - www.cominvent.com
   
   
   

