5.x to 6.x migration: replacement for Lucene50Codec

Andreas Sewe
Hi,

I am currently attempting a Lucene 5.x to 6.x migration (from 5.2.1 to
6.1, to be precise), and am looking for a replacement for Lucene50Codec:

  indexWriterConfig.setCodec(new Lucene50Codec(Mode.BEST_COMPRESSION));

The org.apache.lucene.codecs.lucene50 package is still there, so what I
am after is probably possible, but unfortunately not obvious.

Any pointers are greatly appreciated.

Best wishes,

Andreas

--
Codetrails GmbH
The knowledge transfer company

Robert-Bosch-Str. 7, 64293 Darmstadt
Phone: +49-6151-276-7092
Mobile: +49-170-811-3791
http://www.codetrails.com/

Managing Director: Dr. Marcel Bruch
Handelsregister: Darmstadt HRB 91940



Re: 5.x to 6.x migration: replacement for Lucene50Codec

Adrien Grand
If you move to Lucene 6.1, this should be Lucene60Codec. More
generally, it is the codec returned by Codec.getDefault().
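
Applied to the snippet from the original question, the change might look roughly like the sketch below. It assumes Lucene 6.0 or 6.1, where Lucene60Codec ships in lucene-core and takes the same Lucene50StoredFieldsFormat.Mode argument as the 5.x codec did (which would also explain why the lucene50 package is still present). The class and method names (CodecConfig, bestCompression) are hypothetical and only there to make the example self-contained.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.Mode;
import org.apache.lucene.codecs.lucene60.Lucene60Codec;
import org.apache.lucene.index.IndexWriterConfig;

// Hypothetical helper; only the Lucene types and setCodec() come from the library.
final class CodecConfig {
    private CodecConfig() {}

    /** Builds a writer config whose stored fields use BEST_COMPRESSION. */
    static IndexWriterConfig bestCompression(Analyzer analyzer) {
        IndexWriterConfig config = new IndexWriterConfig(analyzer);
        // 5.x: config.setCodec(new Lucene50Codec(Mode.BEST_COMPRESSION));
        // 6.0/6.1: same Mode enum, newer codec class.
        config.setCodec(new Lucene60Codec(Mode.BEST_COMPRESSION));
        return config;
    }
}
```

This only affects segments written from now on; it says nothing about how existing segments are read.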

On Wed, 29 Mar 2017 at 18:11, Andreas Sewe <[hidden email]> wrote:

> Hi,
>
> I am currently attempting a Lucene 5.x to 6.x migration (from 5.2.1 to
> 6.1, to be precise), and am looking for a replacement for Lucene50Codec:
>
>   indexWriterConfig.setCodec(new Lucene50Codec(Mode.BEST_COMPRESSION));
>
> The org.apache.lucene.codecs.lucene50 package is still there, so what I
> am after is probably possible, but unfortunately not obvious.
>
> Any pointers are greatly appreciated.
>
> Best wishes,
>
> Andreas

Re: 5.x to 6.x migration: replacement for Lucene50Codec

Andreas Sewe
Hi Adrien,

> If you move to Lucene 6.1, then this should be Lucene60Codec. More
> generally that would be the same codec that is returned by Codec.getDefault.

I should have mentioned that, for compatibility reasons, I still need to
be able to read/write indexes created with the old version, i.e., with
the 5.0 codec.

As the org.apache.lucene.codecs.lucene50 package is still around, I
think that this should be possible; there is just no ready-made Codec
for me to use.

I hope this clarifies things.

Best wishes,

Andreas


RE: 5.x to 6.x migration: replacement for Lucene50Codec

Uwe Schindler
Hi,

You only need to specify a codec when indexing, so for the migration you can simply update that setting. The new codec then applies to all segments newly written to your index.

To read indexes, Lucene automatically loads the codec based on the codec names written to the index files. If you want to open 5.x indexes, lucene-backward-codecs.jar must be on the classpath, as lucene-core.jar does not contain the old codec. Otherwise, opening the index fails with an exception.

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: [hidden email]

> -----Original Message-----
> From: Andreas Sewe [mailto:[hidden email]]
> Sent: Thursday, March 30, 2017 9:17 AM
> To: [hidden email]
> Cc: Adrien Grand <[hidden email]>
> Subject: Re: 5.x to 6.x migration: replacement for Lucene50Codec
>
> Hi Adrien,
>
> > If you move to Lucene 6.1, then this should be Lucene60Codec. More
> > generally that would be the same codec that is returned by
> Codec.getDefault.
>
> I should have mentioned that I for compatibility reasons still need to
> be able to read/write indexes created with the old version, i.e., with
> the 5.0 codec.
>
> As the org.apache.lucene.codecs.lucene50 package is still around, I
> think that this should be possible; there is just no ready-made Codec
> for me to use.
>
> I hope this clarifies things.
>
> Best wishes,
>
> Andreas



---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


RE: 5.x to 6.x migration: replacement for Lucene50Codec

Uwe Schindler
Hi,

> > I should have mentioned that I for compatibility reasons still need to
> > be able to read/write indexes created with the old version, i.e., with
> > the 5.0 codec.

The old codecs are read-only! As said before, you can only specify the codec for IndexWriter. That means new segments added to existing indexes will automatically use the new codec. Old segments already in your index stay with the old codec until they are merged away, at which point they are implicitly upgraded.

As the Lucene 5 codec is read-only, it is impossible to create a new index (or modify an existing index) in a way that keeps it readable with Lucene 5. As soon as you touch an index with the new codec, it contains mixed codec versions and can no longer be read by the old version. But Lucene 6 will happily handle the mixed-codec index; it is designed for that use case (default-codec indexes behave the same way). 😊

Uwe





Re: 5.x to 6.x migration: replacement for Lucene50Codec

Andreas Sewe
Hi Uwe,

>>> I should have mentioned that I for compatibility reasons still need to
>>> be able to read/write indexes created with the old version, i.e., with
>>> the 5.0 codec.
>
> The old codecs are read-only! As said before, you can only specify the codec for IndexWriter. That means new segments added to existing indexes will automatically use the new codec. Old segments already in your index stay with the old codec until they are merged away, at which point they are implicitly upgraded.

Just to be clear: If lucene-backward-codecs.jar 6.1 is on the classpath
(and gives me access to Lucene50Codec), can I specify the Lucene50Codec
in the IndexWriter's IndexWriterConfig and thus get Lucene 6.1 to write
an index compatible with Lucene 5.0?

> As the Lucene 5 codec is read-only, it is impossible to create a new index (or modify an existing index) in a way that keeps it readable with Lucene 5. As soon as you touch an index with the new codec, it contains mixed codec versions and can no longer be read by the old version. But Lucene 6 will happily handle the mixed-codec index; it is designed for that use case (default-codec indexes behave the same way). 😊

That's a cool feature and may work well for my use case. The only thing
I worry about is how ServiceLoader-based Codec discovery will work in an
OSGi environment (specifically, an Eclipse plug-in).

Best wishes,

Andreas


RE: 5.x to 6.x migration: replacement for Lucene50Codec

Uwe Schindler
Hi,

> >>> I should have mentioned that I for compatibility reasons still need to
> >>> be able to read/write indexes created with the old version, i.e., with
> >>> the 5.0 codec.
> >
> > The old codecs are read-only! As said before, you can only specify the
> > codec for IndexWriter. That means new segments added to existing indexes
> > will automatically use the new codec. Old segments already in your index
> > stay with the old codec until they are merged away, at which point they
> > are implicitly upgraded.
>
> Just to be clear: If lucene-backward-codecs.jar 6.1 is on the classpath
> (and gives me access to Lucene50Codec), can I specify the Lucene50Codec
> in the IndexWriter's IndexWriterConfig and thus get Lucene 6.1 to write
> an index compatible with Lucene 5.0?

That does not work: the Lucene50 codec can only READ indexes; writing is not implemented. It exists solely to READ indexes that still contain 5.x segments. This happens via Codec.forName(), based on the codec name written to the index files.

If you pass an instance of the Lucene50 codec to IndexWriter, you will hit an UnsupportedOperationException at some point.

> > As the Lucene 5 codec is read-only, it is impossible to create a new index
> > (or modify an existing index) in a way that keeps it readable with Lucene 5.
> > As soon as you touch an index with the new codec, it contains mixed codec
> > versions and can no longer be read by the old version. But Lucene 6 will
> > happily handle the mixed-codec index; it is designed for that use case
> > (default-codec indexes behave the same way). 😊
>
> That's a cool feature and may work well for my use case. The only thing
> I worry about is how ServiceLoader-based Codec discovery will work in an
> OSGi environment (specifically, an Eclipse plug-in).

Lucene requires ServiceLoader for reading indexes; there is no way around that. The Lucene jar files must _all_ be visible to the same ClassLoader or, alternatively, to the context ClassLoader. If Lucene's ServiceLoader lookup did not work in your environment, opening a DirectoryReader would fail right away with a "not found" exception.
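
One way to check this in an OSGi bundle before opening a real index is to look for the ServiceLoader provider files directly. The sketch below uses only the JDK and is not part of Lucene: the class name CodecServiceProbe is hypothetical, and the resource path follows the standard ServiceLoader convention (META-INF/services/ followed by the service type's fully qualified name, here org.apache.lucene.codecs.Codec). It lists the codec classes that a given class loader can see registered.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper; it does not need Lucene on the compile path.
final class CodecServiceProbe {
    // Provider-configuration file that ServiceLoader-style lookups read for codecs.
    static final String SERVICE_FILE =
            "META-INF/services/org.apache.lucene.codecs.Codec";

    private CodecServiceProbe() {}

    /** Returns the codec class names registered in service files visible to the loader. */
    static List<String> visibleCodecClasses(ClassLoader loader) throws IOException {
        List<String> names = new ArrayList<>();
        if (loader == null) {
            return names; // e.g. no context class loader set on this thread
        }
        for (URL url : Collections.list(loader.getResources(SERVICE_FILE))) {
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    int hash = line.indexOf('#'); // '#' starts a comment
                    if (hash >= 0) {
                        line = line.substring(0, hash);
                    }
                    line = line.trim();
                    if (!line.isEmpty()) {
                        names.add(line);
                    }
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        ClassLoader own = CodecServiceProbe.class.getClassLoader();
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        System.out.println("own loader:     " + visibleCodecClasses(own));
        System.out.println("context loader: " + visibleCodecClasses(context));
    }
}
```

If neither list contains the 5.x codec class while the index still holds 5.x segments, the backward-codecs jar is probably not visible to the loader in question. The probe only shows what is registered; it does not prove that Lucene itself will use that loader.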

Uwe

> Best wishes,
>
> Andreas