Solr 8.0.0 + IndexUpgrader


Herbert Hackelsberger
Hi,

I tried to upgrade my test index from Solr 7.7.1 to Solr 8.0.0.
The file segments_4h7 already contains the string Lucene70.
I upgraded before with this command:

java -cp lucene-core-7.7.1.jar;lucene-backward-codecs-7.7.1.jar org.apache.lucene.index.IndexUpgrader C:\solr\server\solr\syneris\data\index\

The upgrade succeeded: when I start Solr via solr.cmd start, no errors are logged.
Now, to move to Solr 8, I tried to upgrade the index again with the following command:

java -cp lucene-core-8.0.0.jar;lucene-backward-codecs-8.0.0.jar org.apache.lucene.index.IndexUpgrader C:\solr\server\solr\syneris\data\index\

But I always get an exception:

Exception in thread "main" org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource BufferedChecksumIndexInput(MMapIndexInput(path="C:\solr\server\solr\syneris\data\index\segments_4h7"))): This index was initially created with Lucene 6.x while the current version is 8.0.0 and Lucene only supports reading the current and previous major versions.. This version of Lucene only supports indexes created with release 7.0 and later.
        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:318)
        at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:289)
        at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:432)
        at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:429)
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:680)
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:632)
        at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:434)
        at org.apache.lucene.index.DirectoryReader.listCommits(DirectoryReader.java:260)
        at org.apache.lucene.index.IndexUpgrader.upgrade(IndexUpgrader.java:158)
        at org.apache.lucene.index.IndexUpgrader.main(IndexUpgrader.java:78)

Is there any way to fix this without performing a full reindex?


Kind regards

Herbert Hackelsberger
Customer Support / Quality Assurance
______________________________________
TECHNODAT Technische Datenverarbeitung GmbH
Jakob-Haringer-Straße 6
5020  Salzburg / Austria

T  | +43 (0)662 2282-141
F  | +43 (0)662 2282-9
E  | [hidden email]<mailto:[hidden email]>
W | www.technodat.at

Legal form: GmbH; registered office: Salzburg
Commercial register court: Landesgericht Salzburg
FN 64072z; DVR: 0481831; UID-Nr. ATU33826508


Re: Solr 8.0.0 + IndexUpgrader

Shawn Heisey-2
On 4/1/2019 9:19 AM, Herbert Hackelsberger wrote:

> I tried to upgrade my test index from Solr 7.7.1 to Solr 8.0.0.
> The file segments_4h7 already contains the string Lucene70.
> I upgraded before with this command:
>
> java -cp lucene-core-7.7.1.jar;lucene-backward-codecs-7.7.1.jar org.apache.lucene.index.IndexUpgrader C:\solr\server\solr\syneris\data\index\
>
> The upgrade succeeded: when I start Solr via solr.cmd start, no errors are logged.
> Now, to move to Solr 8, I tried to upgrade the index again with the following command:
>
> java -cp lucene-core-8.0.0.jar;lucene-backward-codecs-8.0.0.jar org.apache.lucene.index.IndexUpgrader C:\solr\server\solr\syneris\data\index\

Upgrading through two or more major versions is not supported.  If the
index has ever been touched by version 6.6.x or older, then 8.x will not
be able to read that index, even if it is upgraded to 7.x first.

Reindexing from scratch is the only option.  In my opinion, all indexes
should be rebuilt from scratch when upgrading, even when the new version
can read the old format.

Thanks,
Shawn

Re: Solr 8.0.0 + IndexUpgrader

Erick Erickson
In reply to this post by Herbert Hackelsberger
As of Lucene 6, a version marker is written into each segment, and when segments are merged the lowest marker is preserved. If Lucene X finds any marker from Lucene X-2 or older, you get the error you are seeing.

This has been a source of considerable confusion. The guarantee of “one major revision backwards compatibility” has always meant that in a case like yours, say going from Lucene 5.x -> 7.x, you wouldn’t get a failure, but you _would_ get subtle errors.

From Robert Muir:
“...Because it is a lossy index and does not retain all of the user's data, it's not possible to safely migrate some things automagically…”

IndexUpgrader does not change this restriction. All it does is ensure that all segments have been rewritten by the current version. It cannot recreate data that’s not there in the first place.

Your only choice at this point is to fully re-index.
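
The marker rule described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not Lucene's actual code: the class and method names (VersionMarkerSketch, mergedMarker, isReadable) are made up for illustration, and the marker is reduced to a bare major version number.

```java
import java.util.List;

/**
 * Toy model of the version-marker rule from this thread.
 * Hypothetical names throughout; this is NOT how Lucene implements it.
 */
final class VersionMarkerSketch {

    /** A merged segment keeps the lowest marker of the segments it was built from. */
    static int mergedMarker(List<Integer> segmentMarkers) {
        if (segmentMarkers.isEmpty()) {
            throw new IllegalArgumentException("need at least one segment");
        }
        int lowest = Integer.MAX_VALUE;
        for (int marker : segmentMarkers) {
            lowest = Math.min(lowest, marker);
        }
        return lowest;
    }

    /** Major version X only reads markers X and X-1. */
    static boolean isReadable(int marker, int currentMajor) {
        return marker >= currentMajor - 1 && marker <= currentMajor;
    }

    public static void main(String[] args) {
        // Two segments written under 6.x plus one added later under 7.x.
        List<Integer> segments = List.of(6, 6, 7);

        // Rewriting everything with the 7.x upgrader merges the segments,
        // but the lowest marker survives the merge.
        int marker = mergedMarker(segments);

        System.out.println("marker after 7.x upgrade: " + marker);
        System.out.println("readable by 7.x: " + isReadable(marker, 7));
        System.out.println("readable by 8.x: " + isReadable(marker, 8));
    }
}
```

In this model the merged marker stays at 6, so 7.x accepts the index and 8.x rejects it, which matches the exception in the original post.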

Best,
Erick



Re: Solr 8.0.0 + IndexUpgrader

Herbert Hackelsberger
Thanks for the fast response!

I used the IndexUpgrader to upgrade from 6.x to 7.7.1, and afterwards from 7.7.1 to 8.0.0:

java -cp lucene-core-7.7.1.jar;lucene-backward-codecs-7.7.1.jar org.apache.lucene.index.IndexUpgrader C:\solr\server\solr\syneris\data\index\
java -cp lucene-core-8.0.0.jar;lucene-backward-codecs-8.0.0.jar org.apache.lucene.index.IndexUpgrader C:\solr\server\solr\syneris\data\index\

So, am I correct in the following?
- The IndexUpgrader makes the index usable in the current version, but without all of its new features.
- Running the IndexUpgrader again for the next major version will result in this same error.

Best Regards




Re: Solr 8.0.0 + IndexUpgrader

Shawn Heisey-2
On 4/1/2019 9:47 AM, Herbert Hackelsberger wrote:
> So, am I correct in the following?
> - The IndexUpgrader makes the index usable in the current version, but without all of its new features.
> - Running the IndexUpgrader again for the next major version will result in this same error.

That is correct.

If the "new features" are not related to the index format, then you will
have full access to them even with an older index.

The Lucene IndexUpgrader function does a forceMerge on the index, down
to one segment.  Solr calls that operation "optimize".

There's really no need to use IndexUpgrader.  Solr will directly use an
index from one major version back with no trouble.  If you ask Solr to
optimize the index down to one segment, that is an identical operation
to IndexUpgrader, with the difference that you can still access the
index while it is happening.

Thanks,
Shawn

Re: Solr 8.0.0 + IndexUpgrader

Herbert Hackelsberger
Many Thanks!


Re: Solr 8.0.0 + IndexUpgrader

Erick Erickson
In reply to this post by Shawn Heisey-2
Minor nit: for IndexUpgrader and optimize to be identical, you have to specify maxSegments=1 on optimize.

As of LUCENE-7976, optimize respects the max segment size and does _not_ necessarily rewrite segments that have no deleted documents, especially if they’re near 5 GB, which is the default max segment size.

That nit doesn’t matter in this case, of course…
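
For anyone who wants to try the optimize route, here is a rough sketch of such a request sent to Solr's update handler with optimize=true and maxSegments=1. Assumptions: the base URL http://localhost:8983/solr is just the usual default, the core name syneris is taken from the index path in the original post, and the class name and the --send switch are made up for this example. By default the program only prints the URL and contacts no server.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Hypothetical helper for illustration; not part of Solr or Lucene. */
final class OptimizeRequestSketch {

    /** Builds an update-handler URL asking Solr to optimize down to maxSegments. */
    static String optimizeUrl(String solrBaseUrl, String core, int maxSegments) {
        if (maxSegments < 1) {
            throw new IllegalArgumentException("maxSegments must be >= 1");
        }
        return solrBaseUrl + "/" + core + "/update?optimize=true&maxSegments=" + maxSegments;
    }

    public static void main(String[] args) throws Exception {
        // Assumed host, port, and core name; adjust to your installation.
        String url = optimizeUrl("http://localhost:8983/solr", "syneris", 1);
        System.out.println("GET " + url);

        // Only talk to a server when explicitly asked to (made-up switch).
        if (args.length > 0 && args[0].equals("--send")) {
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("HTTP status: " + response.statusCode());
        }
    }
}
```

As noted earlier in the thread, this does not help with an index that was originally created under 6.x; it only rewrites segments that the running version can already read.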

Best,
Erick


