Recipe for moving to solr cloud without reindexing

Recipe for moving to solr cloud without reindexing

Bjarke Buur Mortensen
Hi List,

Is there a cookbook recipe for moving an existing Solr core to a SolrCloud
collection?

We currently have a single machine with a large core (~150 GB), and we
would like to move to SolrCloud.

I haven't been able to find anything that reuses an existing index, so any
pointers would be much appreciated.

Thanks,
Bjarke

RE: Recipe for moving to solr cloud without reindexing

Markus Jelsma
Hello Bjarke,

If you are not going to shard, you can simply create a 1-shard/1-replica
collection, shut down Solr, copy the data directory into the replica's
directory, and start Solr up again.
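
Roughly, the steps could look like this (the collection name, configset
name, paths and ZooKeeper address below are placeholders, so adjust them
to your setup):

# 1) Create a 1-shard/1-replica collection via the Collections API:
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=mycoll&numShards=1&replicationFactor=1&collection.configName=myconf"

# 2) Stop Solr and copy the old core's index into the new replica's data
#    directory (the directory name follows Solr 7's naming scheme; clear
#    out any empty index the new replica may have created first):
bin/solr stop -all
cp -r /path/to/oldcore/data/index/* \
      /var/solr/data/mycoll_shard1_replica_n1/data/index/

# 3) Start Solr in cloud mode again (ZooKeeper address is an example):
bin/solr start -cloud -z localhost:2181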

Regards,
Markus

Re: Recipe for moving to solr cloud without reindexing

Bjarke Buur Mortensen
Thank you, that is of course one way to go, but I would actually like to
be able to shard. Could I use your approach and then add shards
dynamically?

RE: Recipe for moving to solr cloud without reindexing

Markus Jelsma
Hello Bjarke,

You can use shard splitting:
https://lucene.apache.org/solr/guide/6_6/collections-api.html#CollectionsAPI-splitshard
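
For example (host and collection name are placeholders; on a large index
you may want to run the split asynchronously and poll for completion):

# Split shard1 into two sub-shards:
curl "http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=mycoll&shard=shard1&async=split-1"
# Poll the async request id until the split reports completion:
curl "http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=split-1"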

Regards,
Markus

Re: Recipe for moving to solr cloud without reindexing

Bjarke Buur Mortensen
Right, that seems like the way to go; I will give it a try.

Thanks!
/Bjarke

Re: Recipe for moving to solr cloud without reindexing

Rahul Singh
Bjarke,

I imagine that at some point you may need to shard that data if it grows.
Or do you expect this data to remain static?

Generally you want to add SolrCloud to do three things: 1. increase
availability with replicas; 2. increase data capacity via shards; 3.
increase fault tolerance with leaders and replicas being spread around
the cluster.

By trying not to reindex, you would be bypassing the usual high
availability / distributed computing processes.
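
For example, a sharded and replicated collection would be created with
something like this (names and numbers are placeholders):

curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=mycoll&numShards=2&replicationFactor=2&maxShardsPerNode=2&collection.configName=myconf"
# numShards spreads the data, replicationFactor adds copies of each shard
# for availability and fault tolerance.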

Rahul

Re: Recipe for moving to solr cloud without reindexing

Erick Erickson
Bjarke:

One thing: what version of Solr are you moving _from_ and _to_?
Solr/Lucene only guarantee one major version of backward compatibility, so
you can copy an index created with Solr 6 to another Solr 6 or to Solr 7,
but you couldn't copy an index created with Solr 5 to Solr 7...
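
If you're not sure which version wrote your segments, Lucene's CheckIndex
tool prints per-segment version information (the jar and index paths below
are only examples):

java -cp lucene-core-7.1.0.jar:lucene-backward-codecs-7.1.0.jar \
  org.apache.lucene.index.CheckIndex /var/solr/data/mycore/data/index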

Also note that shard splitting is a very expensive operation, so be patient....

Best,
Erick

Re: Recipe for moving to solr cloud without reindexing

Bjarke Buur Mortensen
Rahul, thanks, I do indeed want to be able to shard.
For now I'll go with Markus' suggestion and try to use the SPLITSHARD
command.

Re: Recipe for moving to solr cloud without reindexing

Bjarke Buur Mortensen
Erick,

thanks, that is of course something I left out of the original question.
Our Solr is 7.1, so that should not present a problem (crossing fingers).

However, on my dev box I'm trying out the steps, and here I have some
segments created with version 6 of Solr.

After having copied data from my non-cloud Solr into my
single-shard-single-replica collection and verified that SolrCloud works
with this collection, I then submit the splitshard command:

http://172.17.0.4:8984/solr/admin/collections?action=SPLITSHARD&collection=procurement&shard=shard1

However, this gives me the error:

org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
Error from server at http://172.17.0.4:8984/solr:
java.lang.IllegalArgumentException: Cannot merge a segment that has been
created with major version 6 into this index which has been created by
major version 7

I have tried running both optimize and IndexUpgrader on the index before
shard splitting, but the same error still occurs.

Any ideas as to why this happens?

Below is the output from running IndexUpgrader, which I cannot decipher.
It both states that "All segments upgraded to version 7.1.0" and that
"all running merges have aborted" ¯\_(ツ)_/¯

Thanks a lot,
Bjarke


======================
java -cp
/opt/solr/server/solr-webapp/webapp/WEB-INF/lib/lucene-backward-codecs-7.1.0.jar:/opt/solr/server/solr-webapp/webapp/WEB-INF/lib/lucene-core-7.1.0.jar
org.apache.lucene.index.IndexUpgrader -delete-prior-commits -verbose
/var/solr/cloud/procurement_shard1_replica_n1/data/index
IFD 0 [2018-08-08T13:00:18.244Z; main]: init: current segments file is
"segments_4vs";
deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@721e0f4f
IFD 0 [2018-08-08T13:00:18.266Z; main]: init: load commit "segments_4vs"
IFD 0 [2018-08-08T13:00:18.270Z; main]: now checkpoint
"_bhg(7.1.0):C108396" [1 segments ; isCommit = false]
IFD 0 [2018-08-08T13:00:18.270Z; main]: 0 msec to checkpoint
IW 0 [2018-08-08T13:00:18.270Z; main]: init: create=false
IW 0 [2018-08-08T13:00:18.273Z; main]:
dir=MMapDirectory@/var/solr/cloud/procurement_shard1_replica_n1/data/index
lockFactory=org.apache.lucene.store.NativeFSLockFactory@6debcae2
index=_bhg(7.1.0):C108396
version=7.1.0
analyzer=null
ramBufferSizeMB=16.0
maxBufferedDocs=-1
mergedSegmentWarmer=null
delPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy
commit=null
openMode=CREATE_OR_APPEND
similarity=org.apache.lucene.search.similarities.BM25Similarity
mergeScheduler=ConcurrentMergeScheduler: maxThreadCount=-1,
maxMergeCount=-1, ioThrottle=true
codec=Lucene70
infoStream=org.apache.lucene.util.PrintStreamInfoStream
mergePolicy=UpgradeIndexMergePolicy([TieredMergePolicy: maxMergeAtOnce=10,
maxMergeAtOnceExplicit=30, maxMergedSegmentMB=5120.0, floorSegmentMB=2.0,
forceMergeDeletesPctAllowed=10.0, segmentsPerTier=10.0,
maxCFSSegmentSizeMB=8.796093022207999E12, noCFSRatio=0.1)
indexerThreadPool=org.apache.lucene.index.DocumentsWriterPerThreadPool@5ba23b66
readerPooling=true
perThreadHardLimitMB=1945
useCompoundFile=true
commitOnClose=true
indexSort=null
writer=org.apache.lucene.index.IndexWriter@2ff4f00f

IW 0 [2018-08-08T13:00:18.273Z; main]: MMapDirectory.UNMAP_SUPPORTED=true
IndexUpgrader 0 [2018-08-08T13:00:18.274Z; main]: Upgrading all pre-7.1.0
segments of index directory
'MMapDirectory@/var/solr/cloud/procurement_shard1_replica_n1/data/index
lockFactory=org.apache.lucene.store.NativeFSLockFactory@6debcae2' to
version 7.1.0...
IW 0 [2018-08-08T13:00:18.274Z; main]: forceMerge: index now
_bhg(7.1.0):C108396
IW 0 [2018-08-08T13:00:18.274Z; main]: now flush at forceMerge
IW 0 [2018-08-08T13:00:18.274Z; main]:   start flush: applyAllDeletes=true
IW 0 [2018-08-08T13:00:18.274Z; main]:   index before flush
_bhg(7.1.0):C108396
DW 0 [2018-08-08T13:00:18.274Z; main]: startFullFlush
DW 0 [2018-08-08T13:00:18.275Z; main]: main finishFullFlush success=true
IW 0 [2018-08-08T13:00:18.275Z; main]: now apply all deletes for all
segments buffered updates bytesUsed=0 reader pool bytesUsed=0
BD 0 [2018-08-08T13:00:18.275Z; main]: waitApply: no deletes to apply
UPGMP 0 [2018-08-08T13:00:18.276Z; main]: findForcedMerges:
segmentsToUpgrade={}
MS 0 [2018-08-08T13:00:18.282Z; main]: initDynamicDefaults spins=true
maxThreadCount=1 maxMergeCount=6
MS 0 [2018-08-08T13:00:18.282Z; main]: now merge
MS 0 [2018-08-08T13:00:18.282Z; main]:   index: _bhg(7.1.0):C108396
MS 0 [2018-08-08T13:00:18.282Z; main]:   no more merges pending; now return
IndexUpgrader 0 [2018-08-08T13:00:18.282Z; main]: All segments upgraded to
version 7.1.0
IndexUpgrader 0 [2018-08-08T13:00:18.283Z; main]: Enforcing commit to
rewrite all index metadata...
IW 0 [2018-08-08T13:00:18.283Z; main]: commit: start
IW 0 [2018-08-08T13:00:18.283Z; main]: commit: enter lock
IW 0 [2018-08-08T13:00:18.283Z; main]: commit: now prepare
IW 0 [2018-08-08T13:00:18.283Z; main]: prepareCommit: flush
IW 0 [2018-08-08T13:00:18.283Z; main]:   index before flush
_bhg(7.1.0):C108396
DW 0 [2018-08-08T13:00:18.283Z; main]: startFullFlush
IW 0 [2018-08-08T13:00:18.283Z; main]: now apply all deletes for all
segments buffered updates bytesUsed=0 reader pool bytesUsed=0
BD 0 [2018-08-08T13:00:18.283Z; main]: waitApply: no deletes to apply
DW 0 [2018-08-08T13:00:18.284Z; main]: main finishFullFlush success=true
IW 0 [2018-08-08T13:00:18.284Z; main]: startCommit(): start
IW 0 [2018-08-08T13:00:18.284Z; main]: startCommit
index=_bhg(7.1.0):C108396 changeCount=2
IW 0 [2018-08-08T13:00:18.293Z; main]: startCommit: wrote pending segments
file "pending_segments_4vt"
IW 0 [2018-08-08T13:00:18.295Z; main]: done all syncs:
[_bhg_Lucene50_0.tip, _bhg.fdx, _bhg.fnm, _bhg.nvm, _bhg.fdt, _bhg.si,
_bhg_Lucene50_0.pos, _bhg.nvd, _bhg_Lucene50_0.doc, _bhg_Lucene50_0.tim]
IW 0 [2018-08-08T13:00:18.295Z; main]: commit: pendingCommit != null
IW 0 [2018-08-08T13:00:18.298Z; main]: commit: done writing segments file
"segments_4vt"
IFD 0 [2018-08-08T13:00:18.298Z; main]: now checkpoint
"_bhg(7.1.0):C108396" [1 segments ; isCommit = true]
IFD 0 [2018-08-08T13:00:18.298Z; main]: deleteCommits: now decRef commit
"segments_4vs"
IFD 0 [2018-08-08T13:00:18.298Z; main]: delete [segments_4vs]
IFD 0 [2018-08-08T13:00:18.299Z; main]: 0 msec to checkpoint
IW 0 [2018-08-08T13:00:18.319Z; main]: commit: took 16.0 msec
IW 0 [2018-08-08T13:00:18.319Z; main]: commit: done
IndexUpgrader 0 [2018-08-08T13:00:18.319Z; main]: Committed upgraded
metadata to index.
IW 0 [2018-08-08T13:00:18.319Z; main]: now flush at close
IW 0 [2018-08-08T13:00:18.319Z; main]:   start flush: applyAllDeletes=true
IW 0 [2018-08-08T13:00:18.319Z; main]:   index before flush
_bhg(7.1.0):C108396
DW 0 [2018-08-08T13:00:18.319Z; main]: startFullFlush
DW 0 [2018-08-08T13:00:18.320Z; main]: main finishFullFlush success=true
IW 0 [2018-08-08T13:00:18.320Z; main]: now apply all deletes for all
segments buffered updates bytesUsed=0 reader pool bytesUsed=0
BD 0 [2018-08-08T13:00:18.320Z; main]: waitApply: no deletes to apply
MS 0 [2018-08-08T13:00:18.320Z; main]: updateMergeThreads ioThrottle=true
targetMBPerSec=10240.0 MB/sec
MS 0 [2018-08-08T13:00:18.320Z; main]: now merge
MS 0 [2018-08-08T13:00:18.321Z; main]:   index: _bhg(7.1.0):C108396
MS 0 [2018-08-08T13:00:18.321Z; main]:   no more merges pending; now return
IW 0 [2018-08-08T13:00:18.321Z; main]: waitForMerges
IW 0 [2018-08-08T13:00:18.321Z; main]: waitForMerges done
IW 0 [2018-08-08T13:00:18.321Z; main]: commit: start
IW 0 [2018-08-08T13:00:18.321Z; main]: commit: enter lock
IW 0 [2018-08-08T13:00:18.321Z; main]: commit: now prepare
IW 0 [2018-08-08T13:00:18.321Z; main]: prepareCommit: flush
IW 0 [2018-08-08T13:00:18.321Z; main]:   index before flush
_bhg(7.1.0):C108396
DW 0 [2018-08-08T13:00:18.321Z; main]: startFullFlush
IW 0 [2018-08-08T13:00:18.321Z; main]: now apply all deletes for all
segments buffered updates bytesUsed=0 reader pool bytesUsed=0
BD 0 [2018-08-08T13:00:18.322Z; main]: waitApply: no deletes to apply
DW 0 [2018-08-08T13:00:18.322Z; main]: main finishFullFlush success=true
IW 0 [2018-08-08T13:00:18.322Z; main]: startCommit(): start
IW 0 [2018-08-08T13:00:18.322Z; main]:   skip startCommit(): no changes
pending
IW 0 [2018-08-08T13:00:18.322Z; main]: commit: pendingCommit == null; skip
IW 0 [2018-08-08T13:00:18.322Z; main]: commit: took 0.9 msec
IW 0 [2018-08-08T13:00:18.322Z; main]: commit: done
IW 0 [2018-08-08T13:00:18.322Z; main]: rollback
IW 0 [2018-08-08T13:00:18.322Z; main]: all running merges have aborted
IW 0 [2018-08-08T13:00:18.323Z; main]: rollback: done finish merges
DW 0 [2018-08-08T13:00:18.323Z; main]: abort
DW 0 [2018-08-08T13:00:18.323Z; main]: done abort success=true
IW 0 [2018-08-08T13:00:18.323Z; main]: rollback: infos=_bhg(7.1.0):C108396
IFD 0 [2018-08-08T13:00:18.323Z; main]: now checkpoint
"_bhg(7.1.0):C108396" [1 segments ; isCommit = false]
IFD 0 [2018-08-08T13:00:18.323Z; main]: 0 msec to checkpoint

Re: Recipe for moving to solr cloud without reindexing

Erick Erickson
Bjarke:

Using SPLITSHARD on an index with 6.x segments just seems not to work,
even outside the standalone -> cloud issue. I'll raise a JIRA.
Meanwhile, I think you'll have to re-index, I'm afraid.

Thanks for raising the issue.

Erick

Re: Recipe for moving to solr cloud without reindexing

Bjarke Buur Mortensen
OK, thanks.

As long as it's my dev box, reindexing is fine. I just hope my assumption
holds that our prod Solr contains 7.x segments only.
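
One way I could check is the per-core segments handler (host and core name
below are placeholders, and the exact response fields may vary by
release), or the CheckIndex command shown earlier in the thread:

curl "http://localhost:8983/solr/mycore/admin/segments?wt=json"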

Thanks again,
Bjarke

Re: Recipe for moving to solr cloud without reindexing

Erick Erickson
See: https://issues.apache.org/jira/browse/SOLR-12646