how to recover state.json files


how to recover state.json files

Yogendra Kumar Soni
How can we determine attributes such as shard names and hash ranges, along
with their associated core names, if we have lost the state.json file from
ZooKeeper?
core.properties contains only core-level information; hash ranges are
not stored there.

Does Solr store collection and shard information anywhere else?
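
To illustrate what can be recovered from core.properties alone, here is a
minimal sketch that parses such files and groups core names by collection
and shard. The key names (name, shard, collection, coreNodeName) and the
sample values are assumptions based on a typical SolrCloud core.properties,
so check them against your own files. No hash range appears in the output,
which is exactly the missing piece.

```python
def parse_core_properties(text):
    """Parse Java-style key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def cores_by_shard(property_texts):
    """Group core names by (collection, shard) across several core.properties bodies."""
    layout = {}
    for text in property_texts:
        props = parse_core_properties(text)
        key = (props.get("collection"), props.get("shard"))
        layout.setdefault(key, []).append(props.get("name"))
    return layout


# Hypothetical sample contents; real files live in each core's directory.
SAMPLE = """\
#Written by CorePropertiesLocator
name=mycoll_shard1_replica_n1
shard=shard1
collection=mycoll
coreNodeName=core_node2
"""

if __name__ == "__main__":
    print(cores_by_shard([SAMPLE]))
```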



--
*Thanks and Regards,*
*Yogendra Kumar Soni*

Re: how to recover state.json files

Bernd Fehling
Have you lost the dataDir from all ZooKeeper nodes?

If not, first take a backup of the remaining dataDir and then start that
ZooKeeper. Use ZooInspector to connect to it at localhost and retrieve your
state.json, along with all other configs and settings.
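
The "take a backup first" step could be sketched as below: a minimal
example that archives a directory into a timestamped .tar.gz. The function
name, the archive naming scheme and the paths are hypothetical. It assumes
ZooKeeper is stopped while the copy is made and that the backup location
is outside the directory being archived.

```python
import tarfile
import time
from pathlib import Path


def backup_data_dir(data_dir, backup_root):
    """Archive data_dir into backup_root as a timestamped .tar.gz; return the archive path."""
    data_dir = Path(data_dir)
    backup_root = Path(backup_root)
    backup_root.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = backup_root / f"zk-dataDir-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname stores paths relative to the directory name, not absolute paths
        tar.add(str(data_dir), arcname=data_dir.name)
    return archive
```

For example, backup_data_dir("/var/lib/zookeeper", "/backups/zk") would
write one archive per call; both paths here are placeholders.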



Re: how to recover state.json files

Erick Erickson
How did you "lose" the data? Exactly what happened?

Where does the dataDir variable point in your
zoo.cfg file? By default it points to /tmp/zookeeper,
which the operating system can delete when
the machine is restarted.
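
A quick way to check this, as a rough sketch: read the dataDir value out
of the zoo.cfg text and flag it if it lives under /tmp. The helper names
are made up for illustration, and the check treats only /tmp as volatile.

```python
def read_data_dir(zoo_cfg_text):
    """Return the dataDir value from zoo.cfg text, or None if it is not set."""
    for line in zoo_cfg_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "dataDir":
            return value.strip()
    return None


def is_volatile(path):
    """True if the path is /tmp itself or sits below it."""
    return path == "/tmp" or path.startswith("/tmp/")


if __name__ == "__main__":
    cfg = "tickTime=2000\ndataDir=/tmp/zookeeper\nclientPort=2181\n"
    data_dir = read_data_dir(cfg)
    if data_dir and is_volatile(data_dir):
        print("dataDir", data_dir, "may not survive a reboot")
```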

Otherwise you can get/put arbitrary znodes by
using "bin/solr zk cp....". Try "bin/solr zk -help" to
see the options. What I'd do to start is create
a new collection and use the state.json
as a template.

Assuming, of course, that Bernd's suggestion
is impossible.

Best,
Erick


Re: how to recover state.json files

Gus Heck
Not a direct solution, but manipulating data in Zookeeper can be made
easier with https://github.com/rgs1/zk_shell

--
http://www.the111shift.com

Re: how to recover state.json files

Yogendra Kumar Soni
The ZooKeeper directory was deleted by mistake, and the dataDir was inside
the ZooKeeper directory.
We manually recreated the ZooKeeper files, using another Solr instance,
core.properties, etc. as references. SolrCloud is up and running and we are
able to search. The correct hash ranges for each shard are the only missing
piece.
Does Solr store shard and hash ranges somewhere, similar to core.properties?
Is there any other method to recreate a collection using the existing core
data and configset files (schema.xml, solrconfig.xml)?



Re: how to recover state.json files

Erick Erickson
bq. Does solr stores shard and hash ranges somewhere similar to core.properties?

No, but it's easy enough to get them: just create another dummy
collection with the same number of shards and copy the hash ranges
from the dummy collection to ZK.
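
The copy step can be sketched as follows, working on state.json contents
that have already been downloaded and parsed. This is an illustration, not
a tested recovery procedure: the collection names, core names and the
trimmed-down state.json layout (collection name, then "shards", then a
"range" per shard) are assumptions to verify against a real state.json
from your Solr version. It also assumes the lost collection was created
with the same number of shards, the same shard names and the same router
as the dummy one.

```python
import copy
import json


def copy_hash_ranges(dummy_state, dummy_name, target_state, target_name):
    """Return a copy of target_state whose shards carry the dummy collection's ranges."""
    dummy_shards = dummy_state[dummy_name]["shards"]
    result = copy.deepcopy(target_state)
    target_shards = result[target_name]["shards"]
    if set(dummy_shards) != set(target_shards):
        raise ValueError("shard names differ between dummy and target collections")
    for shard, info in dummy_shards.items():
        target_shards[shard]["range"] = info["range"]
    return result


# Illustrative, heavily trimmed state.json contents (hypothetical names).
DUMMY = {"dummy": {"shards": {
    "shard1": {"range": "80000000-ffffffff", "state": "active"},
    "shard2": {"range": "0-7fffffff", "state": "active"},
}}}
TARGET = {"mycoll": {"shards": {
    "shard1": {"state": "active", "replicas": {"core_node2": {"core": "mycoll_shard1_replica_n1"}}},
    "shard2": {"state": "active", "replicas": {"core_node4": {"core": "mycoll_shard2_replica_n3"}}},
}}}

if __name__ == "__main__":
    print(json.dumps(copy_hash_ranges(DUMMY, "dummy", TARGET, "mycoll"), indent=2))
```

The function returns a new dictionary and leaves its inputs untouched, so
the result can be inspected before anything is written back to ZK.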

bq.  also dataDir was inside zookeeper dir.

What? Are you saying _solr_'s data dir was a child of ZooKeeper?
That's highly unusual. And you also said that the ZK node was
inadvertently removed. If your dataDir is located under your ZK nodes,
that would explain losing the Solr replica data.

I _strongly_ recommend you separate where Solr stores its indexes from
ZooKeeper's data. I'd also be curious about how big your ZK snapshots
are; it's vaguely possible that the ZK snapshots are including your
indexes, which would be extremely wasteful.
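
Both checks can be roughed out in a few lines: whether one directory sits
inside another, and how many bytes the snapshot files take up. The helper
names are hypothetical, and the "snapshot.*" file pattern is an assumption
about how ZooKeeper names snapshot files under its dataDir.

```python
from pathlib import Path


def is_inside(child, parent):
    """True if child is parent itself or located somewhere below it."""
    child, parent = Path(child).resolve(), Path(parent).resolve()
    return child == parent or parent in child.parents


def total_size(directory, pattern="snapshot.*"):
    """Sum the sizes in bytes of files under directory whose names match pattern."""
    return sum(p.stat().st_size for p in Path(directory).rglob(pattern) if p.is_file())
```

If is_inside(solr_data_dir, zk_data_dir) returns True for your paths, the
two stores share a directory tree and removing one can remove the other.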

Best,
Erick




Re: how to recover state.json files

Shawn Heisey-2
On 1/9/2019 4:25 AM, Yogendra Kumar Soni wrote:
> How to know attributes like shard name and hash ranges with associated core
> names if we lost state.json file from zookeeper.
> core.properties only contains core level information but hash ranges are
> not stored there.
>
> Does solr stores collection information, shards information anywhere.

If you completely lose the information in ZooKeeper, SolrCloud will
cease to function properly.  Some of that information is not stored
anywhere else.  This is why you should have redundancy at the ZK level,
which means a minimum of 3 hosts. Making occasional backups of the ZK
data directory would also be a good idea.
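
The arithmetic behind the 3-host minimum: a ZooKeeper ensemble keeps
working only while a majority of its servers are up. A small sketch, with
made-up function names, shows how many failures each ensemble size can
absorb.

```python
def quorum_size(ensemble_size):
    """Number of servers that must be up to form a majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """Number of servers that can be lost while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, "servers: quorum", quorum_size(n), "tolerates", tolerated_failures(n))
```

One or two servers tolerate no failures at all, which is why 3 is the
smallest ensemble that gives any redundancy.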

Responding to another thread you started:  If you start SolrCloud with
an empty zookeeper, none of the cores relating to collections that are
not found in ZK will start.  This is because without the info in ZK,
Solr has no idea how to use those cores. The data contained in those
cores should not be erased, though. It should be ignored.  The only way
I can imagine the data getting deleted on Solr startup is a situation
where the collection *is* in ZK and the replicas that are already online
are empty.

Thanks,
Shawn


Re: how to recover state.json files

Yogendra Kumar Soni
Thanks Shawn and Erick, I got the point:
* Solr does not store every piece of information from state.json elsewhere,
but it can be recreated.

However, something is not adding up:

> cores should not be erased, though. It should be ignored.

That was my earlier assumption too. But I restarted SolrCloud with no
collections in ZK, and the core directories were deleted from disk.

* Checked with du and ls.

> The only way
> I can imagine the data getting deleted on Solr startup is a situation
> where the collection *is* in ZK and the replicas that are already online
> are empty.

The collection was not in ZK, and there was only 1 replica per shard. The
Solr nodes were in a down state.

It would be helpful if you could reproduce this and advise. To reproduce:
delete /collections in ZooKeeper and restart SolrCloud.
