Move SOLR from cloudera HDFS to SOLR on Docker

Wael Kader
Hello,

I want to move data from my Solr setup on Cloudera Hadoop to a Solr
Docker container.
I don't need to run all the Hadoop services in my setup, since Solr is
currently the only thing I use from the Cloudera HDP.

My main question is what the best way is to move the data and schema
to the Docker container.
I don't mind running an older version of the Solr container to match
the Solr 4.10.3 version I have on Cloudera.

Any help is much appreciated.

--
Regards,
Wael

Re: Move SOLR from cloudera HDFS to SOLR on Docker

Jason Gerlowski
Hi Wael,

Getting configs and data out of Cloudera's HDP is about the same as
moving configs and data between any two Solr clusters.

Moving configs is going to be the easy part.

If you're currently using Solr in SolrCloud mode, then your configs
all live in ZooKeeper.  Recent versions of Solr have a utility for
downloading and uploading collection configs from ZooKeeper: run
"bin/solr zk" for more details.  Without checking, I'm not sure
whether this tool is available as far back as 4.10.3.  Given how the
tool works, though, I believe a current version would work against an
older SolrCloud install, so you can download a more recent Solr
release and use its tool to extract your configs and re-upload them
where you need them.
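
As a rough sketch of that download and re-upload step, something like
the following might work.  The ZooKeeper addresses, the config name
and the local directory are placeholders I've made up, not values from
your setup, and the DRY_RUN switch is only a convenience for
previewing the commands before running them.

```shell
#!/bin/sh
# Sketch only: every host, port, name and path here is a placeholder.
SOLR_BIN="${SOLR_BIN:-bin/solr}"   # bin/solr script from a recent Solr download

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# download_config ZK_HOST CONFIG_NAME LOCAL_DIR
# Copies the named config set out of ZooKeeper into LOCAL_DIR.
download_config() {
  run "$SOLR_BIN" zk downconfig -z "$1" -n "$2" -d "$3"
}

# upload_config ZK_HOST CONFIG_NAME LOCAL_DIR
# Uploads LOCAL_DIR into ZooKeeper under the given config name.
upload_config() {
  run "$SOLR_BIN" zk upconfig -z "$1" -n "$2" -d "$3"
}

# Hypothetical usage (old cluster first, then the new one):
#   download_config old-zk:2181/solr myconf ./myconf
#   upload_config   new-zk:2181      myconf ./myconf
```

The "/solr" suffix on the old ZooKeeper address is an assumed chroot;
use whatever your existing install was actually started with.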

If you're _not_ using SolrCloud, your collection configs will be on
disk, and moving them between installs is as simple as moving them on
disk.
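
For the non-SolrCloud case, the copy could be as small as the sketch
below.  The function name and the example paths are hypothetical, and
it assumes each core keeps its configuration in a "conf" subdirectory
of the core directory.

```shell
#!/bin/sh
# Sketch only: paths are placeholders for wherever your cores live.

# copy_core_conf OLD_CORE_DIR NEW_CORE_DIR
# Copies OLD_CORE_DIR/conf to NEW_CORE_DIR/conf, creating NEW_CORE_DIR
# if needed.  Returns non-zero if the old core has no conf directory.
copy_core_conf() {
  old_core="$1"
  new_core="$2"
  if [ ! -d "$old_core/conf" ]; then
    echo "no conf directory under $old_core" >&2
    return 1
  fi
  mkdir -p "$new_core" && cp -R "$old_core/conf" "$new_core/"
}

# Hypothetical usage:
#   copy_core_conf /path/to/old/solr/collection1 ./docker-cores/collection1
```

The resulting directory can then be made visible to the container, for
example with a bind mount (docker run -v); the right target path
inside the container depends on the image you pick.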

Much more complicated is getting your index data into your new
install.  If you stay on the same Solr version, you should be able to
re-use your existing index files.  That said, recent releases have
seen Solr make strides in becoming cloud/Docker aware, or at least
tolerant.  8.3.1 or 7.7.2 will likely be easier to manage on Docker
than 4.10.3.  Additionally, 4.10.3 no longer receives any security
backports from the community, and hasn't for some time.  Moving to a
newer version means reindexing, since those releases can't read a 4.x
index directly, so it's worth considering whether upgrading offers
enough benefit to justify that pain.
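
If you do stay on 4.10.3 and re-use the index, the files first have to
come out of HDFS.  Here is a rough sketch, assuming the hdfs client is
on your PATH.  The HDFS path in the usage line is a made-up
placeholder, so check where your cores actually keep their data, and
stop indexing before you copy.

```shell
#!/bin/sh
# Sketch only: the HDFS and local paths are placeholders.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# fetch_index HDFS_INDEX_DIR LOCAL_DATA_DIR
# Copies an index directory from HDFS into LOCAL_DATA_DIR on local disk.
fetch_index() {
  run mkdir -p "$2" && run hdfs dfs -get "$1" "$2/"
}

# Hypothetical usage:
#   fetch_index /solr/mycollection/core_node1/data/index ./mycore/data
```

One thing to watch for: on an HDFS-backed install the core's
solrconfig.xml usually points at the HDFS directory factory and lock
type, so those settings would probably need to be switched back to
local-disk ones before the copied core will load.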

Best,

Jason
