Index database with SolrJ using xml file directly throws an error


sami
I would like to index my database using the SolrJ Java API. I have already tried
DIH directly from the Solr server; it works and indexes well. But when I use the
same XML config file with SolrJ, it throws an error.

**Solr version 7.6.0 SolrJ 7.6.0**

Here is the full code I am using:

    String url = "http://localhost:8983/solr/test";
    String dataConfig = "D:/solr-7.6.0/server/solr/test/conf/solrconfig.xml";
    HttpSolrClient server = new HttpSolrClient.Builder(url).build();
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("qt", "/dataimport");
    params.set("command", "full-import");
    params.set("clean", "true");
    params.set("commit", "true");
    params.set("optimize", "true");
    params.set("dataConfig", dataConfig);
    server.query(params);

But using this piece of code throws an error.

    Exception in thread "main" org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
    Error from server at http://localhost:8983/solr/test: Data Config problem: Content is not allowed in Prolog.

Am I doing it right? Reference:
https://stackoverflow.com/questions/31446644/how-to-do-solr-dataimport-i-e-from-rdbms-using-java-api/54905578#54905578

Is there any other way to index directly?



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Index database with SolrJ using xml file directly throws an error

Erick Erickson
That error usually means there are characters (even spaces) at the
_beginning_ of the xml file. DIH may be more forgiving on that front.

Basically, anything preceding the opening tag may cause this error.

Best,
Erick
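
For illustration, the error can be reproduced outside Solr. The sketch below is not from the thread: it uses only the JDK's built-in XML parser, and the class name `PrologCheck` and method `parseError` are hypothetical. It shows that text which does not begin with markup, such as a file path passed where an XML document is expected, is rejected by the parser, while a minimal XML document is accepted.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class PrologCheck {

    // Returns null if the text parses as XML, otherwise a description of the
    // parser error. (Hypothetical helper, not part of Solr or SolrJ.)
    public static String parseError(String text) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(text)));
            return null;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // A minimal XML document: parses without error.
        System.out.println(parseError("<dataConfig/>"));

        // A file path is not XML: the parser sees plain characters before
        // any markup and reports an error.
        System.out.println(parseError("D:/solr-7.6.0/server/solr/test/conf/solrconfig.xml"));
    }
}
```

The JDK parser may also print its own diagnostic to standard error for the second call; the exact wording of the message depends on the parser implementation and locale.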

In reply to sami's message of Thu, Feb 28, 2019 at 8:24 AM.

RE: Index database with SolrJ using xml file directly throws an error

Dyer, James-2
In reply to this post by sami
The "dataConfig" parameter should hold an actual XML document, which overrides the data-config.xml file you store in ZooKeeper (cloud) or in the configuration directory (standalone). Typically you do not use this parameter. Instead, specify the "config" parameter with the filename (e.g. data-config.xml). This file is the DIH configuration, not solrconfig.xml as you are using. Give just the filename, or a path relative to the base configuration directory, not a full path as you are using.

Unless you want users to override the DIH configuration at request time, it is best to specify the filename using the "config" parameter in the "invariants" section of the request handler in solrconfig.xml.

In reply to sami's message of Thursday, February 28, 2019, 8:36 AM.

RE: Index database with SolrJ using xml file directly throws an error

sami
Hi James,

Thanks for your reply. I am not absolutely sure I understood everything
correctly here. I would like to index my database, starting with a fresh index.
I have already done it with the DIH execute function.

<http://lucene.472066.n3.nabble.com/file/t494676/test1.png>

It works absolutely fine. But I want to use the SolrJ API instead of the
built-in execute function. The data-config.xml and solrconfig.xml files work
fine with my database.

I am using the same data-config.xml and solrconfig.xml files to do the
indexing with the program from my original question.

    String url = "http://localhost:8983/solr/test";
    HttpSolrClient server = new HttpSolrClient.Builder(url).build();
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("qt", "/dataimport");
    params.set("command", "full-import");
    params.set("clean", "true");
    params.set("commit", "true");
    params.set("optimize", "true");
    // I tried this too, since you suggested not using the full path.
    params.set("dataConfig", "data-config.xml");
    server.query(params);

I checked the XML file for any bogus characters too. But the same files work
fine with the built-in DIH and not with the code. What could it be?



RE: Index database with SolrJ using xml file directly throws an error

Dyer, James-2
Instead of dataConfig=data-config.xml, use config=data-config.xml.
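
As a rough sketch of what the corrected request carries (not code from the thread), the example below assembles the same parameters as sami's program, with `config` in place of `dataConfig`, and renders them as a request URL. It uses plain JDK classes rather than SolrJ so that it runs without extra dependencies; the class name `DihRequest` and its methods are hypothetical, and the `/dataimport` handler path and core URL are taken from the code earlier in the thread.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class DihRequest {

    // Parameters for a DIH full import. "config" names the DIH configuration
    // file relative to the core's configuration directory; it is neither the
    // XML content itself nor an absolute path.
    public static Map<String, String> fullImportParams(String configFile) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("command", "full-import");
        params.put("clean", "true");
        params.put("commit", "true");
        params.put("optimize", "true");
        params.put("config", configFile);
        return params;
    }

    // Renders the parameters as a URL for the /dataimport handler of a core.
    public static String toUrl(String coreUrl, Map<String, String> params) {
        String query = params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
        return coreUrl + "/dataimport?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = fullImportParams("data-config.xml");
        System.out.println(toUrl("http://localhost:8983/solr/test", params));
    }
}
```

With SolrJ, the same key/value pairs would go into the `ModifiableSolrParams` object from the earlier code, together with `qt=/dataimport`; the only change from the failing program is the parameter name.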

In reply to sami's message of Friday, March 1, 2019, 3:05 AM.

RE: Index database with SolrJ using xml file directly throws an error

sami
Thanks James,

It works!


