Getting error while executing full import

Getting error while executing full import

ankur.168
Hi All,

I am trying to use Solr with 2 cores interacting with 2 different databases. One core executes full-import successfully, whereas when I run it for the 2nd one it throws a table-or-view-not-found exception. If I run the query directly, it works fine. Below is the error message I am getting. Kindly help me; I am not able to understand what the issue could be here. I am using Solr 6.4.1.

2017-04-10 09:17:23.167 INFO  (Thread-14) [   x:aggr_content] o.a.s.h.d.DataImporter Starting Full Import
2017-04-10 09:17:23.183 WARN  (Thread-14) [   x:aggr_content] o.a.s.h.d.SimplePropertiesWriter Unable to read: dataimport.properties
2017-04-10 09:17:23.304 INFO  (Thread-14) [   x:aggr_content] o.a.s.h.d.JdbcDataSource Creating a connection for entity aggrPropertiesList with URL: jdbc:oracle:thin:@hostname:1521/serviceId
2017-04-10 09:17:23.465 INFO  (qtp1348949648-19) [   x:aggr_content] o.a.s.c.S.Request [aggr_content] webapp=/solr path=/dataimport params={indent=on&wt=json&command=status&_=1491815835958} status=0 QTime=0
2017-04-10 09:17:23.569 INFO  (Thread-14) [   x:aggr_content] o.a.s.h.d.JdbcDataSource Time taken for getConnection(): 263
2017-04-10 09:17:23.630 ERROR (Thread-14) [   x:aggr_content] o.a.s.h.d.DocBuilder Exception while processing: aggrList document : SolrInputDocument(fields: []):org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: SELECT GLBL_ID FROM AGGR_OWNER2.GLBL_DETAILS Processing Document # 1
        at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
        at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:327)
        at org.apache.solr.handler.dataimport.JdbcDataSource.createResultSetIterator(JdbcDataSource.java:288)
        at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:283)
        at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:52)
        at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
        at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:244)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:475)
        at org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:458)
        at org.apache.solr.handler.dataimport.DataImporter$$Lambda$93/239004134.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist
        at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:447)
        at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
        at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:951)
        at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:513)
        at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:227)
        at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
        at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:195)
        at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:876)
        at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1175)
        at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1296)
        at oracle.jdbc.driver.OracleStatement.executeInternal(OracleStatement.java:1916)
        at oracle.jdbc.driver.OracleStatement.execute(OracleStatement.java:1878)
        at oracle.jdbc.driver.OracleStatementWrapper.execute(OracleStatementWrapper.java:318)
        at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.executeStatement(JdbcDataSource.java:349)
        at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:321)
        ... 15 more

Re: Getting error while executing full import

Shawn Heisey-2
On 4/10/2017 3:47 AM, ankur.168 wrote:
> Hi All,I am trying to use solr with 2 cores interacting with 2 different
> databases, one core is executing full-import successfully where as when I am
> running for 2nd one it is throwing table or view not found exception. If I
> am using the query directly It is running fine. Below is the error meassge I
> am getting.Kindly help me, not able to understand what could be the issue
> here.I am using solr 6.4.1.
<snip>
> java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist at
> oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:447) at
>

You didn't include your dataimport config.  You'll probably need to
redact password information from it before you send it.

If you go to the Logging tab of the admin UI and change the level of the
JdbcDataSource class to DEBUG, then you will find the actual SQL Solr is
sending to the database in the solr.log file when you do another
import.  These logs will not show up in the Logging tab -- you will need
to find the actual logfile on disk.
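
If you want that level to survive restarts (the Logging tab only changes it for the running process), Solr 6.x reads its logging configuration from server/resources/log4j.properties; a line like the following should do it (the path and file name may differ depending on how your install is laid out):

```
# log the actual SQL that DIH's JDBC layer sends to the database
log4j.logger.org.apache.solr.handler.dataimport.JdbcDataSource=DEBUG
```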

Thanks,
Shawn


Re: Getting error while executing full import

ankur.168
Thanks for replying Shawn,

There was an issue with the db connection url, silly mistake.

I am facing another problem; I don't know if I should post it in the same thread or as a new post. Anyway, I am posting here for now; let me know if it should go in a new thread.

I am using DIH, as you know. I have property_id as the unique key, with 1 parent and 14-15 child entities (I am trying to improve performance for a fairly old system, so I can't avoid or reduce that many children).
We have around 2.5 lakh (250,000) IDs in the DB, so a full import is becoming nearly impossible here. I tried splitting this into multiple document files within the same core and added a second data import handler as well, but when I run the import on both URLs, the latest import overrides the previous one, so I cannot get the complete data.

So I have 2 questions here.

1. Is there a better way of doing indexing and import than the way I am doing it right now?
2. If not, then how can I make the full import faster here?

--Ankur

Re: Getting error while executing full import

Erick Erickson
I usually prefer SolrJ; here's an article explaining why, with sample code.

https://lucidworks.com/2012/02/14/indexing-with-solrj/

You can take the Tika bits out; they're unnecessary for importing from a DB.

Best,
Erick


Re: Getting error while executing full import

Mikhail Khludnev-2
In reply to this post by ankur.168
Hello, Ankur,

You probably can set clean=false&commit=false and do clean&commit in the
controlling code. See
https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler
One of DIH's thrilling features is merging sorted children and parents with
join="zipper".
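
Conceptually, join="zipper" is a merge-join over two result sets that are both sorted by the join key: one child query is executed once and advanced in step with the parents, instead of one child query per parent. The sketch below is a stand-in illustration in plain Java, not DIH's actual code; it also shows that several child rows per parent are fine, as long as both streams stay in key order.

```java
import java.util.*;

/** Merge-join ("zipper") of a parent stream and a child stream that are
 *  both sorted by the same key. The child cursor only ever moves forward,
 *  so each side is scanned exactly once. */
public class ZipperSketch {
    public static List<String> zip(List<String> parentIds,
                                   List<Map.Entry<String, String>> children) {
        List<String> docs = new ArrayList<>();
        int c = 0; // child cursor; never moves backwards
        for (String pid : parentIds) {
            StringBuilder doc = new StringBuilder(pid);
            // skip any children whose key sorts before this parent
            while (c < children.size() && children.get(c).getKey().compareTo(pid) < 0) c++;
            // attach every child row sharing this parent's key
            while (c < children.size() && children.get(c).getKey().equals(pid)) {
                doc.append(':').append(children.get(c).getValue());
                c++;
            }
            docs.add(doc.toString());
        }
        return docs;
    }

    public static void main(String[] args) {
        List<String> parents = List.of("P1", "P2", "P3");
        List<Map.Entry<String, String>> children = List.of(
            Map.entry("P1", "a"), Map.entry("P1", "b"), Map.entry("P3", "c"));
        System.out.println(zip(parents, children)); // [P1:a:b, P2, P3:c]
    }
}
```

If either stream arrives out of key order, a merge-join like this cannot match rows correctly, which is why DIH insists on strictly increasing keys.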




--
Sincerely yours
Mikhail Khludnev

Re: Getting error while executing full import

ankur.168
In reply to this post by Erick Erickson
Hi Erick,

Thanks for replying. As you suggest, I can use SolrJ to map RDBMS-fetched data and index/search it later on, but DIH gives multi-DB connections for full import and other benefits.
Does SolrJ support this, or do we need to put in the effort to build a multithreaded connection pool similar to DIH's?

Re: Getting error while executing full import

ankur.168
In reply to this post by Mikhail Khludnev-2
Hi Mikhail,

Thanks for replying,

I am currently trying to use the zipper join but am getting a NullPointerException; the stack trace is below.

2017-04-18 09:11:51.154 INFO  (qtp1348949648-13) [   x:sample_content] o.a.s.u.p.LogUpdateProcessorFactory [sample_content]  webapp=/solr path=/dataimport params={debug=true&indent=on&commit=true&start=0&clean=true&rows=10&command=full-import&verbose=false&core=sample_content&optimize=false&name=dataimport&wt=json&_=1492506703156}{deleteByQuery=*:* (-1565006716610805760)} 0 615
2017-04-18 09:11:51.173 ERROR (qtp1348949648-13) [   x:sample_content] o.a.s.h.d.DataImporter Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:270)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:475)
        at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:180)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:2306)
        at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:658)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.server.Server.handle(Server.java:534)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
        at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:416)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
        ... 34 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
        at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:61)
        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:247)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:475)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:516)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
        ... 36 more
Caused by: java.lang.NullPointerException
        at org.apache.solr.handler.dataimport.Zipper.supplyNextChild(Zipper.java:73)
        at org.apache.solr.handler.dataimport.EntityProcessorBase.getNext(EntityProcessorBase.java:127)
        at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:75)
        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:244)
        ... 39 more

Here is my data-config file entry:

<document name="sampleProperties">
        <entity name="propertiesList" processor="SqlEntityProcessor" query="SELECT PROPERTY_ID FROM property order by PROPERTY_ID">
               
                <field column="PROPERTY_ID" name="propertyId" />

                <entity name="description" processor="SqlEntityProcessor" transformer="TemplateTransformer" where="pt.TEXT_ID = t.text_id AND pt.PROPERTY_ID = '${propertiesList.PROPERTY_ID}' AND t.text_id in (SELECT TEXT_ID FROM TEXT) AND pt.DIST_CHANNEL_ID = 'X' AND pt.TEXT_TYPE != 'ACCOM' AND t.TEXT_TYPE != 'ACCOM'" query="SELECT pt.text_format_id, t.text_id, pt.TEXT FROM property_text pt, text t ORDER BY pt.PROPERTY_id" join="zipper">       
                        <field column="TEXT_ID" name="textId" />
                        <field column="TEXT" name="freeText" />
                        <field column="custom_type" name="customType" template="shortDescription"/>
                </entity>
        </entity>
</document>

Re: Getting error while executing full import

ankur.168
In reply to this post by Mikhail Khludnev-2
Hi Mikhail,

I tried with the simplest possible zipper entity. Here are the config details:

<document name="sampleProperties">
        <entity name="propertiesList" processor="SqlEntityProcessor" query="SELECT PROPERTY_ID FROM property order by PROPERTY_ID">
               
                <field column="PROPERTY_ID" name="propertyId" />

                <entity name="description" processor="SqlEntityProcessor" where="PROPERTY_ID='${propertiesList.PROPERTY_ID}'" query="SELECT * FROM property_text ORDER BY PROPERTY_ID"
                join="zipper">       
                                <field column="TEXT" name="freeText" />
                        </entity>
        </entity>
</document>

Here the child entity has multiple records for a given property ID, which is why I believe the full import is failing. I have added the new logs below. Is there a way Zipper supports merging multiple records?

Caused by: java.lang.IllegalArgumentException: expect strictly increasing primary keys for Relation PROPERTY_ID='${propertiesList.PROPERTY_ID}' got: ,
        at org.apache.solr.handler.dataimport.Zipper.onNewParent(Zipper.java:108)

Re: Getting error while executing full import

Mikhail Khludnev-2
Hello,

Shouldn't it just be where="PROPERTY_ID=PROPERTY_ID'", since the fields are
named the same in both tables?




--
Sincerely yours
Mikhail Khludnev

Re: Getting error while executing full import

ankur.168
Yes, both column names are the same. But if we just use property_id=property_id in the child entity, then how does zipper know which child document to merge with which parent?

Anyhow, I just tried your suggested where condition, which results in an ArrayIndexOutOfBoundsException; here are the logs:

Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.ArrayIndexOutOfBoundsException: -1
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:561)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
        ... 36 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
        at org.apache.solr.handler.dataimport.VariableResolver.resolve(VariableResolver.java:110)
        at org.apache.solr.handler.dataimport.ContextImpl.resolve(ContextImpl.java:250)
        at org.apache.solr.handler.dataimport.Zipper.onNewParent(Zipper.java:106)
        at org.apache.solr.handler.dataimport.EntityProcessorBase.init(EntityProcessorBase.java:63)
        at org.apache.solr.handler.dataimport.SqlEntityProcessor.init(SqlEntityProcessor.java:52)
        at org.apache.solr.handler.dataimport.EntityProcessorWrapper.init(EntityProcessorWrapper.java:75)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:433)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:516)
        ... 37 more

Thanks,
--Ankur

Re: Getting error while executing full import

Shawn Heisey-2
In reply to this post by ankur.168
On 4/18/2017 12:58 AM, ankur.168 wrote:
> Hi Erick,
>
> Thanks for replying, As you suggest I can use solrJ to map RDBMS fetched
> data and index/search it later on. but DIH gives multi db connection for
> full import and other benefits.
> Does solrJ supports this or we need to put efforts to make a multithreaded
> connection pool similar to DIH?

Each DIH handler is single-threaded.  I have no idea what you are
talking about when you mention multi-threaded in conjunction with DIH.
You can have multiple handlers, and execute all of them in parallel, but
each one can only execute one import at a time, and that import will
only run with one thread.  Within the limitations caused by running
single-threaded, DIH is quite efficient.

With SolrJ, you can do anything the Java language permits you to do,
including very efficient handling of multiple threads, using multiple
databases, and pretty much anything else you can dream up ... but you
must write all the code.

Safety and good performance in a multi-threaded Java program is an art
form.  I hesitate to say that it's *difficult*, but it does create
challenges that may not be trivial to overcome.

The various SolrClient objects from SolrJ are thread-safe, although you
typically must create a custom HttpClient object to use for SolrClient
creation if you're going to be using a lot of threads, because
HttpClient defaults are to only allow a very minimal number of threads.
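
The multi-threaded pattern Shawn describes can be sketched as a simple producer/consumer: pre-build batches of rows, then let a fixed pool of workers drain them through one shared, thread-safe client. In the sketch below, `FakeClient` is a hypothetical stand-in for a real shared `SolrClient`; an actual implementation would build the client on a custom HttpClient with larger connection limits, as Shawn notes, and the batches would come from JDBC rather than a counter.

```java
import java.util.*;
import java.util.concurrent.*;

/** Producer/consumer indexing sketch: batches are queued up front, then
 *  a fixed thread pool drains the queue through one shared client. */
public class ParallelIndexer {
    /** Thread-safe stand-in for a shared SolrClient (hypothetical). */
    static class FakeClient {
        final Set<Integer> indexed = ConcurrentHashMap.newKeySet();
        void add(List<Integer> batch) { indexed.addAll(batch); }
    }

    public static FakeClient index(int totalDocs, int batchSize, int threads)
            throws InterruptedException {
        FakeClient client = new FakeClient();
        // producer: split the id range into batches (a real producer
        // would page through a JDBC ResultSet instead)
        BlockingQueue<List<Integer>> queue = new LinkedBlockingQueue<>();
        for (int start = 0; start < totalDocs; start += batchSize) {
            List<Integer> batch = new ArrayList<>();
            for (int i = start; i < Math.min(start + batchSize, totalDocs); i++) batch.add(i);
            queue.add(batch);
        }
        // consumers: each worker pulls batches until the queue is empty
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                List<Integer> batch;
                while ((batch = queue.poll()) != null) client.add(batch);
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return client;
    }

    public static void main(String[] args) throws InterruptedException {
        FakeClient c = index(250_000, 1000, 4); // ~2.5 lakh ids, as in this thread
        System.out.println(c.indexed.size()); // 250000
    }
}
```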

I've got no experience at all with parent/child documents, so this part
of your troubles is a mystery to me.

Thanks,
Shawn


Re: Getting error while executing full import

Mikhail Khludnev-2
In reply to this post by ankur.168
Ok. I've checked AbstractSqlEntityProcessorTestCase.
Please make the next attempt with

where="PROPERTY_ID=propertiesList.PROPERTY_ID"
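
Folding that suggestion into the simplified config posted earlier in the thread, the child entity would look roughly like this. This is a sketch only (the thread does not confirm it works), and note that zipper also requires both the parent and child queries to be ordered by the join key:

```xml
<entity name="description" processor="SqlEntityProcessor" join="zipper"
        where="PROPERTY_ID=propertiesList.PROPERTY_ID"
        query="SELECT * FROM property_text ORDER BY PROPERTY_ID">
    <field column="TEXT" name="freeText"/>
</entity>
```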





--
Sincerely yours
Mikhail Khludnev

Re: Getting error while executing full import

ankur.168
In reply to this post by Shawn Heisey-2
Thanks for enlightening me, Shawn :)

I thought DIH did parallel DB requests for all the entities defined in a document.

I do believe that DIH is easier to use, which is why I am trying to find a way to use it in my current system. But as I explained above, since I have so many sub-entities, each returning a list of responses that is joined into the parent, a full import for more than 2 lakh documents takes forever.

What I am looking for is a way to speed up my full import using DIH only. To achieve this I tried to split the document in two and run the full imports in parallel, but with this approach the latest import overrides the other document's indexed data, since the unique key (property_id) is the same for both documents.

One way I can think of is to keep the documents in different cores, which would maintain separate index files, and merge the search results from both cores when searching the indexed data. But is this a good approach?

Re: Getting error while executing full import

Shawn Heisey-2
On 4/18/2017 11:21 PM, ankur.168 wrote:
> I thought DIH does parallel db request for all the entities defined in a
> document.

I do not know anything about that.  It *could* be possible for all the
sub-entities just below another entity to run in parallel, but I've got
no idea whether this is the case.  At the top level, there is only one
thread handling documents one at a time, this I am sure of.

> I do believe that DIH is easier to use that's why I am trying to find a way to use this in my current system. But as I explained above since I have so many sub entities,each returns list of response which will be joined in to parent. for more than 2 lacs document, full import is taking forever.
>
> What I am looking for is a way to speed up my full import using DIH only. To achieve this I tried to split the document in 2 and do full import parallely. but with this approach latest import overrides other document indexed data, since unique key(property_id) is same for both documents.

The way to achieve top speed with DIH is to *not* define nested
entities.  Only define one entity with a single SELECT statement.  Let
the database handle all the JOIN work.  In my DIH config, I do "SELECT *
FROM X WHERE Y" ... X is a view defined on the database server that
handles all the JOINs, and Y is a fairly detailed conditional.
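
A minimal sketch of that flat approach (PROPERTY_SEARCH_VIEW and the ACTIVE condition are hypothetical names; the real view and conditional live on the database server):

```xml
<document>
  <!-- single flat entity: the database view does all the parent/child JOINs -->
  <entity name="propertiesList"
          query="SELECT * FROM PROPERTY_SEARCH_VIEW WHERE ACTIVE = 'Y'">
    <field column="PROPERTY_ID" name="propertyId"/>
    <field column="TEXT" name="freeText"/>
  </entity>
</document>
```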

> One way I could think of is to keep document in different core which will maintain different index files and merge the search results from both cores while performing search on indexed data. But is this a good approach?

In order to do a sharded query, the uniqueKey field would need to be
unique across all cores.  My index is sharded manually, each shard does
a separate import when fully rebuilding the index.  The sharding
algorithm is coded into the SQL statement.
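
One common way to code a sharding algorithm into the SQL statement (an illustrative guess, since Shawn's actual statement is not shown) is a modulus on the uniqueKey, with each core's import using a different remainder:

```sql
-- shard 0 of 2: this core only imports property_ids with remainder 0
-- (view/table name is hypothetical)
SELECT * FROM PROPERTY_SEARCH_VIEW WHERE MOD(PROPERTY_ID, 2) = 0
```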

Thanks,
Shawn
