data-import run by cron job without waiting for the end of the previous one


sunnyShiny06
Hi,
Something odd is happening.

I've set up a cron job that hits the delta-import URL every 5 minutes, and it works fine.
The problem is that it doesn't seem to check every record for updating or creating a document,
because the delta-import is started again every 5 minutes, even if the previous delta-import has not finished:


<str name="status">idle</str>
<str name="importResponse"/>

<lst name="statusMessages">
<str name="Time Elapsed">0:2:23.885</str>
<str name="Total Requests made to DataSource">1</str>
<str name="Total Rows Fetched">1863146</str>
<str name="Total Documents Processed">0</str>
<str name="Total Documents Skipped">0</str>
<str name="Delta Dump started">2008-09-22 17:40:01</str>
<str name="Identifying Delta">2008-09-22 17:40:01</str>
</lst>

I wonder whether this comes from the parameters in my data-config file,
where responseBuffering is set to adaptive:


  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://master.books.com/books"
              user="solr"
              password="tah1Axie"
              batchSize="-1"
              responseBuffering="adaptive"/>

Thanks,

Re: data-import run by cron job without waiting for the end of the previous one

Shalin Shekhar Mangar
On Mon, Sep 22, 2008 at 9:19 PM, sunnyfr <[hidden email]> wrote:

>
> Hi,
> Something odd is happening.
> I've set up a cron job that hits the delta-import URL every 5 minutes, and it
> works fine.
> The problem is that it doesn't seem to check every record for updating or
> creating a document, because the delta-import is started again every 5
> minutes, even if the previous delta-import has not finished:
>

That should not be happening. Why do you feel it is starting again without
waiting for the previous import to finish?
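
If the cron job itself should guard against overlapping runs, one option is to ask the handler for its status first and only trigger the delta-import when it reports idle. The following is a minimal sketch, not a tested deployment script: the URL in `DIH_URL` is a hypothetical placeholder, and the function names are illustrative. It relies only on the `status` element shown in the output quoted in this thread and on the handler's `status` and `delta-import` commands.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical handler URL (assumption); adjust host, port, and path to your setup.
DIH_URL = "http://localhost:8983/solr/dataimport"


def import_status(response_xml: str) -> str:
    """Return the text of <str name="status"> from a handler response, or "" if absent."""
    root = ET.fromstring(response_xml)
    for node in root.iter("str"):
        if node.get("name") == "status":
            return (node.text or "").strip()
    return ""


def trigger_delta_if_idle(base_url: str = DIH_URL) -> bool:
    """Start a delta-import only when the handler reports idle.

    Returns True if a delta-import was requested, False if the tick was skipped.
    """
    with urllib.request.urlopen(base_url + "?command=status") as resp:
        status = import_status(resp.read().decode("utf-8"))
    if status != "idle":
        # A previous import still appears to be running, so skip this cron tick.
        return False
    urllib.request.urlopen(base_url + "?command=delta-import").close()
    return True
```

The cron entry would then call a small script that runs `trigger_delta_if_idle()` instead of hitting the delta-import URL directly.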


>
> <str name="status">idle</str>
> <str name="importResponse"/>
> <lst name="statusMessages">
> <str name="Time Elapsed">0:2:23.885</str>
> <str name="Total Requests made to DataSource">1</str>
> <str name="Total Rows Fetched">1863146</str>
> <str name="Total Documents Processed">0</str>
> <str name="Total Documents Skipped">0</str>
> <str name="Delta Dump started">2008-09-22 17:40:01</str>
> <str name="Identifying Delta">2008-09-22 17:40:01</str>
> </lst>
>

I'm confused by this output. How frequently do you update your database? How
many rows are modified in the database in that 5 minute period?

What is the type of the last-modified column in the database that you use
for identifying the deltas?
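
To compare these counters from one cron run to the next, the `statusMessages` block can be read into a dictionary. This is only a sketch: the `status_messages` helper is a hypothetical name, and the sample is the output quoted in this thread, shortened and wrapped in a `<response>` root element (an assumption) so that it parses on its own.

```python
import xml.etree.ElementTree as ET


def status_messages(response_xml: str) -> dict:
    """Collect the <str> entries under <lst name="statusMessages"> into a dict."""
    root = ET.fromstring(response_xml)
    for lst in root.iter("lst"):
        if lst.get("name") == "statusMessages":
            return {s.get("name"): (s.text or "") for s in lst.findall("str")}
    return {}


# Status output from the post, wrapped in a <response> root (assumption).
SAMPLE = """<response>
<str name="status">idle</str>
<lst name="statusMessages">
<str name="Total Rows Fetched">1863146</str>
<str name="Total Documents Processed">0</str>
<str name="Delta Dump started">2008-09-22 17:40:01</str>
</lst>
</response>"""

messages = status_messages(SAMPLE)
rows = int(messages["Total Rows Fetched"])
docs = int(messages["Total Documents Processed"])
print("rows fetched:", rows, "documents processed:", docs)
```

Logging these two numbers after each run would show how many rows each delta query returns and whether any documents were actually processed.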


>
> I wonder whether this comes from the parameters in my data-config file,
> where responseBuffering is set to adaptive:
>
>  <dataSource type="JdbcDataSource"
>              driver="com.mysql.jdbc.Driver"
>              url="jdbc:mysql://master.books.com/books"
>              user="solr"
>              password="tah1Axie"
>              batchSize="-1"
>              responseBuffering="adaptive"/>
>
> Thanks,
>

The responseBuffering attribute does not apply to MySQL, so you can remove
it.

--
Regards,
Shalin Shekhar Mangar.