questions about autocommit & committing documents

Andy-152
In the example solrconfig.xml that comes with Solr, the autocommit section:

<autoCommit>
  <maxDocs>10000</maxDocs>
  <maxTime>1000</maxTime>
</autoCommit>

has been commented out.

- With <autoCommit> commented out, does it mean that every new document indexed to Solr is auto-committed individually? Or that documents are not auto-committed at all?

- If I enable <autoCommit> and set <maxDocs> to 10000, does it mean that my new documents won't be available for searching until 10,000 new documents have been added?

- When I add a new document to Solr, do I need to call commit explicitly? If so, how do I do that?
I looked at the Solr tutorial (http://lucene.apache.org/solr/tutorial.html); the command used to index documents (java -jar post.jar solr.xml monitor.xml) doesn't include any explicit call to commit the documents, so I'm not sure if it's necessary.

Thanks






     
Re: questions about autocommit & committing documents

MitchK
Hi Andy,

Andy-152 wrote:
> <autoCommit>
>   <maxDocs>10000</maxDocs>
>   <maxTime>1000</maxTime>
> </autoCommit>
>
> has been commented out.
>
> - With <autoCommit> commented out, does it mean that every new document indexed to Solr is being auto-committed individually? Or that they are not being auto-committed at all?

I am not sure whether there is a default value, but if there is not, commenting it out would mean that you have to send a commit explicitly.

> - If I enable <autoCommit> and set <maxDocs> at 10000, does it mean that my new documents won't be available for searching until 10,000 new documents have been added?

Yes, that's correct. However, you can still commit explicitly if you want to.

> - When I add a new document to Solr, do I need to call commit explicitly? If so, how do I do that?
> I look at the Solr tutorial ( http://lucene.apache.org/solr/tutorial.html), the command used to index documents (java -jar post.jar solr.xml monitor.xml) doesn't include any explicit call to commit the documents. So I'm not sure if it's necessary.
>
> Thanks

Committing is necessary: an added document is not visible at query time until it has been committed.

Kind regards,
Mitch

Re: questions about autocommit & committing documents

Andy-152
Thanks Mitch.

How do I do an explicit commit?

Andy

--- On Sun, 9/26/10, MitchK <[hidden email]> wrote:





Re: questions about autocommit & committing documents

MitchK
First: usually you do not use post.jar to update your index. It's a simple tool; normally you use features like the CSV or XML update request handlers.

Have a look at "UpdateCSV" and "UpdateXmlMessages" in the wiki.
There you can find examples of how to commit explicitly.

With post.jar you need to either set -Dcommit=yes or append "<commit/>", I think.
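
As a rough illustration of an explicit commit, the sketch below posts a "<commit/>" message to the XML update handler over plain HTTP. It is only a sketch under assumptions: the class name ExplicitCommit is made up, the update URL must be supplied by you (http://localhost:8983/solr/update is the usual default for the example server, but check your own setup), and it uses the java.net.http client from Java 11, which is newer than this thread.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ExplicitCommit {
    // The XML update message that asks Solr to commit pending documents.
    public static final String COMMIT_BODY = "<commit/>";

    // Builds (but does not send) a POST request carrying the commit message.
    // updateUrl is an assumption: pass the URL of your own update handler.
    public static HttpRequest buildCommitRequest(String updateUrl) {
        return HttpRequest.newBuilder(URI.create(updateUrl))
                .header("Content-Type", "text/xml")
                .POST(HttpRequest.BodyPublishers.ofString(COMMIT_BODY))
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java ExplicitCommit <update-url>");
            return;
        }
        // Send the commit and report the HTTP status returned by the server.
        HttpRequest request = buildCommitRequest(args[0]);
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("HTTP " + response.statusCode());
    }
}
```

Run it with the update URL as the only argument, after the documents have been posted.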

Hope this helps.

- Mitch

Re: questions about autocommit & committing documents

darul
In reply to this post by MitchK
This is an old thread, but I am trying to configure autocommit.

I am still not sure I understand how Solr handles the commit process.

If both parameters are set:

<autoCommit>
  <maxDocs>10000</maxDocs>
  <maxTime>1000</maxTime>
</autoCommit>

does Solr really wait for 10000 documents before committing?

I thought it would use maxTime and commit however many documents are pending, even if that is fewer than 10000.

Could you please correct the following scenario:
- 20 documents are added.
- Once maxTime is reached, the 20 documents are committed, because there are fewer than 10000?
- 20000 documents are added.
- Once maxTime is reached, only the first 10000 documents are committed. The next 10000 are committed on the next iteration of the commit phase.

Is this the right way to understand the maxTime and maxDocs parameters?

Thanks,

> - If I enable <autoCommit> and set <maxDocs> at 10000, does it mean that my new documents won't be available for searching until 10,000 new documents have been added?
> Yes, that's correct. However, you can do a commit explicitly, if you want to do so.


Re: questions about autocommit & committing documents

darul
Could someone explain the different use cases when both, or only one, of the autoCommit parameters are set?

I really need to understand this.

For example, with these configurations:

<autoCommit> 
  <maxDocs>10000</maxDocs> 
</autoCommit>

or

<autoCommit> 
  <maxTime>1000</maxTime> 
</autoCommit>

or

<autoCommit> 
  <maxDocs>10000</maxDocs>
  <maxTime>1000</maxTime> 
</autoCommit>

Thanks to everyone

Re: questions about autocommit & committing documents

Erick Erickson
A full commit of all pending documents is performed whenever
the first trigger is reached.

So, say maxDocs = 1000 and maxTime = 1 minute (60000, since maxTime is in milliseconds).

Index a packet with 999 docs. Index another packet with
50 documents immediately after. One commit of 1049 documents
happens....

Index a packet of 999 docs. Do nothing for a minute. One commit of
999 docs happens because of maxtime...
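
The two cases above can be replayed with a toy model of the "first trigger wins" rule. This is only a sketch and not Solr's actual implementation: the class AutoCommitModel and its methods are hypothetical, time is passed in by hand as milliseconds, and the maxDocs check is done once per packet rather than once per document.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of autocommit triggers; NOT how Solr implements them. */
public class AutoCommitModel {
    private final int maxDocs;       // commit once this many docs are pending
    private final long maxTimeMs;    // commit once the oldest pending doc is this old
    private int pending = 0;         // docs added since the last commit
    private long oldestPendingAt = -1;
    private final List<Integer> commits = new ArrayList<>(); // size of each commit

    public AutoCommitModel(int maxDocs, long maxTimeMs) {
        this.maxDocs = maxDocs;
        this.maxTimeMs = maxTimeMs;
    }

    /** Adds a packet of docs at time nowMs, then checks the maxDocs trigger. */
    public void addPacket(int docs, long nowMs) {
        tick(nowMs); // the time trigger may already have fired before this packet
        if (pending == 0) {
            oldestPendingAt = nowMs;
        }
        pending += docs;
        if (pending >= maxDocs) {
            commit();
        }
    }

    /** Checks the maxTime trigger at time nowMs. */
    public void tick(long nowMs) {
        if (pending > 0 && nowMs - oldestPendingAt >= maxTimeMs) {
            commit();
        }
    }

    /** Commits everything that is pending, whichever trigger fired. */
    private void commit() {
        commits.add(pending);
        pending = 0;
        oldestPendingAt = -1;
    }

    public List<Integer> commits() {
        return commits;
    }

    public static void main(String[] args) {
        AutoCommitModel a = new AutoCommitModel(1000, 60_000);
        a.addPacket(999, 0);
        a.addPacket(50, 1);
        System.out.println(a.commits()); // [1049]

        AutoCommitModel b = new AutoCommitModel(1000, 60_000);
        b.addPacket(999, 0);
        b.tick(60_000);
        System.out.println(b.commits()); // [999]
    }
}
```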

But I have to ask, "why do you care"? What high level problem
are you trying to handle?

Best
Erick

On Sun, Oct 23, 2011 at 3:03 PM, darul <[hidden email]> wrote:


Re: questions about autocommit & committing documents

darul
Well, until now I have been using the SolrJ API to call commit() after each document added, but I wonder whether, for a production deployment, it would be better to use the autoCommit feature instead.

With autoCommit configured, is it mandatory to remove the commit() call on CommonsHttpSolrServer?

try
{
   getServer().addBean(o);
   getServer().commit(); // => to remove?
   ...
}

I have one more question. I have looked through the threads but have not found a way to get a callback (a CallbackHandler) listing all the documents that were committed. Is there a simple way to achieve this?

Thanks again Erick.


Re: questions about autocommit & committing documents

Mark Miller-3
It's not 'mandatory', but it makes no sense to keep it. Even without autocommit, committing after every doc add is horribly inefficient.
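
To make the point concrete, here is a minimal sketch of adding a whole batch and committing once at the end. Everything in it is an assumption for illustration: IndexClient is a hypothetical stand-in for the SolrJ server object (the real addBean and commit calls throw checked exceptions, which are left out here), and CountingClient is a fake that only counts calls. With autocommit enabled you would drop the final commit() as well.

```java
import java.util.List;

public class BatchIndexer {
    /** Hypothetical stand-in for the SolrJ server used in this thread. */
    public interface IndexClient {
        void addBean(Object bean);
        void commit();
    }

    /** Adds every bean, then commits once instead of once per document. */
    public static void indexAll(IndexClient client, List<?> beans) {
        for (Object bean : beans) {
            client.addBean(bean);
        }
        if (!beans.isEmpty()) {
            client.commit();
        }
    }

    /** Fake client that only counts calls, to show how many commits are issued. */
    public static class CountingClient implements IndexClient {
        public int adds = 0;
        public int commits = 0;

        @Override
        public void addBean(Object bean) {
            adds++;
        }

        @Override
        public void commit() {
            commits++;
        }
    }

    public static void main(String[] args) {
        CountingClient client = new CountingClient();
        indexAll(client, List.of("a", "b", "c"));
        System.out.println(client.adds + " adds, " + client.commits + " commit(s)");
        // prints "3 adds, 1 commit(s)"
    }
}
```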

On Oct 25, 2011, at 9:45 AM, darul wrote:


- Mark Miller
lucidimagination.com












Re: questions about autocommit & committing documents

darul
I was not sure, thank you.

Re: questions about autocommit & committing documents

Erick Erickson
Not sure what you mean by a callback, can you clarify? You don't get
anything except the return from the add call as far as I know...

Best
Erick

On Tue, Oct 25, 2011 at 4:15 AM, darul <[hidden email]> wrote:

Re: questions about autocommit & committing documents

darul
When sending documents with the SolrJ HTTP API, I am never sure at the end that the documents have actually been indexed.

I would like to store them somewhere and resend them if the commit fails.

If a commit occurs every 10 minutes, for example, and 100 documents are waiting to be committed when the server crashes or stops, those 100 documents will never be indexed, because my business logic won't send them again.

So I would like to create a (cron) job that looks in a table, or somewhere similar, for documents that may not have been indexed because of a problem during the commit process.
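
One way to picture that bookkeeping is the sketch below. It is a guess at a possible design, not something Solr or SolrJ provides: the class PendingDocTracker and its methods are hypothetical, the ids are kept in memory where a real job would use the table mentioned above, and it assumes the client issues the commit itself and can see whether it succeeded (with server-side autocommit the client is not told when a commit happens).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical client-side record of documents sent but not yet committed. */
public class PendingDocTracker {
    // Ids of documents sent since the last successful commit, in send order.
    private final Set<String> uncommitted = new LinkedHashSet<>();

    /** Call after a document has been sent to the server. */
    public synchronized void recordSent(String docId) {
        uncommitted.add(docId);
    }

    /** Call only after an explicit commit has returned successfully. */
    public synchronized void recordCommitSucceeded() {
        uncommitted.clear();
    }

    /** What a periodic job would resend: everything not covered by a commit. */
    public synchronized List<String> docsToResend() {
        return new ArrayList<>(uncommitted);
    }

    public static void main(String[] args) {
        PendingDocTracker tracker = new PendingDocTracker();
        tracker.recordSent("doc-1");
        tracker.recordSent("doc-2");
        tracker.recordCommitSucceeded();  // doc-1 and doc-2 are now safe
        tracker.recordSent("doc-3");      // sent, but the server stops before a commit
        System.out.println(tracker.docsToResend()); // [doc-3]
    }
}
```

Clearing the whole set on commit is a simplification: a document sent while a commit is in flight could be cleared without having been covered by it.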