Update Solr Document

Rushikesh Garadade
Hi solr-user,

I am using Solr 7.2. I am new to Solr, so please bear with me.


Let's say I have one Solr collection (Collection X) with 100 documents
(5 fields per document), and another collection (Collection Y) with
1 lakh (100,000) documents that have the same 5 fields.

If I update only one field of one document in each collection, is
"time required to update in Collection X" == "time required to update in
Collection Y"?

In other words: does Solr update an individual document irrespective of
collection size? If not, how does re-indexing work on a document update?

Thanks,
Rushikesh Garadade

Re: Update Solr Document

Emir Arnautović
Hi Rushikesh,
There is no update of documents in Solr; it is always indexing a new document into a new segment. That means the indexing operation itself is equally heavy on any collection. But that does not mean that updates will take equal time. Other activities, such as commits and segment merges, are heavier on larger indices; they can delay updates becoming visible or exhaust resources so that indexing a new document takes longer. You will usually see indexing throughput go down as index size goes up.
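
To make the "new document in a new segment" idea concrete, here is a toy model in Python. It is only a sketch of the concept, not how Lucene is implemented: the class name `ToyIndex` and its attributes are made up for illustration, and it writes one document per segment, whereas a real index buffers many documents before flushing a segment.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndex:
    """Toy append-only index (hypothetical, for illustration only)."""
    segments: list = field(default_factory=list)  # each segment is a list of docs
    deleted: set = field(default_factory=set)     # (segment_no, position) marked deleted
    live: dict = field(default_factory=dict)      # doc id -> (segment_no, position)

    def index(self, doc):
        """Add or 'update' a document; an old version is only marked deleted."""
        old = self.live.get(doc["id"])
        if old is not None:
            self.deleted.add(old)                 # nothing is rewritten in place
        self.segments.append([doc])               # the new version lands in a new segment
        self.live[doc["id"]] = (len(self.segments) - 1, 0)

    def get(self, doc_id):
        """Return the current (live) version of a document."""
        seg, pos = self.live[doc_id]
        return self.segments[seg][pos]


idx = ToyIndex()
idx.index({"id": "1", "title": "old"})
idx.index({"id": "1", "title": "new"})  # the "update" is just another index call
```

The write itself costs the same no matter how many segments already exist; what grows with index size is the later cleanup of the deleted entries, which in the real system happens during segment merges.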

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 31 May 2018, at 14:59, Rushikesh Garadade <[hidden email]> wrote:


Re: Update Solr Document

Alessandro Benedetti
In reply to this post by Rushikesh Garadade
There is no quick answer; it really depends on a lot of factors.
*TL;DR*: Updating a single document field will likely take more time in a
bigger collection.

*Partial Document Update*
First of all, which field are you updating?
Depending on its type and attributes you may end up in different
scenarios [1].
For example, an in-place update is much cheaper, as it does not end up
writing a new document to the index.
Conversely, a normal atomic update causes an internal delete and re-index of
the doc.
What happens next depends on the commit policies (or, if you have
saturated the internal RAM buffer, the content of the segment will be
flushed).
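
As a rough illustration of what such a partial update request looks like, here is a small Python sketch that builds (but does not send) an atomic update using the modifier syntax described in [1]. The host, the collection name `collection_x`, the field `popularity`, and the unique key `id` are all assumptions for the example. Whether Solr applies it as an in-place update or as a delete/re-index is decided by the schema: per [1], in-place updates need a single-valued, non-indexed, non-stored numeric docValues field.

```python
import json
import urllib.request


def build_atomic_update(base_url, collection, doc_id, field, modifier, value):
    """Build (but do not send) a POST request carrying one atomic update."""
    # Atomic update syntax: the field value is a {modifier: value} map,
    # e.g. {"set": 5} or {"inc": 1}. The unique key is assumed to be "id".
    payload = [{"id": doc_id, field: {modifier: value}}]
    return urllib.request.Request(
        url=f"{base_url}/{collection}/update",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_atomic_update(
    "http://localhost:8983/solr", "collection_x", "doc1", "popularity", "inc", 1
)
body = json.loads(req.data)
# urllib.request.urlopen(req)  # uncomment to send it to a running Solr
```

The request is identical in both cases, so the client cannot choose the cheaper path; only the field definition can.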

*Solr Commit Policies*
In Solr there is the concept of soft and hard commits.
A soft commit is cheaper: it grants visibility, warms up the caches, and does
minimal (potentially no) disk writing.
A hard commit will, in addition, flush the current segment to disk
(which brings all the background operations that Emir pointed out).
Erick's great classic [2] is a helpful read here.
*Warming the caches* will take more time in a bigger collection (as the
queries will be executed on a bigger index).
*Merging the segments* in the background, if it is triggered, will take more
time in a bigger collection.
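
For reference, a client can ask for either kind of commit through query parameters on the update request (`softCommit`, `commit`, `commitWithin`). The Python sketch below only builds the URLs; the host and the collection name `collection_y` are assumptions, and `update_url` is a helper made up for this example.

```python
from urllib.parse import urlencode


def update_url(base_url, collection, **params):
    """Return the /update URL with commit-related query parameters appended."""
    # Solr expects lowercase "true"/"false", so convert Python booleans.
    query = urlencode({
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in params.items()
    })
    url = f"{base_url}/{collection}/update"
    return f"{url}?{query}" if query else url


base = "http://localhost:8983/solr"
soft = update_url(base, "collection_y", softCommit=True)      # visibility only
hard = update_url(base, "collection_y", commit=True)          # flush to disk as well
within = update_url(base, "collection_y", commitWithin=10000)  # commit within 10 s
```

In practice, explicit commits from the client on every update are usually the expensive option on a large collection; leaving commits to the autoCommit settings in solrconfig.xml, or using `commitWithin`, lets Solr batch that work.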


[1] https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html#UpdatingPartsofDocuments-In-PlaceUpdates
[2] https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/



-----
---------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io