Unique key constraint and optimistic locking (versioning)

Per Steffensen
Hi

Does solr/lucene provide any mechanism for "unique key constraint" and
"optimistic locking (versioning)"?
Unique key constraint: a client must not succeed in creating a new
document in solr/lucene if a document already exists with the same
value in some field (e.g. an id field). Of course implemented correctly,
so that even when two or more threads concurrently try to create a new
document with the same value in this field, only one of them succeeds.
Optimistic locking (versioning): a client only succeeds in updating a
document if the updated document is based on the version of the document
currently stored in solr/lucene. It is optimistic in the sense that
clients, when updating, have to tell which version of the document they
fetched from Solr and used as the starting point for their update. So
basically a version field on the document that clients increase by one
before sending to Solr for update, and some code in Solr that only lets
the update succeed if the version number of the updated document is
exactly one higher than the version number of the document already
stored. Again implemented correctly, so that even when two or more
threads concurrently try to update a document, all based on the current
version in solr/lucene, only one of them succeeds.
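
The two behaviours described above can be sketched outside Solr as a small in-memory store. This is only an illustration under stated assumptions: `VersionedStore`, `DuplicateKeyError`, `VersionConflictError` and `race` are hypothetical names, not Solr or Lucene APIs, and a single lock stands in for whatever atomicity the real implementation would need.

```python
import threading


class DuplicateKeyError(Exception):
    """Raised when create() is called for an id that already exists."""


class VersionConflictError(Exception):
    """Raised when update() is not based on the stored version."""


class VersionedStore:
    """In-memory stand-in for the index (hypothetical, NOT a Solr API)."""

    def __init__(self):
        self._docs = {}                # id -> (version, fields)
        self._lock = threading.Lock()  # makes check-and-write atomic

    def create(self, doc_id, fields):
        with self._lock:
            if doc_id in self._docs:
                raise DuplicateKeyError(doc_id)
            self._docs[doc_id] = (1, dict(fields))

    def update(self, doc_id, new_version, fields):
        with self._lock:
            stored_version, _ = self._docs[doc_id]
            # Succeed only if the client bumped the stored version by exactly one.
            if new_version != stored_version + 1:
                raise VersionConflictError(
                    f"{doc_id}: stored={stored_version}, sent={new_version}")
            self._docs[doc_id] = (new_version, dict(fields))

    def get(self, doc_id):
        version, fields = self._docs[doc_id]
        return version, dict(fields)


def race(operation, n_threads=8):
    """Run `operation` from n_threads threads; return how many succeeded."""
    results = []

    def worker():
        try:
            operation()
            results.append(True)
        except (DuplicateKeyError, VersionConflictError):
            results.append(False)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)
```

With eight threads racing on `create("doc1", ...)`, or on `update("doc1", 2, ...)` against stored version 1, exactly one caller wins because the existence or version check and the write happen under the same lock.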

Or do I have to do stuff like this myself outside solr/lucene - e.g. in
the client using Solr?

Regards, Per Steffensen

Re: Unique key constraint and optimistic locking (versioning)

Em
Hi Per,

Solr provides the so-called "UniqueKey" field.
Refer to the Wiki to learn more:
http://wiki.apache.org/solr/UniqueKey

> Optimistic locking (versioning)
... is not provided by Solr out of the box. If you add a new document
with the same UniqueKey it replaces the old one.
You have to do the versioning on your own (and keep in mind concurrent
updates).

Kind regards,
Em


Re: Unique key constraint and optimistic locking (versioning)

Per Steffensen
Thanks a lot. We will use the UniqueKey feature and build versioning
ourselves. Do you think it would be a good idea if we built a versioning
feature into Solr/Lucene instead of doing it outside, so that others can
benefit from the feature as well? I guess contributions are made
according to http://wiki.apache.org/solr/HowToContribute. Is it possible
for "outsiders" (like us) to get an SVN branch at svn.apache.org to
prepare contributions, or do we have to use our own SVN? Are there any
plans to migrate the lucene/solr codebase to Git, which would make it
easier to get a "separate area" to work on the code (making a Git fork)
and to suggest the contribution back to core lucene/solr (doing a Git
"pull request")?

Thanks!
Per Steffensen


Re: Unique key constraint and optimistic locking (versioning)

Per Steffensen
Sorry - didn't see the "Eclipse (using Git)" chapter on
http://wiki.apache.org/solr/HowToContribute. We might contribute in that
area.

Re: Unique key constraint and optimistic locking (versioning)

Per Steffensen
In reply to this post by Em
Em wrote:
> Hi Per,
>
> Solr provides the so called "UniqueKey"-field.
> Refer to the Wiki to learn more:
> http://wiki.apache.org/solr/UniqueKey
>
I believe the uniqueKey does not enforce a "unique key constraint", i.e.
it does not stop you from creating a document with an id when a document
with the same id already exists. So it is not the whole solution.


Re: Unique key constraint and optimistic locking (versioning)

Em
Hi Per,

well, Solr has no "Update" method like an RDBMS. An update is a re-insert
of the whole document. Therefore adding a document with an existing
UniqueKey marks the old document as deleted and inserts the new one.
However this is not the whole story, since this "constraint" only works
per index/SolrCore/Shard (depending on your use-case).
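
A toy model of that behaviour, assuming nothing about Solr's internals beyond what is said above: `Segment` is a hypothetical append-only list in which adding a document with an existing key marks the old entry as deleted and appends the new one. Two separate `Segment` objects know nothing about each other, which mirrors the per-index/core/shard limitation.

```python
class Segment:
    """Toy append-only index: documents are never modified in place."""

    def __init__(self, unique_key="id"):
        self.unique_key = unique_key
        self.docs = []        # append-only list of documents
        self.deleted = set()  # positions of documents marked as deleted

    def add(self, doc):
        key = doc[self.unique_key]
        # Mark every live document with the same key as deleted ...
        for pos, old in enumerate(self.docs):
            if pos not in self.deleted and old[self.unique_key] == key:
                self.deleted.add(pos)
        # ... then append the new document as a fresh entry.
        self.docs.append(dict(doc))

    def live_docs(self):
        return [d for pos, d in enumerate(self.docs) if pos not in self.deleted]


shard_a = Segment()
shard_a.add({"id": "1", "title": "first"})
shard_a.add({"id": "1", "title": "second"})   # replaces "first"

shard_b = Segment()
shard_b.add({"id": "1", "title": "elsewhere"})  # not seen by shard_a
```

Note that the second `add` never raises: from the caller's point of view there is no way to tell an insert from an overwrite.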

Does this help you?

Kind regards,
Em


Re: Unique key constraint and optimistic locking (versioning)

Per Steffensen
Em wrote:
> Hi Per,
>
> well, Solr has no "Update"-Method like a RDBMS. It is a re-insert of the
> whole document. Therefore a document with an existing UniqueKey marks
> the old document as deleted and inserts the new one.
>
Yes I understand. But it is not always what I want to achieve. I want an
error to occur if a document with the same id already exists, when my
intent is to INSERT a new document. When my intent is to UPDATE a
document in solr/lucene I want the old document already in solr/lucene
deleted and the new version of this document added (exactly as you
explain). It will not be possible for solr/lucene to decide what to do
unless I give it some information about my intent - whether it is INSERT
or UPDATE semantics I want. I guess solr/lucene always gives me INSERT
semantics when a document with the same id does not already exist, and
always gives me UPDATE semantics when a document with the same id does
exist? I cannot decide?
> However this is not the whole story, since this "constraint" only works
> per index/SolrCore/Shard (depending on your use-case).
>
Yes I know. But with the right routing strategy based on ids I will be
able to achieve what I want, if only the feature were there per
index/core/shard.
> Does this help you?
>
Yes, it helps me be sure that what I am looking for is not there.
There is no built-in way to make solr/lucene give me an error if I try
to insert a new document with an id equal to that of a document already
in the index/core/shard. The existing document will always be updated
(implemented as "old deleted and new added"). Correct?
> Kind regards,
> Em
>  
Regards, Per Steffensen


Re: Unique key constraint and optimistic locking (versioning)

Erick Erickson
Per:

Yep, you've got it. You could write a custom update handler that queried
(via TermDocs or something) for the ID when your intent was to
INSERT, but it'll have to be custom work. I suppose you could query
with a divide-and-conquer approach, that is query for
id:(1 2 58 90... all your insert IDs) and go/no-go based on whether
your return had any hits, but that supposes you have some idea
whether pre-existing documents are likely.....

But Solr doesn't have anything like you're looking for.
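
The go/no-go query described above could look roughly like this from a client. It is a sketch, not a tested integration: the base URL is an assumption, the ids are quoted as phrases (the original example leaves them bare), and the decision is made from the `numFound` count of a `rows=0` query against the standard select handler.

```python
from urllib.parse import urlencode


def build_id_query(ids, field="id"):
    """Build a query such as id:("1" "2" "58") matching any of the ids."""
    quoted = " ".join('"{}"'.format(str(i).replace('"', '\\"')) for i in ids)
    return "{}:({})".format(field, quoted)


def build_check_url(base_url, ids):
    """URL for a rows=0 query: only the hit count is needed."""
    params = {"q": build_id_query(ids), "rows": 0, "wt": "json"}
    return "{}/select?{}".format(base_url.rstrip("/"), urlencode(params))


def safe_to_insert(response):
    """Go/no-go: True only if none of the queried ids matched a document."""
    return response["response"]["numFound"] == 0
```

Two caveats apply to this approach. A normal query only sees documents that are visible to the current searcher, so recently added, uncommitted documents are missed. And check-then-insert is not atomic: another client can add the same id between the query and the insert.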

Best
Erick


Re: Unique key constraint and optimistic locking (versioning)

Em
In reply to this post by Per Steffensen
Hi Per,

> I want an error to occur if a document with the same id already
> exists, when my intent is to INSERT a new document. When my intent is
> to UPDATE a document in solr/lucene I want the old document already
> in solr/lucene deleted and the new version of this document added
> (exactly as you explain). It will not be possible for solr/lucene to
> decide what to do unless I give it some information about my intent -
> whether it is INSERT or UPDATE semantics I want. I guess solr/lucene
> always gives me INSERT semantics when a document with the same id does
> not already exist, and that it always gives me UPDATE semantics when a
> document with the same id does exist? I cannot decide?

Given that you've set a uniqueKey field and a document with that
uniqueKey already exists, Solr will delete the old one and insert the
new one. There is really no difference between the semantics - updates
do not exist.
To create a UNIQUE constraint as you know it from a database you have to
check whether a document is already in the index *or* whether it is
already pending (waiting to be flushed to the index).
Fortunately Solr manages a so-called pending set with all those
documents waiting to be flushed to disk (Solr 3.5).
I think you have to write your own DirectUpdateHandler to achieve what
you want on the Solr level, or extend Lucene's IndexWriter to do it on
the Lucene level.
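
The "in the index *or* pending" check can be illustrated with a toy handler. `ToyUpdateHandler` is a hypothetical name and does not reflect the actual DirectUpdateHandler code; it only shows why both sets have to be consulted before an insert is accepted.

```python
class DuplicateKeyError(Exception):
    """Raised when an insert would reuse a committed or pending key."""


class ToyUpdateHandler:
    """Hypothetical model of an update handler with a pending set."""

    def __init__(self):
        self.index = {}    # committed documents, id -> doc
        self.pending = {}  # added but not yet committed, id -> doc

    def is_taken(self, key):
        # A key counts as taken if it is committed OR still pending.
        return key in self.index or key in self.pending

    def insert(self, doc):
        key = doc["id"]
        if self.is_taken(key):
            raise DuplicateKeyError(key)
        self.pending[key] = dict(doc)

    def commit(self):
        # Flush everything that was pending into the committed index.
        self.index.update(self.pending)
        self.pending.clear()
```

Checking only `index` would let two inserts of the same id through as long as both arrive before the next commit. A real handler would also have to make the check and the add atomic across threads, which this sketch leaves out.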

While doing so, keep track of what is going on in the trunk and how
Near-Real-Time-Search will change the current way of handling updates.

> There is no built-in way to make solr/lucene give me an error if I
> try to insert a new document with an id equal to a document already
> in the index/core/shard. The existing document will always be updated
> (implemented as "old deleted and new added"). Correct?
Exactly.

If you really want to get your hands on that topic I suggest you
learn more about Lucene's IndexWriter:

http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/api/all/index.html?org/apache/lucene/index/IndexWriter.html

Kind Regards,
Em

Re: Unique key constraint and optimistic locking (versioning)

Per Steffensen
Em wrote:

> Given that you've set a uniqueKey-field and there already exists a
> document with that uniqueKey, it will delete the old one and insert the
> new one. There is really no difference between the semantics - updates
> do not exist.
> To create a UNIQUE-constraint as you know it from a database you have to
> check whether a document is already in the index *or* whether it is
> already pending (waiting for getting flushed to the index).
> Fortunately Solr manages a so called pending-set with all those
> documents waiting to be flushed to disk (Solr 3.5).
>
We are using the latest and greatest 4.0-SNAPSHOT code, because we want
to take advantage of the SolrCloud stuff. Can you give a code pointer to
where I can find the pending-set stuff? Does Solr use this pending set
for query responses, so that Solr delivers 100% real-time search results?
> I think you have to write your own DirectUpdateHandler to achieve what
> you want on the Solr-level or to extend Lucenes IndexWriter to do it on
> the Lucene-Level.
>
> While doing so, keep track of what is going on in the trunk and how
> Near-Real-Time-Search will change the current way of handling updates.
>  
Will do. We already use auto soft commits.


Re: Unique key constraint and optimistic locking (versioning)

Sami Siren-2
> We are using the latest and greatest 4.0-SNAPSHOT code, because we want
> to take advantage of the SolrCloud stuff. Can you give a code pointer to
> where I can find the pending-set stuff?

I am not sure if this is what you're asking but you should be able to
get the latest data from Solr by using
realtime get http://wiki.apache.org/solr/RealTimeGet

--
 Sami Siren

Re: Unique key constraint and optimistic locking (versioning)

Per Steffensen
Sami Siren wrote:

> I am not sure if this is what you're asking but you should be able to
> get the latest data from Solr by using
> realtime get http://wiki.apache.org/solr/RealTimeGet
>  
Thanks a lot! It might be very useful, if this provides 100% real time
get - that is, if it gets the latest version of the document even when
neither a soft commit nor a hard commit has been performed since the
latest version of the document was indexed. Does it do that, or does it
need a soft commit (in which case I believe it is only a near real time
get operation)?

Re: Unique key constraint and optimistic locking (versioning)

Sami Siren-2
On Fri, Feb 24, 2012 at 12:06 PM, Per Steffensen <[hidden email]> wrote:

> Thanks a lot! It might be very useful, if this provides 100% real time get -
> that is, if it gets the latest version of the document even when neither a
> soft commit nor a hard commit has been performed since the latest version
> of the document was indexed. Does it do that, or does it need a soft commit
> (in which case I believe it is only a near real time get operation)?

I believe it does not require any kind of commit to happen so it
should really be a real time get as the name suggests.
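
For reference, a client-side sketch of a realtime get call. The `/get` handler and its `id` parameter are taken from the wiki page linked above; the base URL is a placeholder, and the assumed response shape (a top-level `doc` entry that is `null` when nothing is found) should be verified against the Solr version in use.

```python
import json
from urllib.parse import urlencode


def realtime_get_url(base_url, doc_id):
    """Build a realtime-get URL for a single document id."""
    query = urlencode({"id": doc_id, "wt": "json"})
    return "{}/get?{}".format(base_url.rstrip("/"), query)


def parse_realtime_get(body):
    """Return the document, or None if the handler reported no document."""
    return json.loads(body).get("doc")
```

A client that wants INSERT semantics could call this before adding a document, with the same caveat as any check-then-add approach: the check and the add are two separate requests.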

--
 Sami Siren

Re: Unique key constraint and optimistic locking (versioning)

Em
In reply to this post by Per Steffensen
Hi Per,

> Can you give a code pointer to where I can find the pending-set stuff?
> Does Solr use this pending set for query responses, so that Solr delivers
> 100% real-time search results?
As of Solr 3.5 it can be found within the DirectUpdateHandler and
DirectUpdateHandler2-classes.
I am currently unaware of how things change in 4.0.

Kind regards,
Em

Re: Unique key constraint and optimistic locking (versioning)

Em
In reply to this post by Per Steffensen
This is a really cool feature!
Thanks for pointing us in that direction!

As the "Quick Start" says, a document needs neither a commit nor a
soft commit or anything else to be available via RealTimeGet.

However, regarding a versioning system, one always has to keep in mind
that an uncommitted document is not guaranteed to be persisted in the index.
So if you return a duplicate-key error because there is a pending
document with that key, and afterwards the server goes down for any
reason, you might end up without that document inside Solr.
You need a log for failover.

Kind regards,
Em


Re: Unique key constraint and optimistic locking (versioning)

Yonik Seeley-2-2
On Fri, Feb 24, 2012 at 6:55 AM, Em <[hidden email]> wrote:
> However, regarding a versioning-system, one always has to keep in mind
> that an uncommitted document is not guaranteed to be persisted in the index.

We now have durability via an update log.
With a recent nightly trunk build, you can send a document to solr w/o
committing, then kill -9 the JVM, then restart it and the log will be
used to recover that document (and you should be able to see it in the
index)
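
The general idea behind such a log can be sketched as a write-ahead log: every add is appended and synced to a file before it is acknowledged, and the file is replayed on startup. This is a generic illustration with hypothetical names (`LoggedStore`, `demo`), not a description of how Solr's update log is actually implemented.

```python
import json
import os
import tempfile


class LoggedStore:
    """Toy store that survives restarts by replaying an append-only log."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.docs = {}
        self._replay()

    def _replay(self):
        # On startup, rebuild in-memory state from the log, oldest entry first.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                doc = json.loads(line)
                self.docs[doc["id"]] = doc

    def add(self, doc):
        # Write and fsync the log entry BEFORE acknowledging the add.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(doc) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.docs[doc["id"]] = doc


def demo():
    """Add a document, drop the store (simulated crash), then recover."""
    path = os.path.join(tempfile.mkdtemp(), "update.log")
    store = LoggedStore(path)
    store.add({"id": "1", "title": "kept"})
    del store  # nothing was "committed"; only the log remains
    return LoggedStore(path).docs
```

Here `demo()` recovers the document purely from the log, which is the property needed before a duplicate-key error based on pending documents can be trusted.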

-Yonik
lucidimagination.com

Re: Unique key constraint and optimistic locking (versioning)

Per Steffensen
In reply to this post by Em
Em wrote:
> This is a really cool feature!
> Thanks for pointing us in that direction!
>
A feature where you can flag your "index" operation to provide "create
semantics" would be cool. When the "create-semantics" flag is set, an
"index" operation fails if a document with the same id (or whatever
you use for uniqueKey) already exists. When the flag is not set, "index"
semantics are just as they are today. ElasticSearch has this, except
that they call it "OpType", which has the possible values "create" and
"index" ("index" is default). Most other alternatives to Solr(Cloud)
provide this as well. We need it in my current project. We might build it
"outside" Solr/Lucene, but I hope to be able to convince my ProductOwner
to build it as a Solr feature and contribute it back - especially if the
Solr community agrees that it would be a nice and commonly usable
feature. I believe it is a commonly usable feature - especially "when
using Solr as a NoSQL data store and not just a search index" (as
http://wiki.apache.org/solr/RealTimeGet says)
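
A sketch of what such a flag could look like from the caller's side. The value names "create" and "index" follow the ElasticSearch naming mentioned above; `OpType`, `index_document` and `DocumentExistsError` are hypothetical, and a plain dict stands in for the index, so this is not thread-safe and not a proposal for the actual Solr API.

```python
from enum import Enum


class OpType(Enum):
    INDEX = "index"    # default: add, or replace an existing document
    CREATE = "create"  # fail if the key is already present


class DocumentExistsError(Exception):
    """Raised when CREATE is requested for a key that already exists."""


def index_document(store, doc, op_type=OpType.INDEX, unique_key="id"):
    """Add `doc` to `store`, honouring the requested operation type."""
    key = doc[unique_key]
    if op_type is OpType.CREATE and key in store:
        raise DocumentExistsError(key)
    store[key] = dict(doc)
    return store
```

Keeping INDEX as the default means existing callers see no change in behaviour; only callers that explicitly ask for CREATE can get an error.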

> As the "Quick Start" says, a document does not need a commit nor a
> soft-commit or anything else to be available via RealTimeGet.
>
> However, regarding a versioning-system, one always has to keep in mind
> that an uncommited document is not guaranteed to be persisted in the index.
> So if you give a Duplicate-Key-Error, because there is a pending
> document with that key and afterwards the server goes down for any
> reason, you might end up without that document inside of Solr.
> You need a log for failover.
>  
Yes, I know. Alternatively, you can simply not consider a data record inserted into
Solr until it has been indexed AND a hard commit has happened. You can
have many threads indexing data records into Solr without
deleting/acknowledging the source of those data records until the next
hard commit after indexing has happened. But I believe that is a separate issue,
one we also plan to deal with.
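
A rough Python sketch of that "acknowledge only after hard commit" idea, under the assumption that the source can hand over a callback for acknowledging a record. ToyIndex and DeferredAckLoader are hypothetical names and nothing here calls Solr:

```python
class ToyIndex:
    """Stand-in for an index: documents are durable only after commit()."""

    def __init__(self):
        self.pending = {}
        self.committed = {}

    def add(self, doc):
        self.pending[doc["id"]] = doc

    def commit(self):
        self.committed.update(self.pending)
        self.pending.clear()

    def crash(self):
        # Simulates a server going down: uncommitted work is lost.
        self.pending.clear()


class DeferredAckLoader:
    """Indexes records but acknowledges their source only after a hard commit."""

    def __init__(self, index, acknowledge):
        self._index = index
        self._acknowledge = acknowledge  # callback that deletes/acks a source record
        self._unacked = []

    def load(self, record_id, doc):
        self._index.add(doc)
        # Remember the source record; it is not safe to drop it yet.
        self._unacked.append(record_id)

    def hard_commit(self):
        self._index.commit()
        # Only now are the documents durable, so the sources can be released.
        done, self._unacked = self._unacked, []
        for record_id in done:
            self._acknowledge(record_id)
        return done
```

If the index crashes between load() and hard_commit(), nothing has been acknowledged, so the source still holds those records and they can be indexed again.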

Thanks everybody!

> Kind regards,
> Em
>
> On 24.02.2012 11:06, Per Steffensen wrote:
>  
>> Sami Siren wrote:
>>    
>>>>> Given that you've set a uniqueKey-field and there already exists a
>>>>> document with that uniqueKey, it will delete the old one and insert the
>>>>> new one. There is really no difference between the semantics - updates
>>>>> do not exist.
>>>>> To create a UNIQUE-constraint as you know it from a database you
>>>>> have to
>>>>> check whether a document is already in the index *or* whether it is
>>>>> already pending (waiting for getting flushed to the index).
>>>>> Fortunately Solr manages a so-called pending set with all those
>>>>> documents waiting to be flushed to disk (Solr 3.5).
>>>>>
>>>>>      
>>>>>          
>>>> We are using the latest and greatest 4.0-SNAPSHOT code, because we want to take
>>>> advantage of SolrCloud stuff. Can you give a code-pointer to where I can
>>>> find the pending-set stuff?
>>>>    
>>>>        
>>> I am not sure if this is what you're asking but you should be able to
>>> get the latest data from Solr by using
>>> realtime get http://wiki.apache.org/solr/RealTimeGet
>>>  
>>>      
>> Thanks a lot! It might be very useful if this provides a 100% real-time
>> get, that is, if it returns the latest version of the document even when
>> neither a soft commit nor a hard commit has been performed since the
>> latest version of the document was indexed. Does it do that, or does it
>> need a soft commit (in which case I believe it is only a near-real-time get
>> operation)?
>>    
>>> --
>>>  Sami Siren
>>>
>>>  
>>>      
>>    
>
>  


Re: Unique key constraint and optimistic locking (versioning)

Per Steffensen
In reply to this post by Yonik Seeley-2-2
Yonik Seeley wrote:

> On Fri, Feb 24, 2012 at 6:55 AM, Em <[hidden email]> wrote:
>  
>> However, regarding a versioning-system, one always has to keep in mind
>> that an uncommitted document is not guaranteed to be persisted in the index.
>>    
>
> We now have durability via an update log.
> With a recent nightly trunk build, you can send a document to solr w/o
> committing, then kill -9 the JVM, then restart it and the log will be
> used to recover that document (and you should be able to see it in the
> index)
>  
Cool. We have a test doing exactly that: indexing 2000 documents into
Solr, kill -9'ing Solr in the middle of the process, starting Solr again,
and checking that 2000 documents eventually become searchable. It
currently fails (red), but we are using a 4.0-SNAPSHOT from late
December. We will try to update to the newest code and see if it turns
green :-) Do you have to do something to enable this log-and-recover
feature, or does it just work out of the box? Is there any documentation?
> -Yonik
> lucidimagination.com
>
>  


Re: Unique key constraint and optimistic locking (versioning)

Yonik Seeley-2-2
On Fri, Feb 24, 2012 at 9:04 AM, Per Steffensen <[hidden email]> wrote:
> Cool. We have a test doing exactly that - indexing 2000 documents into Solr,
> kill-9'ing Solr in the middle of the process, starting Solr again and
> checking that 2000 documents will eventually be searchable. It lights red as
> it is right now, but we are using a 4.0-SNAPSHOT from late December. We will
> try to update to newest code and see if it lights green :-) Do you have to
> do something to enable this log-and-recover feature, or does it just run
> out-of-the-box? Any documentation?

Same as realtime-get, you need the update log configured.
It's currently tested in TestRecovery.
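
As a rough illustration of why one log can serve both realtime get and crash recovery, here is a toy sketch in Python. It is not Solr's implementation: ToyLoggedStore and all of its methods are hypothetical, the log is a plain JSON-lines file, and for simplicity the log is the toy's only durable storage and is never truncated.

```python
import json
import os
import tempfile


class ToyLoggedStore:
    """Toy store with an append-only update log (hypothetical, not Solr code)."""

    def __init__(self, log_path):
        self._log_path = log_path
        self._pending = {}     # added but not yet committed
        self._searchable = {}  # visible to search
        self._replay()

    def _replay(self):
        # On startup, re-apply every logged update and make it searchable.
        if not os.path.exists(self._log_path):
            return
        with open(self._log_path, encoding="utf-8") as log:
            for line in log:
                doc = json.loads(line)
                self._searchable[doc["id"]] = doc

    def add(self, doc):
        # Write to the log and force it to disk before touching memory.
        with open(self._log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(doc) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self._pending[doc["id"]] = doc

    def realtime_get(self, key):
        # Newest data wins: look at uncommitted updates first.
        if key in self._pending:
            return self._pending[key]
        return self._searchable.get(key)

    def search_ids(self):
        return sorted(self._searchable)

    def commit(self):
        self._searchable.update(self._pending)
        self._pending.clear()


def simulate_crash_and_recover():
    """Add a document without committing, drop the store, reopen it from the log."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "update.log")
        store = ToyLoggedStore(log_path)
        store.add({"id": "1", "title": "uncommitted"})
        before = (store.realtime_get("1"), store.search_ids())
        del store  # stands in for kill -9: no commit, in-memory state is gone
        recovered = ToyLoggedStore(log_path)
        after = (recovered.realtime_get("1"), recovered.search_ids())
    return before, after
```

Before the simulated crash the document is returned by realtime_get() but not by search_ids(); after reopening from the same log it shows up in both.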

-Yonik
lucidimagination.com

Re: Unique key constraint and optimistic locking (versioning)

Per Steffensen
Yonik Seeley wrote:

> On Fri, Feb 24, 2012 at 9:04 AM, Per Steffensen <[hidden email]> wrote:
>  
>> Cool. We have a test doing exactly that - indexing 2000 documents into Solr,
>> kill-9'ing Solr in the middle of the process, starting Solr again and
>> checking that 2000 documents will eventually be searchable. It lights red as
>> it is right now, but we are using a 4.0-SNAPSHOT from late December. We will
>> try to update to newest code and see if it lights green :-) Do you have to
>> do something to enable this log-and-recover feature, or does it just run
>> out-of-the-box? Any documentation?
>>    
>
> Same as realtime-get, you need the update log configured.
> It's currently tested in TestRecovery.
>  
Thanks! Are there any performance measurements comparing runs with and without the update log?
> -Yonik
> lucidimagination.com
>
>  
