Does Solr support multi-version operations?

zhenyuan wei
Hi all,
    Adding a Solr document with overwrite=false keeps multiple versions of
the document in the index.
My questions are:
    1.  How do I search for only the newest document? With what options?
    2.  How do I delete the documents whose version is older than the newest one?

For example:
      [{
        "id":"1002",
        "name":["james"],
        "_version_":1611998319085617152,
        "name_str":["james"]},
      {
        "id":"1002",
        "name":["lily"],
        "_version_":1611998307815522304,
        "name_str":["lily"]},
      {
        "id":"1002",
        "name":["lucy"],
        "_version_":1611998248265842688,
        "name_str":["lucy"]}]

1. curl "http://localhost:8983/solr/collection001/query?q=*:*" returns all
   three documents. How should I query so that the response contains only
   the newest one?
2. How do I delete the documents with versions
   [1611998307815522304, 1611998248265842688], which are older than
   1611998319085617152?

Re: Does Solr support multi-version operations?

Walter Underwood
No. Solr only has one version of a document. It is not a multi-version database.

Each replica will return the newest version it has.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

> On Sep 18, 2018, at 7:11 PM, zhenyuan wei <[hidden email]> wrote:
>
> Hi all,
>    Adding a Solr document with overwrite=false keeps multiple versions of
> the document in the index.
> My questions are:
>    1.  How do I search for only the newest document? With what options?
>    2.  How do I delete the documents whose version is older than the newest one?

Re: Does Solr support multi-version operations?

Alexandre Rafalovitch
I think if you try hard enough, it is possible to get Solr to keep
multiple copies of a document where it would normally keep only the
latest version. They will just have different internal Lucene IDs.

This may of course break a lot of other things like SolrCloud and
possibly facet counts.

So, I would ask the actual business case first. It is entirely
possible that there are other ways to achieve the desired objectives.

Regards,
   Alex.

On 19 September 2018 at 00:17, Walter Underwood <[hidden email]> wrote:

> No. Solr only has one version of a document. It is not a multi-version database.
>
> Each replica will return the newest version it has.

Re: Does Solr support multi-version operations?

zhenyuan wei
Thanks for your explanations, @Alexandre Rafalovitch and @Walter Underwood.

    My use case is using Solr as an index service for some NoSQL systems. A
common requirement there is to guarantee that the index and the source data
stay consistent.
    There are two possible ways to write the source data and the index:
     1. Write the index to Solr first, then write the source data to the
        NoSQL system. If the NoSQL write fails, I want to roll back the Solr
        update. Because Solr does not support rollback, I had considered
        using multiple versions to implement this, but that turned out to
        be a disappointment.

     2. Write the source data first, then write the index to Solr. This is my
        current implementation, and I have found that it works for me.

On Wed, Sep 19, 2018 at 1:41 PM, Alexandre Rafalovitch <[hidden email]> wrote:

> I think if you try hard enough, it is possible to get Solr to keep
> multiple copies of a document where it would normally keep only the
> latest version. They will just have different internal Lucene IDs.
>
> This may of course break a lot of other things like SolrCloud and
> possibly facet counts.
>
> So, I would ask the actual business case first. It is entirely
> possible that there are other ways to achieve the desired objectives.
>
> Regards,
>    Alex.

Re: Does Solr support multi-version operations?

Walter Underwood
You are doing the right thing. Always write to the repository first, then
write to Solr. The repository is the single source of truth.

We write to the repository, then have a process that copies new items
to Solr.
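
A process that copies new items from the repository to Solr could be sketched roughly as below. This is only an illustration under stated assumptions: `InMemoryRepository`, `copy_new_items`, and the `seq` counter are hypothetical names standing in for the real data store and its change feed, and `index_batch` stands in for whatever call actually sends documents to Solr.

```python
from typing import Callable, Dict, List


class InMemoryRepository:
    """Hypothetical stand-in for the real data store.

    Every write gets an increasing sequence number so that an indexer
    can ask for "everything newer than N".
    """

    def __init__(self) -> None:
        self._rows: List[Dict] = []

    def write(self, doc: Dict) -> int:
        seq = len(self._rows) + 1
        self._rows.append({"seq": seq, "doc": doc})
        return seq

    def read_after(self, seq: int) -> List[Dict]:
        return [row for row in self._rows if row["seq"] > seq]


def copy_new_items(
    repo: InMemoryRepository,
    index_batch: Callable[[List[Dict]], None],
    checkpoint: int,
) -> int:
    """Send every record newer than `checkpoint` to the indexer.

    Returns the new checkpoint. If `index_batch` raises, the exception
    propagates and the caller keeps the old checkpoint, so the same
    records are picked up again on the next run.
    """
    rows = repo.read_after(checkpoint)
    if not rows:
        return checkpoint
    index_batch([row["doc"] for row in rows])
    return rows[-1]["seq"]


# Usage: the list `indexed` plays the role of Solr here.
repo = InMemoryRepository()
repo.write({"id": "1002", "name": ["james"]})
indexed: List[Dict] = []
checkpoint = copy_new_items(repo, indexed.extend, 0)
checkpoint_again = copy_new_items(repo, indexed.extend, checkpoint)
```

Because the repository is written first and the checkpoint only advances after a successful index call, a failed run leaves Solr behind the repository rather than ahead of it.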

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

> On Sep 19, 2018, at 3:03 AM, zhenyuan wei <[hidden email]> wrote:
>
> Thanks for your explanations, @Alexandre Rafalovitch and @Walter Underwood.
>
>    My use case is using Solr as an index service for some NoSQL systems. A
> common requirement there is to guarantee that the index and the source data
> stay consistent.
>    There are two possible ways to write the source data and the index:
>     1. Write the index to Solr first, then write the source data to the
>        NoSQL system. If the NoSQL write fails, I want to roll back the Solr
>        update. Because Solr does not support rollback, I had considered
>        using multiple versions to implement this, but that turned out to
>        be a disappointment.
>
>     2. Write the source data first, then write the index to Solr. This is my
>        current implementation, and I have found that it works for me.

Re: Does Solr support multi-version operations?

Shawn Heisey-2
In reply to this post by zhenyuan wei
On 9/18/2018 8:11 PM, zhenyuan wei wrote:
> Hi all,
>      Adding a Solr document with overwrite=false keeps multiple versions of
> the document in the index.
> My questions are:
>      1.  How do I search for only the newest document? With what options?
>      2.  How do I delete the documents whose version is older than the newest one?

When Solr is compiling results, it will only return one copy of a
particular document (based on uniqueKey value). All other copies will be
removed.

I suspect (but do not know for sure) that which document will be
returned is not defined.  On a multi-shard index, if different copies
are in different shards, which one is returned will be decided by which
shard answers the query first, or maybe which one answers last.  If
multiple copies exist in the same core, that's probably more
deterministic, but it might not be the copy you wanted.

Solr isn't designed to have multiple versions of the same uniqueKey in
the index.  Lucene itself doesn't care -- it's going to return all of
them -- but if you want to be sure which one is returned, you'd need to
write the Lucene-based software yourself instead of using Solr.
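
If the response does contain every copy, as it did in the single-core example from the original post, one possible workaround is to pick the newest copy on the client side. The following is only a sketch under that assumption. `newest_per_id`, `older_versions`, and `delete_older_payload` are hypothetical helper names, not Solr APIs. The delete body uses Solr's JSON delete-by-query form, and it assumes that `_version_` can be used in a query with your schema, which should be verified before relying on it.

```python
import json
from typing import Dict, Iterable, List, Tuple


def newest_per_id(docs: Iterable[Dict], key: str = "id") -> List[Dict]:
    """For each uniqueKey value, keep the copy with the largest _version_."""
    newest: Dict[str, Dict] = {}
    for doc in docs:
        current = newest.get(doc[key])
        if current is None or doc["_version_"] > current["_version_"]:
            newest[doc[key]] = doc
    return list(newest.values())


def older_versions(docs: Iterable[Dict], key: str = "id") -> List[Tuple[str, int]]:
    """Return (id, _version_) for every copy that is not the newest one."""
    docs = list(docs)
    keep = {d[key]: d["_version_"] for d in newest_per_id(docs, key)}
    return [(d[key], d["_version_"]) for d in docs if d["_version_"] < keep[d[key]]]


def delete_older_payload(doc_id: str, newest_version: int) -> str:
    """Build a JSON delete-by-query body for copies older than newest_version.

    Assumption: the _version_ field supports range queries in this schema.
    """
    query = f'id:"{doc_id}" AND _version_:[* TO {newest_version - 1}]'
    return json.dumps({"delete": {"query": query}})


# The three copies from the original post.
docs = [
    {"id": "1002", "name": ["james"], "_version_": 1611998319085617152},
    {"id": "1002", "name": ["lily"], "_version_": 1611998307815522304},
    {"id": "1002", "name": ["lucy"], "_version_": 1611998248265842688},
]

latest = newest_per_id(docs)
stale = older_versions(docs)
payload = delete_older_payload("1002", latest[0]["_version_"])
```

The resulting `payload` would be sent as an `application/json` POST to the collection's update handler. This does not change the advice in this thread: the index is still not designed to hold several copies of one uniqueKey.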

As you mentioned in the last message, writing to your true data store
and then writing to Solr if that succeeds is a better option.  Or you
could simply write to your data store and then have your indexing
software detect and read the new records from there.
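
The first option, writing to the data store and then to Solr only if that succeeds, might look like the sketch below. All names here (`save`, `write_store`, `write_index`, `pending`) are hypothetical; the two callables stand in for the real data store client and the real Solr client.

```python
from typing import Callable, Dict, List


def save(
    doc: Dict,
    write_store: Callable[[Dict], None],
    write_index: Callable[[Dict], None],
    pending: List[Dict],
) -> str:
    """Write to the source of truth first, then to the index.

    If the store write raises, the exception propagates and the index is
    never touched, so there is nothing to roll back in Solr. If only the
    index write fails, the document is remembered in `pending` so that it
    can be indexed later.
    """
    write_store(doc)
    try:
        write_index(doc)
    except Exception:
        pending.append(doc)
        return "stored, index pending"
    return "stored and indexed"


def failing_index(doc: Dict) -> None:
    """Simulates Solr being unreachable."""
    raise ConnectionError("Solr unavailable")


# Usage with plain lists standing in for the store and the index.
store: List[Dict] = []
index: List[Dict] = []
pending: List[Dict] = []

first = save({"id": "1002"}, store.append, index.append, pending)
second = save({"id": "1003"}, store.append, failing_index, pending)
```

After these two calls the store holds both documents, the index holds only the first, and the second is waiting in `pending`.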

Thanks,
Shawn

Re: Does Solr support multi-version operations?

zhenyuan wei
Yes, write to the true data store first, then write to Solr. I have found
that this makes eventual consistency simple to guarantee, with only the two
main failure cases below to handle:
1. If the write to the true data store fails, the client simply retries its
   request.
2. If the write to the true data store succeeds and the write to Solr fails,
   the write to Solr is retried indefinitely.
    If the write to Solr fails and the server is killed, I can replay the
transaction log of the true data store and write to Solr again.
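
The retry of the Solr write in case 2 could be sketched as follows. This is an illustration only: `index_with_retry` and `FlakySender` are hypothetical names, and the retry count is bounded here as an assumption (the description above retries indefinitely), on the idea that anything still failing can be recovered later by replaying the data store's transaction log.

```python
import time
from typing import Callable, Dict, List


def index_with_retry(
    send: Callable[[Dict], None],
    doc: Dict,
    max_attempts: int = 5,
    base_delay: float = 0.5,
) -> bool:
    """Try to send `doc` to the index, backing off between attempts.

    Returns True on success and False once `max_attempts` is exhausted.
    """
    for attempt in range(max_attempts):
        try:
            send(doc)
            return True
        except Exception:
            # Sleep only if another attempt will follow.
            if attempt + 1 < max_attempts:
                time.sleep(base_delay * (2 ** attempt))
    return False


class FlakySender:
    """Test double that fails a fixed number of times before succeeding."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0
        self.sent: List[Dict] = []

    def __call__(self, doc: Dict) -> None:
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("simulated Solr outage")
        self.sent.append(doc)


# Usage: two simulated failures, then success on the third attempt.
sender = FlakySender(failures=2)
ok = index_with_retry(sender, {"id": "1002"}, base_delay=0.0)
```

A `False` return is the signal to leave the document for the log replay path instead of blocking the writer.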

On Wed, Sep 19, 2018 at 10:38 PM, Shawn Heisey <[hidden email]> wrote:

> As you mentioned in the last message, writing to your true data store
> and then writing to Solr if that succeeds is a better option.  Or you
> could simply write to your data store and then have your indexing
> software detect and read the new records from there.