[jira] [Comment Edited] (SOLR-13320) add a param ignoreDuplicates=true to updates to not overwrite existing docs

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794015#comment-16794015 ]

Scott Blum edited comment on SOLR-13320 at 3/15/19 11:19 PM:
-------------------------------------------------------------

[~shalinmangar] lemme break this down a bit...

Imagine you're restoring a collection from a backup, but you want to be able to accept writes while this is in progress.  You start accepting writes (of new data) on the new, empty collection, then in the background you want to backfill from your backup copy, but you don't want to overwrite anything that has been written recently.

Setting "_version_:-1" on all the incoming backfill docs is almost what you want: add any documents that don't exist, but don't overwrite any documents that do exist.  The problem is that the entire batch gets rejected if even one document already exists.  We just want a way to ignore conflicts and quietly drop the offending documents rather than rejecting the entire batch.

"ignoreConflicts" might be a better name.
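
To make the difference concrete, here is a toy in-memory sketch of the two behaviors. It is not Solr code: the names add_batch, VersionConflict and ignore_conflicts are made up for illustration, the index is a plain dict keyed by id, and the whole-batch rejection is modeled the way it is described above. The only Solr behavior assumed is that a negative _version_ means "this document must not already exist".

```python
class VersionConflict(Exception):
    """Raised when a document's _version_ constraint is not met (hypothetical name)."""


def add_batch(index, docs, ignore_conflicts=False):
    """Add docs to `index` (a dict keyed by id), honoring a negative _version_.

    A negative _version_ means "the document must not already exist".
    Without ignore_conflicts, one conflict rejects the whole batch.
    With ignore_conflicts, conflicting docs are dropped and their ids returned.
    """
    accepted, dropped = [], []
    for doc in docs:
        must_not_exist = doc.get("_version_", 0) < 0
        if must_not_exist and doc["id"] in index:
            if not ignore_conflicts:
                # Nothing has been written yet, so the batch is rejected as a whole.
                raise VersionConflict(doc["id"])
            dropped.append(doc["id"])
            continue
        accepted.append(doc)

    # Write only after the whole batch has been checked.
    for doc in accepted:
        index[doc["id"]] = {k: v for k, v in doc.items() if k != "_version_"}
    return dropped


# "a" was written recently to the new collection; the backup holds "a" and "b".
index = {"a": {"id": "a", "title": "recent write"}}
backfill = [
    {"id": "a", "title": "from backup", "_version_": -1},
    {"id": "b", "title": "from backup", "_version_": -1},
]

try:
    add_batch(index, backfill)  # today: the conflict on "a" rejects "b" as well
except VersionConflict:
    pass

dropped = add_batch(index, backfill, ignore_conflicts=True)  # proposed: only "a" is dropped
```

In the second call the recent write to "a" survives, "b" is backfilled, and the caller still learns which ids were skipped.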


was (Author: dragonsinth):
Shalin lemme break this down a bit...

Imagine you're restoring a collection from a backup, but you want to be able to accept writes while this is in progress.  You start accepting writes (of new data) on the new, empty collection, then in the background you want to backfill from your backup copy, but you don't want to overwrite anything that has been written recently.

Setting "_version_:-1" on all the incoming backfill docs is almost what you want: add any documents that don't exist, but don't overwrite any documents that do exist.  The problem is that the entire batch gets rejected if even one document already exists.  We just want a way to ignore conflicts and quietly drop the offending documents rather than rejecting the entire batch.

"ignoreConflicts" might be a better name.

> add a param ignoreDuplicates=true to updates to not overwrite existing docs
> ---------------------------------------------------------------------------
>
>                 Key: SOLR-13320
>                 URL: https://issues.apache.org/jira/browse/SOLR-13320
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Major
>
> Updates should have an option to ignore duplicate documents and drop them if {{ignoreDuplicates=true}} is specified.


