[jira] [Commented] (SOLR-12368) in-place DV updates should no longer have to jump through hoops if field does not yet exist

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-12368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478222#comment-16478222 ]

Hoss Man commented on SOLR-12368:

some starting points for folks who want to look into this...

From SOLR-5944...
{quote}Addressed an issue due to which in-place updating of non-existing DVs was throwing exceptions. For this, it was needed to know which fields have already been added to the index, so that if an update is needed to non-existent DV, then we can resort to a traditional full document atomic update. This check could've been easy if access to IW.globalFieldNumberMap was possible publicly. Instead resorted to checking with the RT searcher's list of DVs, and if field not found there then getting the document from tlog (RTG) and checking if the field exists in that document.{quote}
And this bit of code added in [5375410807aecf3cc67f82ca1e9ee591f39d0ac7|https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=blobdiff;f=solr/core/src/java/org/apache/solr/update/processor/AtomicUpdateDocumentMerger.java;h=4c843ad48db3030fb389077fce48dccb16d92b80;hp=452574e427304d2822cc27bb97d8b810ebb2c582;hb=5375410;hpb=733060121dc6f5cbc1b0e0e1412e396a3241240b]...
+    // third pass: requiring checks against the actual IndexWriter due to internal DV update limitations
+    SolrCore core = cmd.getReq().getCore();
+    RefCounted<IndexWriter> holder = core.getSolrCoreState().getIndexWriter(core);
+    Set<String> fieldNamesFromIndexWriter = null;
+    Set<String> segmentSortingFields = null;
+    try {
+      IndexWriter iw = holder.get();
+      fieldNamesFromIndexWriter = iw.getFieldNames();
+      segmentSortingFields = iw.getConfig().getIndexSortFields();
+    } finally {
+      holder.decref();
+    }
+    for (String fieldName: candidateFields) {
+      if (! fieldNamesFromIndexWriter.contains(fieldName) ) {
+        return Collections.emptySet(); // if this field doesn't exist, DV update can't work
+      }
+      if (segmentSortingFields.contains(fieldName) ) {
+        return Collections.emptySet(); // if this is used for segment sorting, DV updates can't work
+      }
...and this very explicit test of this situation: [https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=blob;f=solr/core/src/test/org/apache/solr/update/TestInPlaceUpdatesStandalone.java;h=9a5031fbae7d11a1fffee66216031c8f1b8bff1d;hb=5375410#l405]
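
To make the possible simplification concrete, here is a rough, self-contained sketch using plain Java sets in place of the real IndexWriter calls. The class and method names (InPlaceUpdateCheckSketch, currentCheck, relaxedCheck) are hypothetical and do not exist in Solr. It also assumes that the method returns the candidate set when no check fails, and that the segment-sorting restriction still applies after LUCENE-8316. It is not the actual patch, only an illustration of which check might become unnecessary.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical illustration only: plain sets stand in for what the quoted code
// reads from IndexWriter.getFieldNames() and the index sort configuration.
final class InPlaceUpdateCheckSketch {

  // Mirrors the quoted "third pass": give up on an in-place update if any
  // candidate field is unknown to the IndexWriter or is used for segment sorting.
  static Set<String> currentCheck(Set<String> candidateFields,
                                  Set<String> fieldNamesFromIndexWriter,
                                  Set<String> segmentSortingFields) {
    for (String fieldName : candidateFields) {
      if (!fieldNamesFromIndexWriter.contains(fieldName)) {
        return Collections.emptySet(); // field doesn't exist yet
      }
      if (segmentSortingFields.contains(fieldName)) {
        return Collections.emptySet(); // field is used for segment sorting
      }
    }
    return candidateFields; // assumption: all candidates survive if no check fails
  }

  // Assumed simplification: if updating a not-yet-existing DV field no longer
  // fails, only the segment-sorting check would remain.
  static Set<String> relaxedCheck(Set<String> candidateFields,
                                  Set<String> segmentSortingFields) {
    for (String fieldName : candidateFields) {
      if (segmentSortingFields.contains(fieldName)) {
        return Collections.emptySet();
      }
    }
    return candidateFields;
  }

  public static void main(String[] args) {
    Set<String> candidates = Set.of("popularity"); // field not yet in the index
    Set<String> known = Set.of("id", "_version_");
    Set<String> sortFields = Set.of();
    System.out.println(currentCheck(candidates, known, sortFields)); // prints []
    System.out.println(relaxedCheck(candidates, sortFields));        // prints [popularity]
  }
}
```

With the existing check, an update to a field no document has a value for yet falls back to a full atomic update (empty set); under the assumed relaxed check the same field would stay eligible for an in-place update.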

> in-place DV updates should no longer have to jump through hoops if field does not yet exist
> -------------------------------------------------------------------------------------------
>                 Key: SOLR-12368
>                 URL: https://issues.apache.org/jira/browse/SOLR-12368
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Hoss Man
>            Priority: Major
> When SOLR-5944 first added "in-place" DocValue updates to Solr, one of the edge cases that had to be dealt with was the limitation imposed by IndexWriter that docValues could only be updated if they already existed - if a shard did not yet have a document with a value in the field where the update was attempted, we would get an error.
> LUCENE-8316 seems to have removed this error, which I believe means we can simplify & speed up some of the checks in Solr, and support this situation as well, rather than falling back on a full "read stored fields & reindex" atomic update.
