[jira] [Commented] (LUCENE-8595) TestMixedDocValuesUpdates.testTryUpdateMultiThreaded fails

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711868#comment-16711868 ]

Adrien Grand commented on LUCENE-8595:
--------------------------------------

I found the issue: the dedup logic is broken when a value is both removed from and added to the same document in a single batch. Here is a patch:

{code:java}
diff --git a/lucene/core/src/java/org/apache/lucene/index/DocValuesFieldUpdates.java b/lucene/core/src/java/org/apache/lucene/index/DocValuesFieldUpdates.java
index 9bf9179..b0ad088 100644
--- a/lucene/core/src/java/org/apache/lucene/index/DocValuesFieldUpdates.java
+++ b/lucene/core/src/java/org/apache/lucene/index/DocValuesFieldUpdates.java
@@ -392,9 +392,13 @@ abstract class DocValuesFieldUpdates implements Accountable {
       }
       long longDoc = docs.get(idx);
       ++idx;
-      while (idx < size && docs.get(idx) == longDoc) {
+      for (; idx < size; idx++) {
         // scan forward to last update to this doc
-        ++idx;
+        final long nextLongDoc = docs.get(idx);
+        if ((longDoc >>> 1) != (nextLongDoc >>> 1)) {
+          break;
+        }
+        longDoc = nextLongDoc;
       }
       hasValue = (longDoc & HAS_VALUE_MASK) >  0;
       if (hasValue) {
{code}

We have had this bug since we introduced the ability to reset values in LUCENE-8298; recent changes just made it more visible. Until recently, you had to update the same document via two different terms for this bug to occur.
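
To illustrate the failure mode, here is a minimal, self-contained sketch rather than the actual Lucene code. It assumes, based on the `>>> 1` shift and `HAS_VALUE_MASK` check in the patch, that each entry packs the doc ID into the upper bits and a "has value" flag into the lowest bit. The class and method names (`DedupSketch`, `encode`, `scanOld`, `scanPatched`) are hypothetical and exist only for this example.

```java
// Illustrative sketch only; not the DocValuesFieldUpdates implementation.
final class DedupSketch {
  // Assumption: the lowest bit flags "has a value", as suggested by the patch.
  static final long HAS_VALUE_MASK = 1L;

  // Hypothetical helper: pack a doc ID and the has-value flag into one long.
  static long encode(int docID, boolean hasValue) {
    return (((long) docID) << 1) | (hasValue ? HAS_VALUE_MASK : 0L);
  }

  // Old scan: two entries count as the same doc only if the whole long matches,
  // so a reset (flag 0) followed by a set (flag 1) on one doc is not collapsed.
  static int scanOld(long[] docs, int idx) {
    long longDoc = docs[idx];
    ++idx;
    while (idx < docs.length && docs[idx] == longDoc) {
      ++idx;
    }
    return idx; // index of the first entry treated as a different doc
  }

  // Patched scan: compare doc IDs with the flag bit shifted out, and keep the
  // last entry for the doc so its flag wins.
  static int scanPatched(long[] docs, int idx) {
    long longDoc = docs[idx];
    ++idx;
    for (; idx < docs.length; idx++) {
      final long nextLongDoc = docs[idx];
      if ((longDoc >>> 1) != (nextLongDoc >>> 1)) {
        break;
      }
      longDoc = nextLongDoc;
    }
    return idx;
  }

  public static void main(String[] args) {
    // Doc 63 is first reset, then given a value, in the same batch.
    long[] docs = {encode(63, false), encode(63, true), encode(64, true)};
    System.out.println(scanOld(docs, 0));     // prints 1: doc 63 is split into two runs
    System.out.println(scanPatched(docs, 0)); // prints 2: both updates to doc 63 collapse
  }
}
```

With the old comparison, the iterator would surface doc 63 twice, once without a value and once with one, which is consistent with an assertion tripping on a specific doc ID.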

> TestMixedDocValuesUpdates.testTryUpdateMultiThreaded fails
> ----------------------------------------------------------
>
>                 Key: LUCENE-8595
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8595
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: master (8.0)
>            Reporter: Michael McCandless
>            Priority: Major
>
> It does reproduce ... I haven't dug in:
>  
> {noformat}
>    [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestMixedDocValuesUpdates -Dtests.method=testTryUpdateMultiThreaded -Dtests.seed=E079543483688908 -Dtests.badapples=true -Dtests.locale=mt-MT -Dtests.timezone=VST -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
>    [junit4] FAILURE 0.69s | TestMixedDocValuesUpdates.testTryUpdateMultiThreaded <<<
>    [junit4]    > Throwable #1: java.lang.AssertionError: docID: 63
>    [junit4]    >        at __randomizedtesting.SeedInfo.seed([E079543483688908:4809171572AE9A81]:0)
>    [junit4]    >        at org.apache.lucene.index.TestMixedDocValuesUpdates.testTryUpdateMultiThreaded(TestMixedDocValuesUpdates.java:526)
>    [junit4]    >        at java.lang.Thread.run(Thread.java:745)
>    [junit4]   2> NOTE: test params are: codec=Asserting(Lucene80): {id=PostingsFormat(name=LuceneVarGapFixedInterval)}, docValues:{value=DocValuesFormat(name=Lucene70)}, maxPointsInLeafNode=1312, maxMBSortInHeap=7.5990910168370895, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@e08c0f3), locale=mt-MT, timezone=VST
>    [junit4]   2> NOTE: Linux 4.4.0-92-generic amd64/Oracle Corporation 1.8.0_121 (64-bit)/cpus=8,threads=1,free=446496544,total=514850816
>    [junit4]   2> NOTE: All tests run in this JVM: [TestMixedDocValuesUpdates]
>    [junit4] Completed [1/1 (1!)] in 0.83s, 1 test, 1 failure <<< FAILURES!{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
