[jira] [Created] (LUCENE-8255) Can we make index sorting work for soft deletes


JIRA jira@apache.org
Simon Willnauer created LUCENE-8255:

             Summary: Can we make index sorting work for soft deletes
                 Key: LUCENE-8255
                 URL: https://issues.apache.org/jira/browse/LUCENE-8255
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Simon Willnauer

I phrased this as a question since it's mainly a discussion. I spoke to [~rcmuir] on a couple of occasions about making index sorting work for soft deletes. What prevents this is that soft deletes use updatable doc values (DV) to mark docs as deleted, so a sorted segment is no longer guaranteed to be sorted once it has received any updates. It also means that re-sorting such a segment on merge has a significant overhead (I hope [~jimczi] can shed some light on how much we would have to expect). We would also need to add some special casing, since we use "merge sorting" and can't go backwards in doc ID, an assumption that would be violated if a segment received updates. (cc [~jpountz])

The main purpose of doing this is that "soft deleted" documents would end up either at the end or at the beginning of the segment, so that compression is better if these docs are kept around under longer retention policies.
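To make the problem concrete, here is a minimal toy sketch in plain Python. It does not use any Lucene API: a segment is modeled as a list whose positions are doc IDs, and the sort key `(soft_deleted, timestamp)` is a hypothetical choice made only for illustration. It assumes the soft-delete flag is part of the index sort and that a soft delete flips the flag in place without moving the doc.

```python
from heapq import merge


def is_sorted(keys):
    """Return True if keys are in non-decreasing order."""
    return all(a <= b for a, b in zip(keys, keys[1:]))


# Each entry is a hypothetical sort key (soft_deleted, timestamp).
# The list position stands in for the doc ID inside the segment.
segment_a = [(0, 10), (0, 20), (0, 30)]
segment_b = [(0, 15), (0, 25)]
assert is_sorted(segment_a) and is_sorted(segment_b)

# Soft-deleting doc 0 of segment_a flips its flag in place.
# The doc keeps its doc ID, so the segment is no longer sorted.
segment_a[0] = (1, 10)
still_sorted = is_sorted(segment_a)

# A streaming merge only moves forward through each input and
# assumes every input is sorted, so its output is now out of order.
naive = list(merge(segment_a, segment_b))

# Re-sorting the updated segment first restores a correct merge,
# at the cost of an extra sort per updated segment.
fixed = list(merge(sorted(segment_a), segment_b))

print(still_sorted)      # False
print(is_sorted(naive))  # False
print(fixed)             # soft-deleted doc (1, 10) ends up last
```

In this toy model the re-sort step is where the extra merge overhead would come from, and the soft-deleted doc lands at the end of the merged segment, which is the layout the compression argument relies on.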
