[jira] Updated: (LUCENE-325) [PATCH] new method expungeDeleted() added to IndexWriter

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-325:

    Attachment: LUCENE-325.patch

Attached patch.  All tests pass.  I plan to commit in a day or two.

This adds two methods to IndexWriter:

  expungeDeletes() -- defaults to doWait=true
  expungeDeletes(boolean doWait)

If doWait is false, and you have a MergeScheduler that runs merges in
background threads, then the call returns immediately.
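
To illustrate the doWait semantics only, here is a minimal stand-in sketch. It does not use Lucene and is not the patch's implementation: the class DoWaitSketch, the awaitPending() helper, and the Runnable standing in for merge work are all hypothetical names made up for this example. The executor plays the role of a MergeScheduler that runs merges in background threads.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

// Hypothetical stand-in for a writer; NOT Lucene's IndexWriter.
final class DoWaitSketch {
    private final ExecutorService backgroundMerges; // stands in for a background MergeScheduler
    private final Runnable mergeWork;               // stands in for the actual merge
    private volatile CompletableFuture<Void> pending =
            CompletableFuture.completedFuture(null);

    DoWaitSketch(ExecutorService backgroundMerges, Runnable mergeWork) {
        this.backgroundMerges = backgroundMerges;
        this.mergeWork = mergeWork;
    }

    // No-arg overload defaults to doWait=true, as described above.
    void expungeDeletes() {
        expungeDeletes(true);
    }

    // Hands the merge to a background thread; blocks only if doWait is true.
    void expungeDeletes(boolean doWait) {
        CompletableFuture<Void> merge =
                CompletableFuture.runAsync(mergeWork, backgroundMerges);
        pending = merge;
        if (doWait) {
            merge.join(); // block the caller until the merge has finished
        }
    }

    // Hypothetical helper: lets a caller that passed doWait=false wait later.
    void awaitPending() {
        pending.join();
    }
}
```

With doWait=false the caller gets control back while the merge is still running, so it has to use some other means (here, the made-up awaitPending()) if it later needs to know the merge is done.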

I extended MergePolicy with a new method, findMergesToExpungeDeletes,
so the policy decides what "expunge deletes" really means.  Then, in
LogMergePolicy, I implemented this policy: we merge each run of
adjacent segments that have deletes (up to mergeFactor at once).  If
only 1 segment has deletes, it is merged by itself.
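
The selection rule described above can be sketched roughly as follows. This is a simplified model, not the code in the patch: segments are reduced to a boolean array (true means the segment has deletes), each proposed merge is a half-open index range [start, end), and the class name ExpungeDeletesSketch is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of "merge adjacent segments that have deletes".
final class ExpungeDeletesSketch {

    // Returns half-open [start, end) ranges of segment indexes to merge.
    static List<int[]> findMergesToExpungeDeletes(boolean[] hasDeletions, int mergeFactor) {
        if (mergeFactor < 1) {
            throw new IllegalArgumentException("mergeFactor must be >= 1");
        }
        List<int[]> merges = new ArrayList<>();
        int runStart = -1; // first index of the current run of segments with deletes

        for (int i = 0; i < hasDeletions.length; i++) {
            if (hasDeletions[i]) {
                if (runStart == -1) {
                    runStart = i; // a new run begins here
                } else if (i - runStart == mergeFactor) {
                    // The run already holds mergeFactor segments: emit it
                    // and start the next run at the current segment.
                    merges.add(new int[] {runStart, i});
                    runStart = i;
                }
            } else if (runStart != -1) {
                // A segment without deletes ends the run, even if the run
                // is shorter than mergeFactor (possibly a single segment).
                merges.add(new int[] {runStart, i});
                runStart = -1;
            }
        }
        if (runStart != -1) {
            // The index ended while a run was still open.
            merges.add(new int[] {runStart, hasDeletions.length});
        }
        return merges;
    }
}
```

For example, with hasDeletions = {true, false, true, true, true, false} and mergeFactor = 2, this model proposes three merges: [0, 1), [2, 4) and [4, 5).  The first and last each cover a single segment, and the run of three is split because it exceeds mergeFactor.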

> [PATCH] new method expungeDeleted() added to IndexWriter
> --------------------------------------------------------
>                 Key: LUCENE-325
>                 URL: https://issues.apache.org/jira/browse/LUCENE-325
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: CVS Nightly - Specify date in submission
>         Environment: Operating System: Windows XP
> Platform: All
>            Reporter: John Wang
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.4
>         Attachments: attachment.txt, IndexWriter.patch, IndexWriter.patch, LUCENE-325.patch, TestExpungeDeleted.java
> We make use of the docIDs in Lucene. I need a way to compact the docIDs in segments
> to remove the "holes" created by deletes. The only way to do this is by
> calling IndexWriter.optimize(). This is a very heavy call; when the index is
> large but has only a small number of deleted docs, calling optimize
> is not practical.
> I need a new method, expungeDeleted(), which finds all the segments that have
> deleted documents and merges only those segments.
> I have implemented this method and have discussed submitting a patch with Otis.
> I don't see where I can attach the patch. I will follow the
> patch guidelines and email the Lucene mailing list.
> Thanks
> -John
> I don't see a place where I can

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.