[jira] Created: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile


[jira] Created: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile
---------------------------------------------------------------------------

                 Key: LUCENE-2790
                 URL: https://issues.apache.org/jira/browse/LUCENE-2790
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Index
            Reporter: Shai Erera
            Assignee: Shai Erera
            Priority: Minor
             Fix For: 3.1, 4.0


Spin off from here: http://www.gossamer-threads.com/lists/lucene/java-dev/112311.

I will attach a patch shortly that addresses the issue on trunk.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2790:
-------------------------------

    Attachment: LUCENE-2790.patch

Patch applied on trunk. I took the opportunity to fix some minor Javadoc warnings as well.


[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966103#action_12966103 ]

Earwin Burrfoot commented on LUCENE-2790:
-----------------------------------------

Fails addIndexesWithThreads with ConcurrentModificationException, if MergePolicy actually tries to iterate infos passed to useCompoundFile(SIS, SI).


[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Earwin Burrfoot updated LUCENE-2790:
------------------------------------

    Attachment: LUCENE-2790.patch

Check this patch out.
It moves noCFS ratio to useCompoundFile(SIS, SI) and drops useCompoundFile from OneMerge, so all decisions about using compound files now happen in a single place.
It also highlights the problem with your patch - when calling useCompoundFile from addIndexes, you should hold a lock, so segmentInfos won't be modified while mergePolicy inspects them.
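The locking point can be illustrated with a minimal sketch. This is not IndexWriter code: GuardedSegmentList, addSegment, totalSize and runDemo are hypothetical names, and the list of sizes only stands in for SegmentInfos. The idea is that the writer and the policy-style inspection take the same monitor, so the list cannot change in the middle of an iteration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for SegmentInfos: a list of segment sizes in bytes.
final class GuardedSegmentList {
    private final List<Long> segmentSizes = new ArrayList<>();

    // Writers mutate the list only while holding the monitor.
    synchronized void addSegment(long sizeInBytes) {
        segmentSizes.add(sizeInBytes);
    }

    // Policy-style inspection: iterate under the same monitor, so no
    // writer can modify the list while it is being walked.
    synchronized long totalSize() {
        long total = 0;
        for (long size : segmentSizes) {
            total += size;
        }
        return total;
    }

    // Runs one writer and one reader concurrently, then returns the final total.
    static long runDemo(int segments) {
        GuardedSegmentList infos = new GuardedSegmentList();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < segments; i++) {
                infos.addSegment(1L);
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < segments; i++) {
                infos.totalSize();
            }
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return infos.totalSize();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(10_000));
    }
}
```

Without the synchronized keyword on both methods, the for-each loop in totalSize could throw ConcurrentModificationException whenever the writer adds an element during iteration, which is the kind of intermittent failure a threaded test would show.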


[jira] Issue Comment Edited: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966108#action_12966108 ]

Earwin Burrfoot edited comment on LUCENE-2790 at 12/2/10 8:12 AM:
------------------------------------------------------------------

Check this patch out.
It changes useCompoundFile(SIS, SI) to respect noCFSRatio and drops useCompoundFile from OneMerge, so all decisions about using compound files now happen in a single place.
It also highlights the problem with your patch - when calling useCompoundFile from addIndexes, you should hold a lock, so segmentInfos won't be modified while mergePolicy inspects them.

      was (Author: earwin):
    Check this patch out.
It moves noCFS ratio to useCompoundFile(SIS, SI) and drops useCompoundFile from OneMerge, so all decisions about using compound files now happen in a single place.
It also highlights the problem with your patch - when calling useCompoundFile from addIndexes, you should hold a lock, so segmentInfos won't be modified while mergePolicy inspects them.
 


[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966110#action_12966110 ]

Shai Erera commented on LUCENE-2790:
------------------------------------

test-core passed for me before I uploaded the patch. Can you please post here the 'ant test' command that reproduces it?

I checked who implements useCompoundFile and all I find is LogMP and NoMP, neither of which iterates on the SegmentInfos. What MP did you test with?

Anyway, need to take a closer look at that. So if you can paste here the 'ant test' that reproduces it, it'd be great.


[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966112#action_12966112 ]

Earwin Burrfoot commented on LUCENE-2790:
-----------------------------------------

bq. I checked who implements useCompoundFile and all I find is LogMP and NoMP, both don't iterate on the SegmentInfos. What MP did you test with?
Apply my patch, it changes LogMP to use SegmentInfos.

bq. So if you can paste here the 'ant test' that reproduces it, it'd be great.
ant test -Dtestcase=TestAddIndexes -Dtestmethod=testAddIndexesWithThreads -Dtests.seed=5369960668186287821:331425426639083833 -Dtests.codec=randomPerField
The test is threaded, so it doesn't fail always.


[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2790:
-------------------------------

    Attachment: LUCENE-2790.patch

Patch fixes the threading issue Earwin reported, by checking whether to create the CFS in a sync block. Also, after discussing this on IRC, the code is further simplified by creating the compound file before the new segment is committed.

However, some tests still fail with ConcurrentModificationException. I cannot debug it now, so I am posting the patch in case someone wants to take a stab. I can continue later. To reproduce the failure:

ant test -Dtestcase=TestIndexWriter -Dtestmethod=testDeleteUnusedFiles -Dtests.seed=-1861905402886420424:-8896948763797565454 -Dtests.codec=randomPerField




[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Earwin Burrfoot updated LUCENE-2790:
------------------------------------

    Attachment: LUCENE-2790.patch

Okay, this patch fixes the remaining threading issue in IW.mergeMiddle,
and three tests that expected CFS segments and no longer got them,
because flush now respects noCFSRatio, whose default is 0.1.


[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966167#action_12966167 ]

Michael McCandless commented on LUCENE-2790:
--------------------------------------------

Patch looks great!

My only concern is... it looks like addIndexes(IR[]), with compound file used in the end, may fail to delete the non-compound files once the SegmentInfo is committed?  Maybe we should add a test to show the failure...

I think we need to do something like this:
{noformat}
          // delete new non cfs files directly: they were never
          // registered with IFD
          deleter.deleteNewFiles(merger.getMergedFiles(merge.info));
{noformat}


[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966178#action_12966178 ]

Michael McCandless commented on LUCENE-2790:
--------------------------------------------

Hmm... something is amiss.  I hit this failure:
{noformat}
ant test -Dtestcase=TestIndexSplitter -Dtestmethod=test -Dtests.seed=5299033587626573117:-25334708766924714 -Dtests.codec=randomPerField
{noformat}

But it passes on trunk...


[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Earwin Burrfoot updated LUCENE-2790:
------------------------------------

    Attachment: LUCENE-2790.patch

Fixed your test failure


[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966248#action_12966248 ]

Shai Erera commented on LUCENE-2790:
------------------------------------

Patch looks good. All tests pass for me. Let's give it a couple more tries, to allow for random tests to catch us. It'd be good if you can try running them too.


[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966285#action_12966285 ]

Earwin Burrfoot commented on LUCENE-2790:
-----------------------------------------

Shai, what about:
bq. My only concern is... it looks like addIndexes(IR[]), with compound file used in the end, may fail to delete the non-compound files once the SegmentInfo is committed?
I fixed everything else, but can't answer this question.


[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2790:
-------------------------------

    Attachment: LUCENE-2790.patch

Attached adds a test to TestAddIndexes w/ the fix as Mike proposed. The test fails w/o the fix and passes w/ it.

Also, I noticed that if I don't set noCFSRatio to 1.0, then the added segments are not converted to a CFS. That is because useCompoundFile on LMP decides not to do that: the size of the segment, which is 377 bytes, is more than 10% of the total index size, which is ... 0. I wonder if we should handle that case, or leave it as is. At some point, when more documents are added, that segment will be converted to a CFS.

I think that means that the first few segments that will be flushed will remain in non CFS format. I'm fine w/ it, just making sure I understand this right.
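The arithmetic above can be written out as a small sketch. This is not the LogMergePolicy source: CompoundFileDecision and its parameters are hypothetical names, and treating a ratio of 1.0 as "always use CFS" is an assumption drawn from the observation that setting noCFSRatio to 1.0 does produce a CFS.

```java
// Hypothetical, simplified model of the decision discussed in this thread.
final class CompoundFileDecision {

    // Returns true if the new segment should be written as a compound file.
    // existingSegmentBytes: sizes of the segments already in the index.
    static boolean useCompoundFile(boolean cfsEnabled, double noCFSRatio,
                                   long[] existingSegmentBytes, long newSegmentBytes) {
        if (!cfsEnabled) {
            return false;
        }
        // Assumption: a ratio of 1.0 means the size check is skipped entirely.
        if (noCFSRatio == 1.0) {
            return true;
        }
        long totalBytes = 0;
        for (long bytes : existingSegmentBytes) {
            totalBytes += bytes;
        }
        // Use CFS only when the new segment is small relative to the index.
        return newSegmentBytes <= noCFSRatio * totalBytes;
    }

    public static void main(String[] args) {
        // Empty index, 377-byte segment, ratio 0.1: 377 > 0.1 * 0, so no CFS.
        System.out.println(useCompoundFile(true, 0.1, new long[0], 377));
        // Same segment with ratio 1.0: CFS is used.
        System.out.println(useCompoundFile(true, 1.0, new long[0], 377));
    }
}
```

Under this model the first segments flushed into an empty index stay in non-CFS form, and a later segment of the same size becomes a CFS once the index is at least ten times larger than it.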


[jira] Updated: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated LUCENE-2790:
-------------------------------

    Attachment: LUCENE-2790.patch

Same patch, only uses MockAnalyzer and not WhitespaceAnalyzer (which failed compilation from command line).


[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966550#action_12966550 ]

Earwin Burrfoot commented on LUCENE-2790:
-----------------------------------------

Ok, let's commit?

There's no need to force the first few commits to CFS. CFS' sole purpose is to keep the number of simultaneously open files low. You're not likely to see frightening numbers with only a pair of segments in the index.
Later these segments are merged (and probably CFSed), so no worries.


[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)
In reply to this post by Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966557#action_12966557 ]

Michael McCandless commented on LUCENE-2790:
--------------------------------------------

bq. Ok, let's commit?

+1


[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966787#action_12966787 ]

Shai Erera commented on LUCENE-2790:
------------------------------------

Do you see any back-compat issues with back-porting it to 3x? I'm thinking about the change in behavior of useCompoundFile in LMP, which now factors in noCFSRatio. However, I see that noCFSRatio is already in 3x's LMP and defaults to 0.1, which already changes behavior, so I think we can apply this change to 3x as well. What do you think?
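
For readers following along, here is a rough, self-contained sketch of what "factoring in noCFSRatio" could look like. It is not the committed patch and does not use Lucene's classes: the class name CfsRatioSketch, the method signature, and the byte-size parameters are all hypothetical, and the rule shown (use a compound file only when the merged segment is at most noCFSRatio of the total index size) is an assumption about the intended behavior.

```java
// Hypothetical illustration only; this is NOT Lucene's LogMergePolicy.
public final class CfsRatioSketch {

    /**
     * Decides whether a newly merged segment should be written as a
     * compound file (assumed rule, see the lead-in above).
     *
     * @param useCompoundFileFlag the policy-wide on/off switch
     * @param noCFSRatio          fraction of the total index size above which
     *                            a segment is left in non-compound form
     * @param totalIndexBytes     combined size of all segments (hypothetical input)
     * @param mergedSegmentBytes  size of the merged segment (hypothetical input)
     */
    public static boolean useCompoundFile(boolean useCompoundFileFlag,
                                          double noCFSRatio,
                                          long totalIndexBytes,
                                          long mergedSegmentBytes) {
        if (!useCompoundFileFlag) {
            return false; // compound files are switched off entirely
        }
        if (noCFSRatio >= 1.0) {
            return true; // ratio check disabled: always use compound files
        }
        // Large segments (relative to the whole index) skip the compound format.
        return mergedSegmentBytes <= noCFSRatio * totalIndexBytes;
    }

    public static void main(String[] args) {
        // With the 0.1 default mentioned above and a 1000-byte index:
        System.out.println(useCompoundFile(true, 0.1, 1000L, 50L));  // small segment
        System.out.println(useCompoundFile(true, 0.1, 1000L, 500L)); // large segment
    }
}
```

The point of the sketch is that a caller asking only whether the flag is set gets a different answer from one that asks the policy about a specific segment, which is the behavior difference being weighed for 3x.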


[jira] Commented: (LUCENE-2790) IndexWriter should call MP.useCompoundFile and not LogMP.getUseCompoundFile

Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966788#action_12966788 ]

Shai Erera commented on LUCENE-2790:
------------------------------------

Committed revision 1042101 to trunk.

I will back port to 3x if you agree this isn't a backwards break.

BTW, I did not add a CHANGES entry, because it's an internal optimization we've made to IndexWriter. Hmm .. maybe we should document the changes to LMP.useCompoundFile (that it now factors in the noCFSRatio)?

