[jira] Created: (LUCENE-1918) Adding empty ParallelReader indexes to an IndexWriter may cause ArrayIndexOutOfBoundsException or NoSuchElementException


JIRA jira@apache.org
Adding empty ParallelReader indexes to an IndexWriter may cause ArrayIndexOutOfBoundsException or NoSuchElementException
------------------------------------------------------------------------------------------------------------------------

                 Key: LUCENE-1918
                 URL: https://issues.apache.org/jira/browse/LUCENE-1918
             Project: Lucene - Java
          Issue Type: Bug
          Components: Index
    Affects Versions: 2.4.1, 2.4.2, 2.9
         Environment: any
            Reporter: Christian Kohlschütter
             Fix For: 2.9, 2.4.1


Hi,
I recently stumbled upon this:

It is possible (and perfectly legal) to add empty indexes (IndexReaders) to an IndexWriter. However, when using ParallelReaders in this context, in two situations RuntimeExceptions may occur for no good reason.

Condition 1:
The indexes within the ParallelReader are just empty.

When adding them to the IndexWriter, we get a java.util.NoSuchElementException triggered by ParallelTermEnum's constructor. The cause is TreeMap#firstKey(), which was assumed to return null when the map has no entries. That assumption is wrong: on an empty map, firstKey() throws NoSuchElementException.
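
To illustrate Condition 1 outside of Lucene, here is a minimal sketch of the TreeMap behavior using only java.util. The class and method names (FirstKeyDemo, firstKeyOrNull, firstKeyThrows) are made up for this example and are not taken from the attached patch; the guard shown is one possible way to get a null for an empty map, not necessarily what the patch does.

```java
import java.util.NoSuchElementException;
import java.util.TreeMap;

class FirstKeyDemo {
    /** Returns the smallest key, or null when the map is empty (hypothetical helper). */
    static String firstKeyOrNull(TreeMap<String, Integer> map) {
        // Check for emptiness first; firstKey() does not return null on an empty map.
        return map.isEmpty() ? null : map.firstKey();
    }

    /** Returns true if calling firstKey() on the given map throws NoSuchElementException. */
    static boolean firstKeyThrows(TreeMap<String, Integer> map) {
        try {
            map.firstKey();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> fields = new TreeMap<>();
        System.out.println(firstKeyThrows(fields)); // empty map: firstKey() throws
        System.out.println(firstKeyOrNull(fields)); // guarded access yields null
        fields.put("title", 1);
        System.out.println(firstKeyOrNull(fields)); // non-empty map yields its first key
    }
}
```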


Condition 2 (Assuming the aforementioned bug is fixed):
The indexes within the ParallelReader originally contained one or more fields with TermVectors, but all documents have been marked as deleted.

When adding the indexes to the IndexWriter, we get a java.lang.ArrayIndexOutOfBoundsException triggered by TermVectorsWriter#addAllDocVectors. The reason here is that TermVectorsWriter assumes that if the index is marked to have TermVectors, at least one field actually exists for that. This unfortunately is not true, either.
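
The failure mode in Condition 2 can be sketched generically: code that reads the last element of a per-field array without first checking that the array is non-empty. The following is not Lucene's TermVectorsWriter code. The class VectorFieldGuard, its methods, and the fieldPointers array are hypothetical names chosen only to show the pattern and one possible guard.

```java
class VectorFieldGuard {
    /** Unguarded access: assumes at least one field exists (hypothetical). */
    static long lastUnchecked(long[] fieldPointers) {
        // With an empty array the index is -1, so this throws
        // ArrayIndexOutOfBoundsException.
        return fieldPointers[fieldPointers.length - 1];
    }

    /** Guarded access: returns the fallback when no field exists (hypothetical). */
    static long lastOrFallback(long[] fieldPointers, long fallback) {
        return fieldPointers.length == 0
                ? fallback
                : fieldPointers[fieldPointers.length - 1];
    }

    public static void main(String[] args) {
        long[] none = new long[0];
        System.out.println(lastOrFallback(none, -1L)); // uses the fallback value
        try {
            lastUnchecked(none);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unguarded access failed on empty array");
        }
    }
}
```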

Patches and a testcase demonstrating the two bugs are provided.

Cheers,
Christian

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (LUCENE-1918) Adding empty ParallelReader indexes to an IndexWriter may cause ArrayIndexOutOfBoundsException or NoSuchElementException

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Kohlschütter updated LUCENE-1918:
-------------------------------------------

    Attachment: ParallelReaderWithEmptyIndex.patch
                ParallelReaderWithEmptyIndex-testcase.patch

Testcase and bugfixes for trunk (should also be applicable to 2.4.1)


> Adding empty ParallelReader indexes to an IndexWriter may cause ArrayIndexOutOfBoundsException or NoSuchElementException
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1918
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1918
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.4.1, 2.4.2, 2.9
>         Environment: any
>            Reporter: Christian Kohlschütter
>             Fix For: 2.4.1, 2.9
>
>         Attachments: ParallelReaderWithEmptyIndex-testcase.patch, ParallelReaderWithEmptyIndex.patch
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>

[jira] Assigned: (LUCENE-1918) Adding empty ParallelReader indexes to an IndexWriter may cause ArrayIndexOutOfBoundsException or NoSuchElementException

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reassigned LUCENE-1918:
------------------------------------------

    Assignee: Michael McCandless


[jira] Updated: (LUCENE-1918) Adding empty ParallelReader indexes to an IndexWriter may cause ArrayIndexOutOfBoundsException or NoSuchElementException

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1918:
---------------------------------------

    Attachment: LUCENE-1918.patch

Patch looks good!  Thanks Christian.  Good catches!

I made minor changes to it: added a CHANGES entry, fixed indentation, switched the test over to MockRAMDir (and closed the directories), and added checkIndex calls.



[jira] Commented: (LUCENE-1918) Adding empty ParallelReader indexes to an IndexWriter may cause ArrayIndexOutOfBoundsException or NoSuchElementException

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757080#action_12757080 ]

Michael McCandless commented on LUCENE-1918:
--------------------------------------------

Mark, I think we should commit this for 2.9?


[jira] Commented: (LUCENE-1918) Adding empty ParallelReader indexes to an IndexWriter may cause ArrayIndexOutOfBoundsException or NoSuchElementException

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757090#action_12757090 ]

Uwe Schindler commented on LUCENE-1918:
---------------------------------------

I have no problem with this. I think we definitely need a new RC because of LUCENE-1919, which is very tricky and should be tested by the public, who surely have a lot of old-style TokenStreams.


[jira] Commented: (LUCENE-1918) Adding empty ParallelReader indexes to an IndexWriter may cause ArrayIndexOutOfBoundsException or NoSuchElementException

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757125#action_12757125 ]

Mark Miller commented on LUCENE-1918:
-------------------------------------

Agreed. It sounds like we are now stuck with a new RC, so let's fix what we can.


[jira] Resolved: (LUCENE-1918) Adding empty ParallelReader indexes to an IndexWriter may cause ArrayIndexOutOfBoundsException or NoSuchElementException

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-1918.
----------------------------------------

    Resolution: Fixed

Thanks Christian!
