[jira] Created: (LUCENE-720) Unit tests TestBackwardsCompatibility and TestIndexFileDeleter might fail depending on JVM

JIRA jira@apache.org
Unit tests TestBackwardsCompatibility and TestIndexFileDeleter might fail depending on JVM
------------------------------------------------------------------------------------------

                 Key: LUCENE-720
                 URL: http://issues.apache.org/jira/browse/LUCENE-720
             Project: Lucene - Java
          Issue Type: Bug
          Components: Index
    Affects Versions: 2.1
            Reporter: Michael Busch
         Assigned To: Michael McCandless
            Priority: Minor


In the two unit tests TestBackwardsCompatibility and TestIndexFileDeleter, several index file names are hardcoded. For example, TestBackwardsCompatibility.testExactFileNames() checks that the index directory contains exactly the expected files after several operations such as addDocument(), deleteDocument() and setNorm() have been performed. Apparently the unit tests pass on the nightly build machine, but in my environment (Windows XP, IBM JVM 1.5) they fail for the following reason:

When IndexReader.setNorm() is called, a new norm file for the specified field is created with the file ending .sx, where x is the number of the field. The problem is that the SegmentMerger cannot guarantee to keep the order of the fields; in other words, after a merge has taken place a field can have a different field number. This specific testcase fails because it expects the file ending .s0, but the file has the ending .s1.
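
To make the naming rule concrete, here is a minimal sketch. The class NormSuffixDemo and the method normSuffix are hypothetical names used only for illustration and are not part of Lucene; the sketch models nothing beyond the ".sx" ending described above.

```java
public class NormSuffixDemo {
    // Hypothetical helper: builds the ".sx" ending, where x is the field number.
    public static String normSuffix(int fieldNumber) {
        if (fieldNumber < 0) {
            throw new IllegalArgumentException("field number must be >= 0: " + fieldNumber);
        }
        return ".s" + fieldNumber;
    }

    public static void main(String[] args) {
        // The same field gets a different ending once its number changes.
        System.out.println(normSuffix(0)); // field is number 0 (what the test expects)
        System.out.println(normSuffix(1)); // field was renumbered to 1 by a merge
    }
}
```

A test that hardcodes the first ending breaks as soon as the field ends up with the second number, even though the index itself is fine.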

The field numbers can differ between JVMs because SegmentReader.getFieldNames() uses a HashSet. HashSet makes no guarantee about iteration order, so depending on the implementation an iterator might not visit the entries in insertion order. When I change HashSet to LinkedHashSet, the two testcases pass.
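
The difference between the two set types can be seen in isolation with a small sketch. It does not use any Lucene code; FieldOrderDemo, iterationOrder and the field names are assumptions chosen for illustration, and the syntax is for a current JDK rather than the 1.5 JVM mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class FieldOrderDemo {
    // Adds the names to the given set and returns them in the set's iteration order.
    public static List<String> iterationOrder(Set<String> set, List<String> names) {
        set.addAll(names);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("content", "id", "title");

        // HashSet: the order is unspecified and may differ between JVM implementations.
        System.out.println("HashSet:       " + iterationOrder(new HashSet<>(), names));

        // LinkedHashSet: iteration always follows insertion order.
        System.out.println("LinkedHashSet: " + iterationOrder(new LinkedHashSet<>(), names));
    }
}
```

Only the LinkedHashSet line is guaranteed to print the names in the order they were added; code that derives numbers from the HashSet order is relying on an implementation detail.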

However, even with a LinkedHashSet the order of the field numbers might change during a merge, because the order in which the SegmentMerger merges the FieldInfos depends on the field options like TERMVECTOR, INDEXED... (see SegmentMerger.mergeFields() for details).

So I think we should not use LinkedHashSet but rather change the problematic testcases. Furthermore, I'm not sure we should have hardcoded filenames in the tests at all, because if we change the index format or file names in the future, these test cases would fail until they are modified.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

[jira] Updated: (LUCENE-720) Unit tests TestBackwardsCompatibility and TestIndexFileDeleter might fail depending on JVM

JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/LUCENE-720?page=all ]

Michael Busch updated LUCENE-720:
---------------------------------

    Environment: Windows XP, IBM JVM 1.5 SP3

[jira] Commented: (LUCENE-720) Unit tests TestBackwardsCompatibility and TestIndexFileDeleter might fail depending on JVM

JIRA jira@apache.org
    [ http://issues.apache.org/jira/browse/LUCENE-720?page=comments#action_12451607 ]
           
Michael McCandless commented on LUCENE-720:
-------------------------------------------

Whoa, good catch & explanation of the issue!  Sorry about this.

I will fix these tests so they resolve the field by name and do their comparisons on that basis.

I realize that exact file name tests will need to be updated when we change the file format, but I really wanted test coverage that verifies that the generational logic for deriving file names works correctly for segments, del and separate norms files.

I will temporarily turn off TestBackwardsCompatibility.testExactFileNames and TestIndexFileDeleter until I can resolve this.

[jira] Resolved: (LUCENE-720) Unit tests TestBackwardsCompatibility and TestIndexFileDeleter might fail depending on JVM

JIRA jira@apache.org
     [ http://issues.apache.org/jira/browse/LUCENE-720?page=all ]

Michael McCandless resolved LUCENE-720.
---------------------------------------

    Fix Version/s: 2.1
       Resolution: Fixed

OK, I was able to reproduce the failure using IBM's JDK 5.0 on Linux.  I changed the tests to load the field infos, compute which field index corresponds to the "content" field, and use that field index instead; the tests now pass.
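
The idea behind this fix, resolving the field by name instead of assuming its number, can be sketched roughly as follows. This is not the committed test code: FieldNumberLookup, fieldNumber, expectedNormFile and the base name "seg" are hypothetical, and a plain list of names ordered by field number stands in for the segment's field infos.

```java
import java.util.Arrays;
import java.util.List;

public class FieldNumberLookup {
    // Stand-in for the field infos: a name's position in the list is its field number.
    public static int fieldNumber(List<String> namesInNumberOrder, String name) {
        int number = namesInNumberOrder.indexOf(name);
        if (number < 0) {
            throw new IllegalArgumentException("unknown field: " + name);
        }
        return number;
    }

    // Builds the expected norm file name from the looked-up number, not a hardcoded one.
    public static String expectedNormFile(String baseName, List<String> namesInNumberOrder, String name) {
        return baseName + ".s" + fieldNumber(namesInNumberOrder, name);
    }

    public static void main(String[] args) {
        // Two possible numberings of the same fields, e.g. on two different JVMs.
        List<String> orderA = Arrays.asList("content", "id");
        List<String> orderB = Arrays.asList("id", "content");

        System.out.println(expectedNormFile("seg", orderA, "content"));
        System.out.println(expectedNormFile("seg", orderB, "content"));
    }
}
```

Because the expected name is computed from whatever number the field actually has, the comparison holds under either numbering.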

[jira] Commented: (LUCENE-720) Unit tests TestBackwardsCompatibility and TestIndexFileDeleter might fail depending on JVM

JIRA jira@apache.org
    [ http://issues.apache.org/jira/browse/LUCENE-720?page=comments#action_12452001 ]
           
Michael Busch commented on LUCENE-720:
--------------------------------------

The tests now pass on my machine too. Good job, Mike! Thanks.

[jira] Commented: (LUCENE-720) Unit tests TestBackwardsCompatibility and TestIndexFileDeleter might fail depending on JVM

JIRA jira@apache.org
    [ http://issues.apache.org/jira/browse/LUCENE-720?page=comments#action_12452010 ]
           
Michael McCandless commented on LUCENE-720:
-------------------------------------------

Thank you for tracking this down!  I did not realize field number assignment was volatile across JREs.
