TestUTF32ToUTF8.testRandomRegexes fails


TestUTF32ToUTF8.testRandomRegexes fails

Shai Erera
Hi

I was running tests on trunk (after merging the changes from LUCENE-2537) and received this error message:

expected:<true> but was:<false>

junit.framework.AssertionFailedError: expected:<true> but was:<false>
at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)


NOTE: random seed of testcase 'testRandomRegexes' was: 3510820306304573866

I'm sure it's related to my changes. Has anyone else seen this before?

Shai

Re: TestUTF32ToUTF8.testRandomRegexes fails

Michael McCandless-2
Hmmm this means a bug is lurking.  This is the power of random testing
(that every time we all run tests, we're testing different "paths"
through the code)....

It seems exceptionally unlikely that LUCENE-2537's changes would cause this!

But, unfortunately, when I plug that seed in I don't see it fail,
which is odd.  I'll run a stress test to see if I can tickle the
bug... can you open a Jira issue so we don't lose track?

Mike


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: TestUTF32ToUTF8.testRandomRegexes fails

Robert Muir
I agree. Shai, can you open a bug? I cannot reproduce it. Did you use an IBM JVM, or is there anything else about your environment that might help us figure it out?


Re: TestUTF32ToUTF8.testRandomRegexes fails

Michael McCandless-2
On a more general note...

Any time any of you out there hit an "odd" test failure, please please
please do just what Shai did: take it to the dev list!

Think of Lucene's unit tests like SETI :)  We are desperately seeking
bugs, and you and your machine may just be lucky enough to find one...
go forth and buy expensive new power hungry computers just so you can
run the random tests over and over, seeking the bugs!

But be sure to include that random seed when you do hit a failure...

Mike
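
For readers following along, here is a minimal sketch of the pattern Mike describes: a randomized check that picks a fresh seed on every run and puts that seed in the failure message so the run can be replayed. The class `SeededCheck`, its methods, and the property being checked are hypothetical illustrations; this is not Lucene's `LuceneTestCase`.

```java
import java.util.Random;

/** Hypothetical sketch of a seed-reporting randomized check; not Lucene code. */
final class SeededCheck {

    /** Illustrative property: reversing a string twice yields the original. */
    static boolean roundTrips(String s) {
        String reversedTwice = new StringBuilder(s).reverse().reverse().toString();
        return reversedTwice.equals(s);
    }

    /** Builds a random lowercase ASCII string of length 0..9 from the given Random. */
    static String randomString(Random random) {
        int length = random.nextInt(10);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + random.nextInt(26)));
        }
        return sb.toString();
    }

    /** Runs the check; a failure message carries the seed so the run can be replayed. */
    static void run(long seed, int iterations) {
        Random random = new Random(seed);
        for (int i = 0; i < iterations; i++) {
            String s = randomString(random);
            if (!roundTrips(s)) {
                throw new AssertionError("failed on \"" + s + "\" at iteration " + i
                        + "; random seed was: " + seed);
            }
        }
    }

    public static void main(String[] args) {
        // Replay a seed given on the command line, otherwise pick a fresh one.
        long seed = args.length > 0 ? Long.parseLong(args[0]) : new Random().nextLong();
        run(seed, 100);
        System.out.println("passed with seed " + seed);
    }
}
```

Passing a previously reported seed as the first argument replays the same inputs, which is why the seed is the most useful thing to include in a failure report.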


Re: TestUTF32ToUTF8.testRandomRegexes fails

Shai Erera
Sorry for the delayed response.

I ran it a couple more times, from Eclipse and Ant, and each time it fails (amazing !), w/ different seeds. More seeds that fail:
NOTE: random seed of testcase 'testRandomRegexes' was: -4244174191361080127
NOTE: random seed of testcase 'testRandomRegexes' was: -7059086272401721644
NOTE: random seed of testcase 'testRandomRegexes' was: -1314734215611104147

I use IBM JVM, tried w/ both 1.5 and 1.6 ...

Mike, can we use LUCENE-2565 to track this, or would you prefer that I open a separate one?

Shai


Re: TestUTF32ToUTF8.testRandomRegexes fails

Shai Erera
Tried to run it w/ SUN JRE6 and it succeeds! I've tried several times and it succeeds every time. However, when I revert to IBM's, it fails immediately.

I can help w/ the debug, if you give me a hint where to look :).

Shai


Re: TestUTF32ToUTF8.testRandomRegexes fails

Robert Muir
Sounds nasty... it's good you are running the tests with this different JVM...


Re: TestUTF32ToUTF8.testRandomRegexes fails

Michael McCandless-2
In reply to this post by Shai Erera
On Mon, Jul 26, 2010 at 10:57 AM, Shai Erera <[hidden email]> wrote:
> Sorry for the delayed response.
>
> I ran it a couple more times, from Eclipse and Ant, and each time it fails
> (amazing !), w/ different seeds. More seeds that fail:
> NOTE: random seed of testcase 'testRandomRegexes' was: -4244174191361080127
> NOTE: random seed of testcase 'testRandomRegexes' was: -7059086272401721644
> NOTE: random seed of testcase 'testRandomRegexes' was: -1314734215611104147
>
> I use IBM JVM, tried w/ both 1.5 and 1.6 ...

Jeez this sounds nasty....

> Mike, can we use LUCENE-2565 to track this, or would you prefer that I open
> a separate one?

Can you open a new one?  That issue is just about the test running
forever, trying to find a good random character :)

Mike


Re: TestUTF32ToUTF8.testRandomRegexes fails

Shai Erera
In reply to this post by Robert Muir
Ok I've dug deeper into the test. I set the random seed to -9029631602016965389L in setUp(), and discovered that on the 4th iteration it breaks. For some reason though, AutomatonTestUtil.randomRegex generates different strings every time I run the test, even though it uses the same Random object w/ the same seed ...

Anyway, one of the regexes that failed was "l.E" (w/o the quotes); I think it's a lowercase L, '.' (dot) and an uppercase 'E'. Hope this helps.

Shai
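
As background to the observation above: `java.util.Random` is documented to return identical sequences for two instances created with the same seed and driven by the same sequence of calls. The sketch below checks that baseline on whatever JVM it is run on. The class `SameSeedDemo` and its `draw` method are hypothetical names, and the sketch says nothing about how `AutomatonTestUtil.randomRegex` uses its `Random`. If this baseline holds but the generated regexes still differ, things that may be worth checking include whether the fixed seed actually reaches the `Random` in use and whether anything else draws from it in a varying order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch: same seed, same calls, same values. */
final class SameSeedDemo {

    /** Draws count values in [0, 128) from a Random created with the given seed. */
    static List<Integer> draw(long seed, int count) {
        Random random = new Random(seed);
        List<Integer> values = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            values.add(random.nextInt(128));
        }
        return values;
    }

    public static void main(String[] args) {
        long seed = -9029631602016965389L; // the seed quoted in the thread
        List<Integer> first = draw(seed, 20);
        List<Integer> second = draw(seed, 20);
        System.out.println("identical sequences: " + first.equals(second));
    }
}
```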


Re: TestUTF32ToUTF8.testRandomRegexes fails

Robert Muir
hmm maybe the bug is in AutomatonTestUtil.randomRegex?

can you do me a favor and run -Dtestcase=TestRandomRegex2
This testcase also uses this same randomRegex method.

you can also "crank" it like our other random tests, for instance with -Drandom.multiplier=3
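
To illustrate what "cranking" a random test with a multiplier can mean, here is a minimal sketch in which a test scales its iteration count by a system property. The property name `random.multiplier` comes from the message above; the class `MultiplierDemo`, its methods, and the way the value is read are assumptions made for illustration, since the thread does not show how Lucene's test framework consumes the property.

```java
/** Hypothetical sketch of scaling test iterations by a system property. */
final class MultiplierDemo {

    /** Reads -Drandom.multiplier=N; falls back to 1 if unset, unparsable, or below 1. */
    static int multiplier() {
        return Math.max(1, Integer.getInteger("random.multiplier", 1));
    }

    /** Scales a base iteration count by the multiplier. */
    static int iterations(int base) {
        return base * multiplier();
    }

    public static void main(String[] args) {
        // With -Drandom.multiplier=3 on the command line, the count is tripled.
        System.out.println("running " + iterations(100) + " iterations");
    }
}
```

A larger multiplier simply means more random inputs per run, which raises the chance that a single run hits a rare failing case.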


Re: TestUTF32ToUTF8.testRandomRegexes fails

Robert Muir
sorry, i screwed up the name of the test, i meant TestRegexpRandom2


Re: TestUTF32ToUTF8.testRandomRegexes fails

Michael McCandless-2
In reply to this post by Shai Erera
That's VERY spooky that w/ a fixed seed you see different random
regexps being made.

Mike

On Mon, Jul 26, 2010 at 11:40 AM, Shai Erera <[hidden email]> wrote:

> Ok I've dug deeper into the test. I set the random seed to
> -9029631602016965389L in setUp(), and discovered that on the 4th iteration
> it breaks. For some reason though, AutomatonTestUtil.randomRegex generates
> different strings every time I run the test, even though it uses the same
> Random object w/ the same seed ...
>
> Anyway, one of the regex that failed was this "l.E" (w/o the quotes) and I
> think it's a lowercase L, '.' (dot) and 'E' (uppercase). Hope this helps.
>
> Shai
>
> On Mon, Jul 26, 2010 at 6:23 PM, Robert Muir <[hidden email]> wrote:
>>
>> sounds nasty... it's good you are running the tests with this different
>> jvm...
>>
>> On Mon, Jul 26, 2010 at 11:21 AM, Shai Erera <[hidden email]> wrote:
>>>
>>> Tried to run it w/ SUN JRE6 and it succeeds ! I've tried several times
>>> and it succeeds every time. However, when I revert back to IBM's, it fails
>>> immediately.
>>>
>>> I can help w/ the debug, if you give me a hint where to look :).
>>>
>>> Shai
>>>
>>> On Mon, Jul 26, 2010 at 5:57 PM, Shai Erera <[hidden email]> wrote:
>>>>
>>>> Sorry for the delayed response.
>>>>
>>>> I ran it a couple more times, from Eclipse and Ant, and each time it
>>>> fails (amazing !), w/ different seeds. More seeds that fail:
>>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>>> -4244174191361080127
>>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>>> -7059086272401721644
>>>> NOTE: random seed of testcase 'testRandomRegexes' was:
>>>> -1314734215611104147
>>>>
>>>> I use IBM JVM, tried w/ both 1.5 and 1.6 ...
>>>>
>>>> Mike, can we use LUCENE-2565 to track this, or would you prefer that I
>>>> open a separate one?
>>>>
>>>> Shai
>>>>
>>>> On Mon, Jul 26, 2010 at 3:26 PM, Michael McCandless
>>>> <[hidden email]> wrote:
>>>>>
>>>>> On a more general note...
>>>>>
>>>>> Any time any of you out there hit an "odd" test failure, please please
>>>>> please do just what Shai did: take it to the dev list!
>>>>>
>>>>> Think of Lucene's unit tests like SETI :)  We are desperately seeking
>>>>> bugs, and you and your machine may just be lucky enough to find one...
>>>>> go forth and buy expensive new power hungry computers just so you can
>>>>> run the random tests over and over, seeking the bugs!
>>>>>
>>>>> But be sure to include that random seed when you do hit a failure...
>>>>>
>>>>> Mike
>>>>>
>>>>> On Mon, Jul 26, 2010 at 8:23 AM, Robert Muir <[hidden email]> wrote:
>>>>> > I agree, Shai can you open a bug? I cannot reproduce, did you use an
>>>>> > IBM JVM
>>>>> > or another environment that might help us figure it out?
>>>>> >
>>>>> > On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless
>>>>> > <[hidden email]> wrote:
>>>>> >>
>>>>> >> Hmmm this means a bug is lurking.  This is the power of random
>>>>> >> testing
>>>>> >> (that every time we all run tests, we're testing different "paths"
>>>>> >> through the code)....
>>>>> >>
>>>>> >> It seems exceptionally unlikely that LUCENE-2537's changes would
>>>>> >> cause
>>>>> >> this!
>>>>> >>
>>>>> >> But, unfortunately, when I plug that seed in I don't see it fail,
>>>>> >> which is odd.  I'll run a stress test to see if I can tickle the
>>>>> >> bug... can you open a Jira issue so we don't lose track?
>>>>> >>
>>>>> >> Mike
>>>>> >>
>>>>> >> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <[hidden email]>
>>>>> >> wrote:
>>>>> >> > Hi
>>>>> >> >
>>>>> >> > I was running tests on trunk (after merging the changes from
>>>>> >> > LUCENE-2537)
>>>>> >> > and received this error message:
>>>>> >> >
>>>>> >> > expected:<true> but was:<false>
>>>>> >> >
>>>>> >> > junit.framework.AssertionFailedError: expected:<true> but was:<false>
>>>>> >> > at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
>>>>> >> > at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
>>>>> >> > at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
>>>>> >> >
>>>>> >> > NOTE: random seed of testcase 'testRandomRegexes' was:
>>>>> >> > 3510820306304573866
>>>>> >> >
>>>>> >> > I'm sure it's related to my changes. Has anyone else seen this
>>>>> >> > before?
>>>>> >> >
>>>>> >> > Shai
>>>>> >> >
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Robert Muir
>>>>> > [hidden email]
>>>>> >
>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Robert Muir
>> [hidden email]
>
>


Re: TestUTF32ToUTF8.testRandomRegexes fails

Robert Muir
maybe there is a bug in ibm's random generator :)

On Mon, Jul 26, 2010 at 11:50 AM, Michael McCandless <[hidden email]> wrote:
> That's VERY spooky that w/ a fixed seed you see different random
> regexps being made.
>
> Mike

--
Robert Muir
[hidden email]

Re: TestUTF32ToUTF8.testRandomRegexes fails

Shai Erera
I don't know what was the thing w/ the strings generated before, but now I ran the test again w/ the same seed and it generates the same strings. So at least it seems there are no problems w/ the Random class :).

However, the string l.E fails w/ the IBM JVM and succeeds w/ SUN's. Any ideas why? What does the test check anyway?

I ran TRR2, and set the regexp to always be "l.E" and the test passes. The failure comes from

junit.framework.AssertionFailedError: expected:<true> but was:<false>
    at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:199)
    at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:171)

I've set regexp to "l.E", and also 'string' inside assertAutomaton to "\u006C\uD9FF\u0045". The byte[] returned from string.getBytes("UTF-8") is [108, 69]: it just drops the middle character. Perhaps that's why the test fails?

When I run this w/ SUN's JVM, the bytes returned are [108, 63, 69].

If I manually set the bytes, using IBM's, to [108, 63, 69], then the test passes.
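
A minimal sketch of the comparison described above, assuming a modern JDK (7 or later, for StandardCharsets); the class and method names are made up for illustration. The Javadoc for String.getBytes(String charsetName) leaves the result unspecified when the string cannot be encoded, which would allow two JVMs to disagree here, whereas getBytes(Charset) is documented to always substitute the charset's replacement bytes. On a SUN-derived JVM the first line printed should be [108, 63, 69], matching the bytes quoted above, and the second should be false.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class UnpairedSurrogateEncoding {
    /** Encodes via String.getBytes(Charset), which is documented to replace malformed input. */
    static byte[] encodeWithReplacement(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Returns true only if the whole string can be encoded as UTF-8 without replacement. */
    static boolean isEncodableAsUtf8(String s) {
        return StandardCharsets.UTF_8.newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        // 'l', an unpaired high surrogate, 'E': the string from this thread
        String s = "\u006C\uD9FF\u0045";
        System.out.println(Arrays.toString(encodeWithReplacement(s)));
        System.out.println(isEncodableAsUtf8(s));
    }
}
```

A test that needs the same bytes on every JVM could check isEncodableAsUtf8 first and skip, or regenerate, strings that fail it.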

Interestingly, Googling for \uD9FF brings back LUCENE-2019 as the first result :). I'll dig some more into this character, and into why the IBM and SUN JVMs return different byte[] representations for the same sequence of characters. If you've already spotted the problem, please let me know.

BTW, the test calls _TestUtil.getRandomMultiplier on every loop iteration, which goes and checks a system property each time. Perhaps we can extract it to a variable, or include a static constant in LuceneTestCase(J4) or something?
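
A rough sketch of the static-constant idea, not the actual Lucene code: the class name, the property name "random.multiplier" and the default of 1 are all assumptions made for illustration. The point is only that the property is read once, when the class is initialized, and the loop bound is computed before the loop starts.

```java
class RandomMultiplierCache {
    // Hypothetical property name, used here only for illustration.
    static final String PROPERTY = "random.multiplier";

    // Read once at class-initialization time instead of on every loop iteration.
    static final int RANDOM_MULTIPLIER = Integer.getInteger(PROPERTY, 1);

    /** Scales a base iteration count by the cached multiplier. */
    static int iterations(int base) {
        return base * RANDOM_MULTIPLIER;
    }

    public static void main(String[] args) {
        int num = iterations(100); // computed once, outside the loop
        for (int i = 0; i < num; i++) {
            // per-iteration test body would go here
        }
        System.out.println(num);
    }
}
```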

Shai

On Mon, Jul 26, 2010 at 9:22 PM, Robert Muir <[hidden email]> wrote:
> maybe there is a bug in ibm's random generator :)




Re: TestUTF32ToUTF8.testRandomRegexes fails

Shai Erera
From here: http://www.fileformat.info/info/unicode/char/d9ff/index.htm

Looks like that character is not a valid Unicode character, and perhaps IBM's JVM behaves correctly? Robert - you're the Unicode expert :).

Shai




Re: TestUTF32ToUTF8.testRandomRegexes fails

Robert Muir
In reply to this post by Shai Erera
first of all, thanks for taking the time to do all of this debugging!

my guess is this might be related to https://issues.apache.org/jira/browse/LUCENE-2565

does it fail if you apply Mike's patch?

--
Robert Muir
[hidden email]

Re: TestUTF32ToUTF8.testRandomRegexes fails

Michael McCandless-2
In reply to this post by Shai Erera
Yeah, that char is an unpaired high surrogate, which is no good:
it's invalid.  Cool, though, that Google puts us first when you
search on this character :)

Can you figure out how that bad string was created?  That "if
(random.nextBoolean())" either creates the string randomly (which
should never return an unpaired surrogate), or calls
RandomAcceptedString.getRandomAcceptedString... maybe the bug is in
RAS.

Mike
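
To make "unpaired surrogate" concrete, here is a minimal sketch of a check that would flag a string like "l\uD9FFE" before it reaches the encoder. The class and method names (SurrogateCheck, firstUnpairedSurrogate) are made up for illustration and are not part of the Lucene test code.

```java
final class SurrogateCheck {
    /**
     * Returns the index of the first unpaired surrogate in s,
     * or -1 if every surrogate is part of a valid high/low pair.
     */
    static int firstUnpairedSurrogate(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                boolean paired = i + 1 < s.length()
                        && Character.isLowSurrogate(s.charAt(i + 1));
                if (!paired) {
                    return i; // high surrogate with no low surrogate after it
                }
                i++; // skip the low half of a valid pair
            } else if (Character.isLowSurrogate(c)) {
                return i; // low surrogate that no high surrogate introduced
            }
        }
        return -1;
    }
}
```

A test could call this on every generated string and fail fast with the offending index, which would point at the generator rather than at the JVM's encoder.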

On Mon, Jul 26, 2010 at 3:41 PM, Shai Erera <[hidden email]> wrote:

> From here: http://www.fileformat.info/info/unicode/char/d9ff/index.htm
>
> Looks like that character is not a valid Unicode character, and perhaps the
> IBM's JVM behaves correctly? Robert - you're the Unicode expert :).
>
> Shai
>
> On Mon, Jul 26, 2010 at 10:40 PM, Shai Erera <[hidden email]> wrote:
>>
>> I don't know what was the thing w/ the strings generated before, but now I
>> ran the test again w/ the same seed and it generates the same strings. So at
>> least it seems there are no problems w/ the Random class :).
>>
>> However, the string l.E fails w/ the IBM JVM and succeeds w/ SUN's. Any
>> ideas why? What does the test check anyway?
>>
>> I ran TRR2, and set the regexp to always be "l.E" and the test passes. The
>> failure comes from
>>
>> junit.framework.AssertionFailedError: expected:<true> but was:<false>
>>     at
>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:199)
>>     at
>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:171)
>>
>> I've set regexp to "l.E", and also 'string' inside assertAutomaton to
>> "\u006C\uD9FF\u0045". The byte[] returned from string.getBytes("UTF-8") are
>> [108, 69]. It just ignores the middle character. Perhaps that's why the test
>> fails?
>>
>> When I run this w/ SUN's JVM, the bytes returned are [108, 63, 69].
>>
>> If I manually set the bytes, using IBM's, to [108, 63, 69], then the test
>> passes.
>>
>> Interestingly, Googling for \uD9FF brings back LUCENE-2019 as the first
>> result :). I'll dig some more into this character, and why the IBM and SUN
>> JVMs return different byte[] representation for the same sequence of
>> characters. If you already spot the problem, please let me know.
>>
>> BTW, the test calls _TestUtil.getRandomMultiplier on every iteration loop,
>> which goes and checks a system property. Perhaps we can extract it to a
>> variable, or include a static constant in LuceneTestCase(J4) or something?
>>
>> Shai
>>

Re: TestUTF32ToUTF8.testRandomRegexes fails

Robert Muir
In reply to this post by Shai Erera
I think Sun's String.getBytes probably does CodingErrorAction.REPLACE (inserts 0x3F, the question mark char) and IBM's probably does CodingErrorAction.IGNORE (drops it).

I don't know who is right. Both suck in my opinion; I like CodingErrorAction.REPORT (throw an exception).
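
The three actions can be compared directly with a CharsetEncoder, which takes the JVM's default out of the picture. The sketch below is an illustration only: the class name EncodeUnpairedSurrogate and the helper encode are invented here, and the input is the "\u006C\uD9FF\u0045" string from Shai's report. As far as I know, the Javadoc for String.getBytes(String charsetName) leaves the behavior unspecified when the string cannot be encoded, which would explain why two JVMs are allowed to differ.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.util.Arrays;

final class EncodeUnpairedSurrogate {
    /** Encodes s as UTF-8, applying the given action to malformed input. */
    static byte[] encode(String s, CodingErrorAction action) {
        CharsetEncoder encoder = Charset.forName("UTF-8").newEncoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        try {
            ByteBuffer out = encoder.encode(CharBuffer.wrap(s));
            byte[] bytes = new byte[out.remaining()];
            out.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            // Only reachable with CodingErrorAction.REPORT.
            throw new IllegalArgumentException("cannot encode: " + e, e);
        }
    }

    public static void main(String[] args) {
        String s = "l\uD9FFE"; // 'l', unpaired high surrogate, 'E'
        // [108, 63, 69]: the surrogate becomes '?'
        System.out.println(Arrays.toString(encode(s, CodingErrorAction.REPLACE)));
        // [108, 69]: the surrogate is dropped
        System.out.println(Arrays.toString(encode(s, CodingErrorAction.IGNORE)));
        try {
            encode(s, CodingErrorAction.REPORT);
        } catch (IllegalArgumentException e) {
            System.out.println("REPORT refused the string: " + e.getMessage());
        }
    }
}
```

REPLACE reproduces the [108, 63, 69] seen on Sun's JVM and IGNORE reproduces the [108, 69] seen on IBM's, which is consistent with the guess above but does not prove what either String implementation does internally.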


Re: TestUTF32ToUTF8.testRandomRegexes fails

Michael McCandless-2
In reply to this post by Michael McCandless-2
OK, I think this is likely a bug in RAS, and we are just seeing the
difference in how Oracle's and IBM's JREs handle an unpaired
surrogate...

Lemme work out a patch...

Mike
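
For readers wondering what a fix of this kind might involve: a generator that picks a random code point from a range has to skip the surrogate block U+D800..U+DFFF, otherwise a wildcard like '.' can yield a lone surrogate. The following is a sketch of that idea only. It is not the LUCENE-2568 patch, and RandomCodePoints and randomCodePoint are hypothetical names.

```java
import java.util.Random;

final class RandomCodePoints {
    /**
     * Picks a code point uniformly from [min, max], never returning a
     * surrogate (U+D800..U+DFFF). Throws if the range holds nothing else.
     */
    static int randomCodePoint(Random random, int min, int max) {
        // Portion of the surrogate block that overlaps [min, max], if any.
        int lo = Math.max(min, Character.MIN_SURROGATE);
        int hi = Math.min(max, Character.MAX_SURROGATE);
        int excluded = hi >= lo ? hi - lo + 1 : 0;

        int size = (max - min + 1) - excluded;
        if (size <= 0) {
            throw new IllegalArgumentException("no valid code point in range");
        }

        int cp = min + random.nextInt(size);
        if (excluded > 0 && cp >= lo) {
            cp += excluded; // jump over the excluded surrogates
        }
        return cp;
    }
}
```

The result can be appended with StringBuilder.appendCodePoint, which emits a proper surrogate pair for supplementary code points, so the generated string stays valid UTF-16.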


Re: TestUTF32ToUTF8.testRandomRegexes fails

Michael McCandless-2
Shai, can you try the patch on LUCENE-2568?  Thanks.

Mike

On Mon, Jul 26, 2010 at 4:25 PM, Michael McCandless
<[hidden email]> wrote:

> OK I think likely this is a bug in RAS.  And we are just seeing the
> difference in how Oracle's & IBM's JREs handle an unpaired
> surrogate...
>
> Lemme work out a patch...
>
> Mike
>
> On Mon, Jul 26, 2010 at 4:13 PM, Michael McCandless
> <[hidden email]> wrote:
>> Yeah that char is a high surrogate which is unpaired, which is no good
>> -- it's invalid.  Cool, though, that Google puts us first when you
>> search on this character :)
>>
>> Can you figure out how that bad string was created?  That "if
>> (random.nextBoolean())" either creates the string randomly (which
>> should never return unpaired surrogate), or, calls
>> RandomAcceptedString.getRandomAcceptedString... maybe the bug is in
>> RAS.
>>
>> Mike
>>
>> On Mon, Jul 26, 2010 at 3:41 PM, Shai Erera <[hidden email]> wrote:
>>> From here: http://www.fileformat.info/info/unicode/char/d9ff/index.htm
>>>
>>> Looks like that character is not a valid Unicode character, and perhaps the
>>> IBM's JVM behaves correctly? Robert - you're the Unicode expert :).
>>>
>>> Shai
>>>
>>> On Mon, Jul 26, 2010 at 10:40 PM, Shai Erera <[hidden email]> wrote:
>>>>
>>>> I don't know what was the thing w/ the strings generated before, but now I
>>>> ran the test again w/ the same seed and it generates the same strings. So at
>>>> least it seems there are no problems w/ the Random class :).
>>>>
>>>> However, the string l.E fails w/ the IBM JVM and succeeds w/ SUN's. Any
>>>> ideas why? What does the test check anyway?
>>>>
>>>> I ran TRR2, and set the regexp to always be "l.E" and the test passes. The
>>>> failure comes from
>>>>
>>>> junit.framework.AssertionFailedError: expected:<true> but was:<false>
>>>>     at
>>>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:199)
>>>>     at
>>>> org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:171)
>>>>
>>>> I've set regexp to "l.E", and also 'string' inside assertAutomaton to
>>>> "\u006C\uD9FF\u0045". The byte[] returned from string.getBytes("UTF-8") are
>>>> [108, 69]. It just ignores the middle character. Perhaps that's why the test
>>>> fails?
>>>>
>>>> When I run this w/ SUN's JVM, the bytes returned are [108, 63, 69].
>>>>
>>>> If I manually set the bytes, using IBM's, to [108, 63, 69], then the test
>>>> passes.
>>>>
>>>> Interestingly, Googling for \uD9FF brings back LUCENE-2019 as the first
>>>> result :). I'll dig some more into this character, and why the IBM and SUN
>>>> JVMs return different byte[] representation for the same sequence of
>>>> characters. If you already spot the problem, please let me know.
>>>>
>>>> BTW, the test calls _TestUtil.getRandomMultiplier on every iteration loop,
>>>> which goes and checks a system property. Perhaps we can extract it to a
>>>> variable, or include a static constant in LuceneTestCase(J4) or something?
>>>>
>>>> Shai
>>>>
>>>> On Mon, Jul 26, 2010 at 9:22 PM, Robert Muir <[hidden email]> wrote:
>>>>>
>>>>> maybe there is a bug in ibm's random generator :)
>>>>>
>>>>> On Mon, Jul 26, 2010 at 11:50 AM, Michael McCandless
>>>>> <[hidden email]> wrote:
>>>>>>
>>>>>> That's VERY spooky that w/ a fixed seed you see different random
>>>>>> regexps being made.
>>>>>>
>>>>>> Mike
>>>>>>
>>>>>> On Mon, Jul 26, 2010 at 11:40 AM, Shai Erera <[hidden email]> wrote:
>>>>>> > Ok I've dug deeper into the test. I set the random seed to
>>>>>> > -9029631602016965389L in setUp(), and discovered that on the 4th
>>>>>> > iteration
>>>>>> > it breaks. For some reason though, AutomatonTestUtil.randomRegex
>>>>>> > generates
>>>>>> > different strings every time I run the test, even though it uses the
>>>>>> > same
>>>>>> > Random object w/ the same seed ...
>>>>>> >
>>>>>> > Anyway, one of the regex that failed was this "l.E" (w/o the quotes)
>>>>>> > and I
>>>>>> > think it's a lowercase L, '.' (dot) and 'E' (uppercase). Hope this
>>>>>> > helps.
>>>>>> >
>>>>>> > Shai
>>>>>> >
>>>>>> > On Mon, Jul 26, 2010 at 6:23 PM, Robert Muir <[hidden email]> wrote:
>>>>>> >>
>>>>>> >> sounds nasty... its good you are running the tests with this
>>>>>> >> different
>>>>>> >> jvm...
>>>>>> >>
>>>>>> >> On Mon, Jul 26, 2010 at 11:21 AM, Shai Erera <[hidden email]>
>>>>>> >> wrote:
>>>>>> >>>
>>>>>> >>> Tried to run it w/ SUN JRE6 and it succeeds ! I've tried several
>>>>>> >>> times
>>>>>> >>> and it succeeds every time. However, when I revert back to IBM's, it
>>>>>> >>> fail
>>>>>> >>> immediately.
>>>>>> >>>
>>>>>> >>> I can help w/ the debug, if you give me a hint where to look :).
>>>>>> >>>
>>>>>> >>> Shai
>>>>>> >>>
>>>>>> >>> On Mon, Jul 26, 2010 at 5:57 PM, Shai Erera <[hidden email]>
>>>>>> >>> wrote:
>>>>>> >>>>
>>>>>> >>>> Sorry for the delayed response.
>>>>>> >>>>
>>>>>> >>>> I ran it a couple more times, from Eclipse and Ant, and each time
>>>>>> >>>> it
>>>>>> >>>> fails (amazing !), w/ different seeds. More seeds that fail:
>>>>>> >>>> NOTE: random seed of testcase 'testRandomRegexes' was: -4244174191361080127
>>>>>> >>>> NOTE: random seed of testcase 'testRandomRegexes' was: -7059086272401721644
>>>>>> >>>> NOTE: random seed of testcase 'testRandomRegexes' was: -1314734215611104147
>>>>>> >>>>
>>>>>> >>>> I use IBM JVM, tried w/ both 1.5 and 1.6 ...
>>>>>> >>>>
>>>>>> >>>> Mike, can we use LUCENE-2565 to track this, or would you prefer
>>>>>> >>>> that I
>>>>>> >>>> open a separate one?
>>>>>> >>>>
>>>>>> >>>> Shai
>>>>>> >>>>
>>>>>> >>>> On Mon, Jul 26, 2010 at 3:26 PM, Michael McCandless
>>>>>> >>>> <[hidden email]> wrote:
>>>>>> >>>>>
>>>>>> >>>>> On a more general note...
>>>>>> >>>>>
>>>>>> >>>>> Any time any of you out there hit an "odd" test failure, please
>>>>>> >>>>> please
>>>>>> >>>>> please do just what Shai did: take it to the dev list!
>>>>>> >>>>>
>>>>>> >>>>> Think of Lucene's unit tests like SETI :)  We are desperately
>>>>>> >>>>> seeking
>>>>>> >>>>> bugs, and you and your machine may just be lucky enough to find
>>>>>> >>>>> one...
>>>>>> >>>>> go forth and buy expensive new power hungry computers just so you
>>>>>> >>>>> can
>>>>>> >>>>> run the random tests over and over, seeking the bugs!
>>>>>> >>>>>
>>>>>> >>>>> But be sure to include that random seed when you do hit a
>>>>>> >>>>> failure...
>>>>>> >>>>>
>>>>>> >>>>> Mike
>>>>>> >>>>>
>>>>>> >>>>> On Mon, Jul 26, 2010 at 8:23 AM, Robert Muir <[hidden email]>
>>>>>> >>>>> wrote:
>>>>>> >>>>> > I agree, Shai can you open a bug? I cannot reproduce, did you
>>>>>> >>>>> > use an
>>>>>> >>>>> > IBM JVM
>>>>>> >>>>> > or another environment that might help us figure it out?
>>>>>> >>>>> >
>>>>>> >>>>> > On Mon, Jul 26, 2010 at 6:29 AM, Michael McCandless
>>>>>> >>>>> > <[hidden email]> wrote:
>>>>>> >>>>> >>
>>>>>> >>>>> >> Hmmm this means a bug is lurking.  This is the power of random
>>>>>> >>>>> >> testing
>>>>>> >>>>> >> (that every time we all run tests, we're testing different
>>>>>> >>>>> >> "paths"
>>>>>> >>>>> >> through the code)....
>>>>>> >>>>> >>
>>>>>> >>>>> >> It seems exceptionally unlikely that LUCENE-2537's changes
>>>>>> >>>>> >> would
>>>>>> >>>>> >> cause
>>>>>> >>>>> >> this!
>>>>>> >>>>> >>
>>>>>> >>>>> >> But, unfortunately, when I plug that seed in I don't see it
>>>>>> >>>>> >> fail,
>>>>>> >>>>> >> which is odd.  I'll run a stress test to see if I can tickle
>>>>>> >>>>> >> the
>>>>>> >>>>> >> bug... can you open a Jira issue so we don't lose track?
>>>>>> >>>>> >>
>>>>>> >>>>> >> Mike
>>>>>> >>>>> >>
>>>>>> >>>>> >> On Mon, Jul 26, 2010 at 2:57 AM, Shai Erera <[hidden email]>
>>>>>> >>>>> >> wrote:
>>>>>> >>>>> >> > Hi
>>>>>> >>>>> >> >
>>>>>> >>>>> >> > I was running tests on trunk (after merging the changes from
>>>>>> >>>>> >> > LUCENE-2537)
>>>>>> >>>>> >> > and received this error message:
>>>>>> >>>>> >> >
>>>>>> >>>>> >> > expected:<true> but was:<false>
>>>>>> >>>>> >> >
>>>>>> >>>>> >> > junit.framework.AssertionFailedError: expected:<true> but was:<false>
>>>>>> >>>>> >> > at org.apache.lucene.util.automaton.TestUTF32ToUTF8.assertAutomaton(TestUTF32ToUTF8.java:197)
>>>>>> >>>>> >> > at org.apache.lucene.util.automaton.TestUTF32ToUTF8.testRandomRegexes(TestUTF32ToUTF8.java:170)
>>>>>> >>>>> >> > at org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:285)
>>>>>> >>>>> >> >
>>>>>> >>>>> >> > NOTE: random seed of testcase 'testRandomRegexes' was: 3510820306304573866
>>>>>> >>>>> >> >
>>>>>> >>>>> >> > I'm sure it's related to my changes. Has anyone else seen
>>>>>> >>>>> >> > this
>>>>>> >>>>> >> > before?
>>>>>> >>>>> >> >
>>>>>> >>>>> >> > Shai
>>>>>> >>>>> >> >
>>>>>> >>>>> >>
>>>>>> >>>>> >>
>>>>>> >>>>> >>
>>>>>> >>>>> >>
>>>>>> >>>>> >
>>>>>> >>>>> >
>>>>>> >>>>> >
>>>>>> >>>>> > --
>>>>>> >>>>> > Robert Muir
>>>>>> >>>>> > [hidden email]
>>>>>> >>>>> >
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>
>>>>>> >>>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >> --
>>>>>> >> Robert Muir
>>>>>> >> [hidden email]
>>>>>> >
>>>>>> >
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Robert Muir
>>>>> [hidden email]
>>>>
>>>
>>>
>>
>

