Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] InternalAutoDateHistogramTests::testReduceRandom fails reproducable for certain seed #39497

Closed
hendrikmuhs opened this issue Feb 28, 2019 · 11 comments · Fixed by #54602
Closed
Assignees
Labels

Comments

@hendrikmuhs
Copy link

The following test failed in a PR-build and is reproducible for me on my local box:

./gradlew :server:unitTest -Dtests.seed=24D81295C2796C6C -Dtests.class=org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests -Dtests.method="testReduceRandom" -Dtests.security.manager=true -Dtests.locale=el-CY -Dtests.timezone=Europe/Nicosia -Dcompiler.java=11 -Druntime.java=8

fails with:

  1> [2019-02-28T10:25:36,507][INFO ][o.e.s.a.b.h.InternalAutoDateHistogramTests] [testReduceRandom] before test
  1> [2019-02-28T10:25:36,723][INFO ][o.e.s.a.b.h.InternalAutoDateHistogramTests] [testReduceRandom] after test
  2> REPRODUCE WITH: ./gradlew :server:unitTest -Dtests.seed=24D81295C2796C6C -Dtests.class=org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests -Dtests.method="testReduceRandom" -Dtests.security.manager=true -Dtests.locale=el-CY -Dtests.timezone=Europe/Nicosia -Dcompiler.java=11 -Druntime.java=8
FAILURE 1.13s | InternalAutoDateHistogramTests.testReduceRandom <<< FAILURES!
   > Throwable #1: java.lang.AssertionError: expected:<{1548979200000=1700}> but was:<{1548979200000=1569, 1551398400000=131}>
   >    at __randomizedtesting.SeedInfo.seed([24D81295C2796C6C:E4E5BE84E424F53A]:0)
   >    at org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests.assertReduced(InternalAutoDateHistogramTests.java:200)
   >    at org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests.assertReduced(InternalAutoDateHistogramTests.java:50)
   >    at org.elasticsearch.test.InternalAggregationTestCase.testReduceRandom(InternalAggregationTestCase.java:294)
   >    at java.lang.Thread.run(Thread.java:748)
  2> NOTE: leaving temporary files on disk at: /home/hendrik/work/git-elastic/elasticsearch/server/build/testrun/unitTest/J0/temp/org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests_24D81295C2796C6C-001
  2> NOTE: test params are: codec=Asserting(Lucene80): {}, docValues:{}, maxPointsInLeafNode=1501, maxMBSortInHeap=7.215678116490036, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@5f67521f), locale=el-CY, timezone=Europe/Nicosia
  2> NOTE: Linux 4.18.0-15-generic amd64/Oracle Corporation 1.8.0_191 (64-bit)/cpus=4,threads=1,free=449796568,total=514850816
  2> NOTE: All tests run in this JVM: [InternalAutoDateHistogramTests]
Completed [1/1] in 2.83s, 1 test, 1 failure <<< FAILURES!


> Task :server:unitTest FAILED

Tested only on master but likely applies to other versions.

@hendrikmuhs hendrikmuhs added :Analytics/Aggregations Aggregations >test-failure Triaged test failures from CI v8.0.0 labels Feb 28, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo

@original-brownbear
Copy link
Member

Also fails for

./gradlew :server:unitTest -Dtests.seed=7DBCFCFC5EFB7BF8 -Dtests.class=org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests -Dtests.method="testReduceRandom" -Dtests.security.manager=true -Dtests.locale=fil-PH -Dtests.timezone=Asia/Krasnoyarsk -Dcompiler.java=11 -Druntime.java=11
➜  elasticsearch git:(master) ./gradlew :server:unitTest -Dtests.seed=7DBCFCFC5EFB7BF8 -Dtests.class=org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests -Dtests.method="testReduceRandom" -Dtests.security.manager=true -Dtests.locale=fil-PH -Dtests.timezone=Asia/Krasnoyarsk -Dcompiler.java=11 -Druntime.java=11

> Configure project :benchmarks
=======================================
Elasticsearch Build Hamster says Hello!
  Gradle Version        : 5.2.1
  OS Info               : Linux 4.18.0-15-generic (amd64)
  JDK Version           : 11 (Oracle Corporation 11.0.1 [OpenJDK 64-Bit Server VM 11.0.1+13-Ubuntu-3ubuntu3.18.10.1])
  JAVA_HOME             : /usr/lib/jvm/java-11-openjdk-amd64
  Random Testing Seed   : 7DBCFCFC5EFB7BF8
=======================================
Suite: org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests
  1> [2019-03-01T01:57:18,664][INFO ][o.e.s.a.b.h.InternalAutoDateHistogramTests] [testReduceRandom] before test
  1> [2019-03-01T01:57:18,762][INFO ][o.e.s.a.b.h.InternalAutoDateHistogramTests] [testReduceRandom] after test
  2> REPRODUCE WITH: ./gradlew :server:unitTest -Dtests.seed=7DBCFCFC5EFB7BF8 -Dtests.class=org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests -Dtests.method="testReduceRandom" -Dtests.security.manager=true -Dtests.locale=fil-PH -Dtests.timezone=Asia/Krasnoyarsk -Dcompiler.java=11 -Druntime.java=11
FAILURE 0.41s | InternalAutoDateHistogramTests.testReduceRandom <<< FAILURES!
   > Throwable #1: java.lang.AssertionError: expected:<{1548979200000=989}> but was:<{1548979200000=928, 1551398400000=61}>
   >    at __randomizedtesting.SeedInfo.seed([7DBCFCFC5EFB7BF8:BD8150ED78A6E2AE]:0)
   >    at org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests.assertReduced(InternalAutoDateHistogramTests.java:200)
   >    at org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests.assertReduced(InternalAutoDateHistogramTests.java:50)
   >    at org.elasticsearch.test.InternalAggregationTestCase.testReduceRandom(InternalAggregationTestCase.java:294)
   >    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   >    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   >    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   >    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
   >    at java.base/java.lang.Thread.run(Thread.java:834)
  2> NOTE: leaving temporary files on disk at: /home/brownbear/src/elasticsearch/server/build/testrun/unitTest/J0/temp/org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests_7DBCFCFC5EFB7BF8-001
  2> NOTE: test params are: codec=Asserting(Lucene80): {}, docValues:{}, maxPointsInLeafNode=1350, maxMBSortInHeap=7.054482784277555, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@73807316), locale=fil-PH, timezone=Asia/Krasnoyarsk
  2> NOTE: Linux 4.18.0-15-generic amd64/Oracle Corporation 11.0.1 (64-bit)/cpus=16,threads=1,free=502433536,total=536870912
  2> NOTE: All tests run in this JVM: [InternalAutoDateHistogramTests]
Completed [1/1] in 1.18s, 1 test, 1 failure <<< FAILURES!


> Task :server:unitTest FAILED
   [junit4] <JUnit4> says g'day! Master seed: 7DBCFCFC5EFB7BF8
==> Test Info: seed=7DBCFCFC5EFB7BF8; jvm=1; suite=1
Tests with failures:
  - org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests.testReduceRandom

   [junit4] JVM J0:     0.26 ..     1.76 =     1.51s
   [junit4] Execution time total: 1.77 sec.
   [junit4] Tests summary: 1 suite, 1 test, 1 failure

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':server:unitTest'.
> There were test failures: 1 suite, 1 test, 1 failure [seed: 7DBCFCFC5EFB7BF8]

➜ elasticsearch git:(master) ./gradlew :server:unitTest -Dtests.seed=7DBCFCFC5EFB7BF8 -Dtests.class=org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests -Dtests.method="testReduceRandom" -Dtests.security.manager=true -Dtests.locale=fil-PH -Dtests.timezone=Asia/Krasnoyarsk -Dcompiler.java=11 -Druntime.java=11

Configure project :benchmarks
=======================================
Elasticsearch Build Hamster says Hello!
Gradle Version : 5.2.1
OS Info : Linux 4.18.0-15-generic (amd64)
JDK Version : 11 (Oracle Corporation 11.0.1 [OpenJDK 64-Bit Server VM 11.0.1+13-Ubuntu-3ubuntu3.18.10.1])
JAVA_HOME : /usr/lib/jvm/java-11-openjdk-amd64
Random Testing Seed : 7DBCFCFC5EFB7BF8
=======================================
Suite: org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests
1> [2019-03-01T01:57:18,664][INFO ][o.e.s.a.b.h.InternalAutoDateHistogramTests] [testReduceRandom] before test
1> [2019-03-01T01:57:18,762][INFO ][o.e.s.a.b.h.InternalAutoDateHistogramTests] [testReduceRandom] after test
2> REPRODUCE WITH: ./gradlew :server:unitTest -Dtests.seed=7DBCFCFC5EFB7BF8 -Dtests.class=org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests -Dtests.method="testReduceRandom" -Dtests.security.manager=true -Dtests.locale=fil-PH -Dtests.timezone=Asia/Krasnoyarsk -Dcompiler.java=11 -Druntime.java=11
FAILURE 0.41s | InternalAutoDateHistogramTests.testReduceRandom <<< FAILURES!
Throwable #1: java.lang.AssertionError: expected:<{1548979200000=989}> but was:<{1548979200000=928, 1551398400000=61}>
at __randomizedtesting.SeedInfo.seed([7DBCFCFC5EFB7BF8:BD8150ED78A6E2AE]:0)
at org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests.assertReduced(InternalAutoDateHistogramTests.java:200)
at org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests.assertReduced(InternalAutoDateHistogramTests.java:50)
at org.elasticsearch.test.InternalAggregationTestCase.testReduceRandom(InternalAggregationTestCase.java:294)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at java.base/java.lang.Thread.run(Thread.java:834)
2> NOTE: leaving temporary files on disk at: /home/brownbear/src/elasticsearch/server/build/testrun/unitTest/J0/temp/org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests_7DBCFCFC5EFB7BF8-001
2> NOTE: test params are: codec=Asserting(Lucene80): {}, docValues:{}, maxPointsInLeafNode=1350, maxMBSortInHeap=7.054482784277555, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@73807316), locale=fil-PH, timezone=Asia/Krasnoyarsk
2> NOTE: Linux 4.18.0-15-generic amd64/Oracle Corporation 11.0.1 (64-bit)/cpus=16,threads=1,free=502433536,total=536870912
2> NOTE: All tests run in this JVM: [InternalAutoDateHistogramTests]
Completed [1/1] in 1.18s, 1 test, 1 failure <<< FAILURES!

Task :server:unitTest FAILED
[junit4] says g'day! Master seed: 7DBCFCFC5EFB7BF8
==> Test Info: seed=7DBCFCFC5EFB7BF8; jvm=1; suite=1
Tests with failures:

  • org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests.testReduceRandom

[junit4] JVM J0: 0.26 .. 1.76 = 1.51s
[junit4] Execution time total: 1.77 sec.
[junit4] Tests summary: 1 suite, 1 test, 1 failure

FAILURE: Build failed with an exception.

  • What went wrong:
    Execution failed for task ':server:unitTest'.

There were test failures: 1 suite, 1 test, 1 failure [seed: 7DBCFCFC5EFB7BF8]


@jkakavas
Copy link
Member

This failed in a PR too #38382. This should be muted

@not-napoleon
Copy link
Member

Found a case of this failing in 7.x as well: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.x+intake/360/

REPRODUCE WITH: ./gradlew :server:unitTest \
  -Dtests.seed=79976EC1854DEAAF \
  -Dtests.class=org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests \
  -Dtests.method="testReduceRandom" \
  -Dtests.security.manager=true \
  -Dtests.locale=th \
  -Dtests.timezone=America/Winnipeg \
  -Dcompiler.java=11 \
  -Druntime.java=8

I'm going to backport the mute to the 7.x branch as well, and add the tag to this ticket.

@pcsanwald
Copy link
Contributor

Thanks for the mute, I'll take a look at this one.

@pcsanwald pcsanwald self-assigned this Mar 4, 2019
@polyfractal polyfractal removed the v7.0.0 label Apr 9, 2019
@jakelandis jakelandis added v7.3.0 and removed v7.2.0 labels Jun 17, 2019
@pcsanwald pcsanwald self-assigned this Jun 21, 2019
@pcsanwald
Copy link
Contributor

@polyfractal I've just been in this code, I'll have a look at this one

@pcsanwald
Copy link
Contributor

I've been attempting to repro this issue locally with no luck, by un-muting testReduceRandom and using essentially a nested loop of test iterations via a bash while loop and -Dtests.iters, a trick I learned from @DaveCTurner :

date | tee pass-times.log; while ./gradlew :server:test -Dtests.seed=7DBCFCFC5EFB7BF8 -Dtests.class=org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests -Dtests.method="testReduceRandom" -Dtests.security.manager=true -Dtests.locale=fil-PH -Dtests.timezone=Asia/Krasnoyarsk -Dtests.iters=1000 2>&1 >> testoutput.log; do date | tee -a pass-times.log; done

date | tee pass-times.log; while ./gradlew :server:test -Dtests.seed=24D81295C2796C6C -Dtests.class=org.elasticsearch.search.aggregations.bucket.histogram.InternalAutoDateHistogramTests -Dtests.method="testReduceRandom" -Dtests.security.manager=true -Dtests.locale=el-CY -Dtests.timezone=Europe/Nicosia -Dtests.iters=1000 2>&1 >> testoutput.log; do date | tee -a pass-times.log; done

Neither of these cases, reported by @hendrikmuhs and @original-brownbear, repro for me after letting the above run for a long time.

So, given that brute force reproducing has not worked, we must fall back to... logic and reasoning.

Both these failures are in this kinda scenario: expected:<{1548979200000=989}> but was:<{1548979200000=928, 1551398400000=61} where we are expecting a single bucket, but, there are in fact multiple buckets. In the past, this has been a bug in the rounding interval selected (actually a bug in how the test calculates expected results), so this is the avenue I'll explore.

@pcsanwald
Copy link
Contributor

Closing, as I believe the fix in #44461 covers the cases highlighted in this. of course we will reopen this if my assertion is incorrect.

@polyfractal polyfractal self-assigned this Aug 12, 2019
@pcsanwald
Copy link
Contributor

The failure that occurred is of a different variety than earlier failures, the earlier failures were of the form "we expected one bucket but we got two", the latest failure is due to an incorrect innerInterval calculation:

expected:<{1561939200000=1986, 1564531200000=52}> but was:<{1561939200000=1986, 1564617600000=52}>

Definitely worth digging into, just wanted to mention that this latest error is a bit different nature than earlier errors, which were probably due to the bug we found.

@colings86 colings86 added v7.5.0 and removed v7.4.0 labels Aug 30, 2019
@jimczi jimczi added v7.6.0 and removed v7.5.0 labels Nov 12, 2019
@polyfractal polyfractal added v7.7.0 and removed v7.6.0 labels Jan 15, 2020
@bpintea bpintea added v7.8.0 and removed v7.7.0 labels Mar 25, 2020
nik9000 added a commit to nik9000/elasticsearch that referenced this issue Apr 1, 2020
The test had errors around time units that have different length - think
leap years or months that aren't 30 days. This fixes those errors. In
the proces I've changed a bunch of things to debug the problem:

* Replace `currentTimeMillis` with a random time. Now the test fails
  randomly! Wonderful. Much better than on random days of the month.
* Generate buckets "closer together" to test random reduction. Without
  this we were super frequently getting stuck in the "year of century"
  rounding because *some* of the of the buckets we built were far apart.
  This generates a much greater variety of tests.
* Implement `toString` on `RoundingInfo` so I can debug without going
  crazy.
* Switch keys in the bucket assertions from epoch millis to `Instant`s
  so we can read the failures.

Closes elastic#54540
Closes elastic#39497
nik9000 added a commit that referenced this issue Apr 2, 2020
The test had errors around time units that have different length - think
leap years or months that aren't 30 days. This fixes those errors. In
the proces I've changed a bunch of things to debug the problem:

* Replace `currentTimeMillis` with a random time. Now the test fails
  randomly! Wonderful. Much better than on random days of the month.
* Generate buckets "closer together" to test random reduction. Without
  this we were super frequently getting stuck in the "year of century"
  rounding because *some* of the of the buckets we built were far apart.
  This generates a much greater variety of tests.
* Implement `toString` on `RoundingInfo` so I can debug without going
  crazy.
* Switch keys in the bucket assertions from epoch millis to `Instant`s
  so we can read the failures.

Closes #54540
Closes #39497
nik9000 added a commit to nik9000/elasticsearch that referenced this issue Apr 2, 2020
The test had errors around time units that have different length - think
leap years or months that aren't 30 days. This fixes those errors. In
the proces I've changed a bunch of things to debug the problem:

* Replace `currentTimeMillis` with a random time. Now the test fails
  randomly! Wonderful. Much better than on random days of the month.
* Generate buckets "closer together" to test random reduction. Without
  this we were super frequently getting stuck in the "year of century"
  rounding because *some* of the of the buckets we built were far apart.
  This generates a much greater variety of tests.
* Implement `toString` on `RoundingInfo` so I can debug without going
  crazy.
* Switch keys in the bucket assertions from epoch millis to `Instant`s
  so we can read the failures.

Closes elastic#54540
Closes elastic#39497
nik9000 added a commit to nik9000/elasticsearch that referenced this issue Apr 2, 2020
The test had errors around time units that have different length - think
leap years or months that aren't 30 days. This fixes those errors. In
the proces I've changed a bunch of things to debug the problem:

* Replace `currentTimeMillis` with a random time. Now the test fails
  randomly! Wonderful. Much better than on random days of the month.
* Generate buckets "closer together" to test random reduction. Without
  this we were super frequently getting stuck in the "year of century"
  rounding because *some* of the of the buckets we built were far apart.
  This generates a much greater variety of tests.
* Implement `toString` on `RoundingInfo` so I can debug without going
  crazy.
* Switch keys in the bucket assertions from epoch millis to `Instant`s
  so we can read the failures.

Closes elastic#54540
Closes elastic#39497
nik9000 added a commit that referenced this issue Apr 3, 2020
The test had errors around time units that have different length - think
leap years or months that aren't 30 days. This fixes those errors. In
the proces I've changed a bunch of things to debug the problem:

* Replace `currentTimeMillis` with a random time. Now the test fails
  randomly! Wonderful. Much better than on random days of the month.
* Generate buckets "closer together" to test random reduction. Without
  this we were super frequently getting stuck in the "year of century"
  rounding because *some* of the of the buckets we built were far apart.
  This generates a much greater variety of tests.
* Implement `toString` on `RoundingInfo` so I can debug without going
  crazy.
* Switch keys in the bucket assertions from epoch millis to `Instant`s
  so we can read the failures.

Closes #54540
Closes #39497
nik9000 added a commit that referenced this issue Apr 3, 2020
The test had errors around time units that have different length - think
leap years or months that aren't 30 days. This fixes those errors. In
the proces I've changed a bunch of things to debug the problem:

* Replace `currentTimeMillis` with a random time. Now the test fails
  randomly! Wonderful. Much better than on random days of the month.
* Generate buckets "closer together" to test random reduction. Without
  this we were super frequently getting stuck in the "year of century"
  rounding because *some* of the of the buckets we built were far apart.
  This generates a much greater variety of tests.
* Implement `toString` on `RoundingInfo` so I can debug without going
  crazy.
* Switch keys in the bucket assertions from epoch millis to `Instant`s
  so we can read the failures.

Closes #54540
Closes #39497
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.