[CI] DedicatedClusterSnapshotRestoreIT#testMasterShutdownDuringFailedSnapshot Captured an uncaught exception ClosedByInterruptException #25062

Closed
polyfractal opened this issue Jun 5, 2017 · 6 comments
Labels
>test Issues or PRs that are addressing/adding tests

Comments

@polyfractal
Contributor

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+java9-periodic/2759/console

@abeyad Potentially related to #24452? Unsure, since I couldn't see the failure messages in that thread.

Couldn't reproduce locally*

 gradle :core:integTest -Dtests.seed=263187C96437C1C1 -Dtests.class=org.elasticsearch.snapshots.DedicatedClusterSnapshotRestoreIT -Dtests.method="testMasterShutdownDuringFailedSnapshot" -Dtests.security.manager=true -Dtests.jvm.argline="--add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.nio.file=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED --add-opens=java.base/java.util.regex=ALL-UNNAMED" -Dtests.locale=gsw-CH -Dtests.timezone=Europe/Chisinau
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=1023, name=elasticsearch[node_tm1][snapshot][T#1], state=RUNNABLE, group=TGRP-DedicatedClusterSnapshotRestoreIT]
	at __randomizedtesting.SeedInfo.seed([263187C96437C1C1:480D62689B8DC637]:0)
Caused by: java.lang.AssertionError: On Linux and MacOSX fsyncing a directory should not throw IOException, we just don't want to rely on that in production (undocumented). Got: java.nio.channels.ClosedByInterruptException
	at __randomizedtesting.SeedInfo.seed([263187C96437C1C1]:0)
	at org.apache.lucene.util.IOUtils.fsync(IOUtils.java:423)
	at org.elasticsearch.common.blobstore.fs.FsBlobContainer.move(FsBlobContainer.java:145)
	at org.elasticsearch.snapshots.mockstore.BlobContainerWrapper.move(BlobContainerWrapper.java:76)
	at org.elasticsearch.snapshots.mockstore.MockRepository$MockBlobStore$MockBlobContainer.move(MockRepository.java:327)
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.writeAtomic(BlobStoreRepository.java:950)
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.writeIndexGen(BlobStoreRepository.java:839)
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.finalizeSnapshot(BlobStoreRepository.java:563)
	at org.elasticsearch.snapshots.SnapshotsService$5.run(SnapshotsService.java:950)
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1161)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:844)

*It wouldn't reproduce because the --add-opens=java.base/java.lang=ALL-UNNAMED option seems to break the JVM on my laptop

Unrecognized option: --add-opens=java.base/java.lang=ALL-UNNAMED
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

I think that is unrelated to the CI failure though.
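For context on the assertion in the trace above: Lucene's IOUtils.fsync opens a FileChannel and calls force(true) on it, and a FileChannel operation entered on a thread whose interrupt flag is already set closes the channel and throws ClosedByInterruptException. The snippet below is only a minimal sketch of that JDK behaviour, not the Lucene or Elasticsearch code. The class and method names are made up for illustration, and it uses a regular temp file instead of a directory so that it runs on any platform. Whether an interrupt during master shutdown is what actually hit the snapshot thread in this CI run is an assumption; nothing in this thread confirms it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of Lucene or Elasticsearch.
public class InterruptedFsyncDemo {

    /**
     * Returns true if FileChannel.force() fails with ClosedByInterruptException
     * when the calling thread is already interrupted.
     */
    public static boolean forceFailsWhenInterrupted() {
        Path file = null;
        try {
            file = Files.createTempFile("fsync-demo", ".tmp");
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE)) {
                // Simulate a thread that was interrupted before it reached the fsync.
                Thread.currentThread().interrupt();
                channel.force(true);
                return false; // force() completed normally
            } catch (ClosedByInterruptException e) {
                return true; // the channel was closed because of the pending interrupt
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Clear the interrupt flag so the caller is not affected, then clean up.
            Thread.interrupted();
            if (file != null) {
                try {
                    Files.deleteIfExists(file);
                } catch (IOException ignored) {
                    // best-effort cleanup of the temp file
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("force() failed with ClosedByInterruptException: "
                + forceFailsWhenInterrupted());
    }
}
```

Since ClosedByInterruptException is an IOException, an fsync wrapper that asserts "no IOException on directories" would trip on it even though the filesystem itself did nothing wrong.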

@polyfractal polyfractal added the >test Issues or PRs that are addressing/adding tests label Jun 5, 2017
@jasontedor
Member

> It wouldn't reproduce because the --add-opens=java.base/java.lang=ALL-UNNAMED option seems to break the JVM on my laptop

That's because this is a JDK 9 build and those are JDK 9 flags; I suspect that you were not using JDK 9.
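To illustrate the point: --add-opens is one of the module-system launcher options that arrived with JDK 9, so an older launcher rejects it as an unrecognized option. A rough sketch of how one might check the running JVM before adding such flags to a reproduce line is below. The class and method names are hypothetical, and it assumes the java.specification.version property has the usual form ("1.8" on JDK 8, "9", "11", ... on later releases).

```java
// Hypothetical helper, not part of the Elasticsearch build.
public class AddOpensSupport {

    /** Extracts the major Java version from a java.specification.version value. */
    public static int majorVersion(String specVersion) {
        // JDK 8 and earlier report "1.x"; JDK 9 and later report "9", "10", "11", ...
        String v = specVersion.startsWith("1.") ? specVersion.substring(2) : specVersion;
        int dot = v.indexOf('.');
        if (dot >= 0) {
            v = v.substring(0, dot);
        }
        return Integer.parseInt(v);
    }

    /** The --add-opens launcher option is only understood by JDK 9 and later. */
    public static boolean supportsAddOpens(String specVersion) {
        return majorVersion(specVersion) >= 9;
    }

    public static void main(String[] args) {
        String running = System.getProperty("java.specification.version");
        System.out.println("Java " + running + " supports --add-opens: "
                + supportsAddOpens(running));
    }
}
```

On a JVM where this reports false, the -Dtests.jvm.argline part of the reproduce line would need to be dropped, or the test run on JDK 9.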

@polyfractal
Contributor Author

Yep, that would most certainly be the case. Oops :)

@jimczi
Contributor

jimczi commented Jun 8, 2017

A reproducible failure on master:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+periodic/3106/console

gradle :core:integTest -Dtests.seed=E174387F042AEDA5 -Dtests.class=org.elasticsearch.snapshots.DedicatedClusterSnapshotRestoreIT -Dtests.method="testMasterShutdownDuringFailedSnapshot" -Dtests.security.manager=true -Dtests.locale=cs -Dtests.timezone=Antarctica/South_Pole

This is not related to Java 9. Other similar failures:

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=ubuntu/1062

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-unix-compatibility/os=debian/837

@jaymode
Member

jaymode commented Jun 12, 2017

Another reproducible failure on 5.x (SLES):
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-unix-compatibility/os=sles/864/console

com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=4477, name=elasticsearch[node_tm1][snapshot][T#1], state=RUNNABLE, group=TGRP-DedicatedClusterSnapshotRestoreIT]
	at __randomizedtesting.SeedInfo.seed([FF459C1B16690789:917979BAE9D3007F]:0)
Caused by: java.lang.AssertionError: On Linux and MacOSX fsyncing a directory should not throw IOException, we just don't want to rely on that in production (undocumented). Got: java.nio.channels.ClosedByInterruptException
	at __randomizedtesting.SeedInfo.seed([FF459C1B16690789]:0)
	at org.apache.lucene.util.IOUtils.fsync(IOUtils.java:475)
	at org.elasticsearch.common.blobstore.fs.FsBlobContainer.move(FsBlobContainer.java:145)
	at org.elasticsearch.snapshots.mockstore.BlobContainerWrapper.move(BlobContainerWrapper.java:76)
	at org.elasticsearch.snapshots.mockstore.MockRepository$MockBlobStore$MockBlobContainer.move(MockRepository.java:327)
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.writeAtomic(BlobStoreRepository.java:950)
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.writeIndexGen(BlobStoreRepository.java:839)
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.finalizeSnapshot(BlobStoreRepository.java:563)
	at org.elasticsearch.snapshots.SnapshotsService$5.run(SnapshotsService.java:950)
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

Reproduce line:

gradle :core:integTest -Dtests.seed=FF459C1B16690789 -Dtests.class=org.elasticsearch.snapshots.DedicatedClusterSnapshotRestoreIT -Dtests.method="testMasterShutdownDuringFailedSnapshot" -Dtests.security.manager=true -Dtests.locale=ar-LY -Dtests.timezone=Pacific/Enderbury

@jimczi
Contributor

jimczi commented Jun 16, 2017

Another one, still reproducible. @abeyad, can you confirm that this is related to #24452?

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-unix-compatibility/os=centos/881

gradle :core:integTest -Dtests.seed=F4B1E0D9FCFB6CD -Dtests.class=org.elasticsearch.snapshots.DedicatedClusterSnapshotRestoreIT -Dtests.method="testMasterShutdownDuringFailedSnapshot" -Dtests.security.manager=true -Dtests.locale=en-SG -Dtests.timezone=Pacific/Truk

@abeyad

abeyad commented Jun 16, 2017

@imotov and I debugged this issue. There are a couple of things going on here, for which I opened a new issue, #25281, and I silenced the test in 0c69734. I'm closing this issue since the problem has been identified and the new issue is open; the AwaitsFix will be removed from the test once #25281 is resolved.

@abeyad abeyad closed this as completed Jun 16, 2017