[CI] GoogleCloudStorageBlobRepositoryTests Failures (and Slowness) #46772

@original-brownbear

Build Scan: https://gradle-enterprise.elastic.co/s/ua5pqoxpzjces

This is a bit of a strange failure. It seems that the mocked GCS simulated a failure on a bulk delete action during snapshot delete, but the request was not retried as expected (or the failure simulation misfired somehow).

com.google.cloud.storage.StorageException: Error writing request body to server
at __randomizedtesting.SeedInfo.seed([ED3C81C2EDCEF7BA:26D30512855F2828]:0)
at com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:227)
at com.google.cloud.storage.spi.v1.HttpStorageRpc.access$300(HttpStorageRpc.java:86)
at com.google.cloud.storage.spi.v1.HttpStorageRpc$DefaultRpcBatch.submit(HttpStorageRpc.java:203)
at com.google.cloud.storage.StorageBatch.submit(StorageBatch.java:149)
at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.lambda$deleteBlobsIgnoringIfNotExists$12(GoogleCloudStorageBlobStore.java:358)
at java.security.AccessController.doPrivileged(Native Method)
at org.elasticsearch.repositories.gcs.SocketAccess.doPrivilegedIOException(SocketAccess.java:44)
at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.deleteBlobsIgnoringIfNotExists(GoogleCloudStorageBlobStore.java:337)
at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobContainer.deleteBlobsIgnoringIfNotExists(GoogleCloudStorageBlobContainer.java:87)
at org.elasticsearch.repositories.blobstore.BlobStoreRepository.deleteSnapshot(BlobStoreRepository.java:409)
at org.elasticsearch.snapshots.SnapshotsService.lambda$deleteSnapshotFromRepository$12(SnapshotsService.java:1344)
at org.elasticsearch.action.ActionRunnable$1.doRun(ActionRunnable.java:45)
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:769)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.lang.Thread.run(Thread.java:834)
Caused by: java.io.IOException: Error writing request body to server

More importantly, though, this should not have failed the snapshot delete call (and subsequently the whole test); it should only have led to a WARN log.
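To make the expected behavior concrete, here is a minimal sketch of a cleanup delete that records a warning on failure instead of propagating the exception. It is an illustration only: `BestEffortDelete`, `BlobDeleter`, and `WARNINGS` are hypothetical names and not the actual Elasticsearch or GCS client API, and a plain list stands in for the logger.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class BestEffortDelete {

    /** Hypothetical stand-in for a blob container; NOT the Elasticsearch BlobContainer interface. */
    public interface BlobDeleter {
        void deleteBlobs(List<String> names) throws IOException;
    }

    /** Collects warning messages so the sketch needs no logging framework (assumption for illustration). */
    public static final List<String> WARNINGS = new ArrayList<>();

    /**
     * Attempts the bulk delete. Returns true on success. On failure the
     * exception is recorded as a warning and false is returned, so the
     * caller (e.g. a snapshot delete) is not failed by a cleanup error.
     */
    public static boolean deleteBestEffort(BlobDeleter deleter, List<String> names) {
        try {
            deleter.deleteBlobs(names);
            return true;
        } catch (IOException | RuntimeException e) {
            // Unchecked exceptions are caught too, since a client library may wrap the IOException.
            WARNINGS.add("WARN failed to delete " + names.size() + " blobs: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulate the failure seen in the stack trace above.
        BlobDeleter failing = names -> {
            throw new IOException("Error writing request body to server");
        };
        boolean deleted = deleteBestEffort(failing, List.of("blob-a", "blob-b"));
        System.out.println("deleted=" + deleted + ", warnings=" + WARNINGS);
    }
}
```

With this shape, a simulated bulk delete failure would show up in the log without failing the surrounding delete call.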

Also, running this test (suite) with the seed that failed here:

./gradlew ':plugins:repository-gcs:test' --tests "org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStoreRepositoryTests.testIndicesDeletedFromRepository" -Dtests.seed=ED3C81C2EDCEF7BA -Dtests.security.manager=true -Dtests.locale=vun-TZ -Dtests.timezone=America/Chicago -Dcompiler.java=12 -Druntime.java=11

is very slow. Each test takes about a minute on very fast hardware, which makes me wonder whether there's some timeout interaction here.

I'll investigate this a little as well to make sure nothing evil snuck into the snapshot state machine that now fails the deletes, but as discussed, I'm assigning this to you @tlrx since you're already working on the GCS mock.
