SLM metadata records enormous string describing last failure #71325
Labels
>bug
:Data Management/ILM+SLM (Index and Snapshot lifecycle management)
Team:Data Management (Meta label for data/management team)
Elasticsearch version (bin/elasticsearch --version): Reported in 7.11.2 but still in master.
Plugins installed: Cloud
JVM version (java -version): N/A
OS version (uname -a if on a Unix-like system): Cloud
Description of the problem including expected versus actual behavior:
A user reported a multi-megabyte response to the Get snapshot lifecycle policy API, which they found surprising. The bulk of the content reported every single shard failure in great detail, complete with stack trace. These shard failures are all included as suppressed exceptions here ...
elasticsearch/x-pack/plugin/ilm/src/main/java/org/elasticsearch/xpack/slm/SnapshotLifecycleTask.java
Line 115 in a92a647
... and converted to a string here:
elasticsearch/x-pack/plugin/ilm/src/main/java/org/elasticsearch/xpack/slm/SnapshotLifecycleTask.java
Line 221 in a92a647
I'm not sure all this detail is useful, and it certainly seems bad to keep something so large in the cluster state. Can we trim this down somehow?
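One possible direction, as a rough sketch only: record a short summary of the failure (type, message, and a count of suppressed exceptions) and cap its length before it is stored, rather than serializing the full stack trace with every suppressed shard failure. The class name `FailureDetailsTrimmer`, the methods `summarize` and `truncate`, and the 1024-character cap are all hypothetical and are not part of the Elasticsearch codebase. The sketch uses only the JDK, so it does not show how this would be wired into `SnapshotLifecycleTask`.

```java
// Hypothetical helper, not Elasticsearch code: builds a bounded summary of a
// failure instead of a full stack trace that includes every suppressed exception.
final class FailureDetailsTrimmer {

    // Assumed cap for illustration; the right limit is a design decision.
    static final int MAX_DETAILS_CHARS = 1024;

    private static final String TRUNCATION_MARKER = "... [truncated]";

    // Summarizes the top-level failure and mentions only the first suppressed
    // exception, plus a count of how many suppressed exceptions there were.
    static String summarize(Throwable failure, int maxChars) {
        StringBuilder sb = new StringBuilder();
        sb.append(failure.getClass().getSimpleName())
          .append(": ")
          .append(failure.getMessage());

        Throwable[] suppressed = failure.getSuppressed();
        if (suppressed.length > 0) {
            Throwable first = suppressed[0];
            sb.append(" (")
              .append(suppressed.length)
              .append(" suppressed exceptions omitted; first: ")
              .append(first.getClass().getSimpleName())
              .append(": ")
              .append(first.getMessage())
              .append(")");
        }
        return truncate(sb.toString(), maxChars);
    }

    // Cuts the string to at most maxChars characters, ending with a marker
    // when there is room for one.
    static String truncate(String s, int maxChars) {
        if (s.length() <= maxChars) {
            return s;
        }
        if (maxChars <= TRUNCATION_MARKER.length()) {
            return s.substring(0, maxChars);
        }
        return s.substring(0, maxChars - TRUNCATION_MARKER.length()) + TRUNCATION_MARKER;
    }

    public static void main(String[] args) {
        // Simulate a snapshot failure carrying one suppressed exception per failed shard.
        RuntimeException failure = new RuntimeException("failed to create snapshot successfully");
        for (int i = 0; i < 800; i++) {
            failure.addSuppressed(new RuntimeException("shard " + i + " failed"));
        }
        String details = summarize(failure, MAX_DETAILS_CHARS);
        System.out.println(details);
        System.out.println("length = " + details.length());
    }
}
```

With something like this, the size of the stored string would no longer grow with the number of failed shards. The full per-shard detail could still go to the server log, if it is useful there.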
Provide logs (if relevant):
"details": "{\"type\":\"snapshot_exception\",\"reason\":\"[found-snapshots:cloud-snapshot-2021.04.01-REDACTED] failed to create snapshot successfully, REDACTED(>800) out of REDACTED(>800) total shards failed\",\"stack_trace\":\"SnapshotException[[found-snapshots:cloud-snapshot-2021.04.01-REDACTED] failed to create snapshot successfully, REDACTED(>800) out of REDACTED(>800) total shards failed]\\n\\tat org.elasticsearch.xpack.slm.SnapshotLifecycleTask$1.onResponse(SnapshotLifecycleTask.java:111)\\n\\tat org.elasticsearch.xpack.slm.SnapshotLifecycleTask$1.onResponse(SnapshotLifecycleTask.java:93)\\n\\tat org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:32)\\n\\tat org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:83)\\n\\tat org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:77)\\n\\tat org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:32)\\n\\tat org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:143)\\n\\tat org.elasticsearch.action.ActionListener$MappedActionListener.onResponse(ActionListener.java:76)\\n\\tat org.elasticsearch.action.ActionListener.onResponse(ActionListener.java:216)\\n\\tat org.elasticsearch.snapshots.SnapshotsService.completeListenersIgnoringException(SnapshotsService.java:2681)\\n\\tat org.elasticsearch.snapshots.SnapshotsService.lambda$finalizeSnapshotEntry$34(SnapshotsService.java:1577)\\n\\tat org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:117)\\n\\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository.lambda$finalizeSnapshot$37(BlobStoreRepository.java:1130)\\n\\tat org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:117)\\n\\tat org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47)\\n\\tat 
org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)\\n\\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:732)\\n\\tat org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)\\n\\tat java.base/java.lang.Thread.run(Thread.java:832)\\n\\tSuppressed: [REDACTED/REDACTED][[REDACTED][0]] IndexShardSnapshotFailedException[NoSuchFileException[Blob object [snapshots/REDACTED/indices/REDACTED/0/index-REDACTED] not found: 404 Not Found\\nGET https://storage.googleapis.com/download/storage/v1/b/REDACTED/o/snapshots%REDACTED%2Findices%REDACTED%2F0%2Findex-REDACTED?alt=media\\nNo such object: REDACTED/snapshots/REDACTED/indices/REDACTED/0/index-REDACTED]]\\n\\t\\tat org.elasticsearch.snapshots.SnapshotShardFailure.<init>(SnapshotShardFailure.java:66)\\n\\t\\tat org.elasticsearch.snapshots.SnapshotShardFailure.<init>(SnapshotShardFailure.java:54)\\n\\t\\tat org.elasticsearch.snapshots.SnapshotsService.finalizeSnapshotEntry(SnapshotsService.java:1524)\\n\\t\\tat org.elasticsearch.snapshots.SnapshotsService.access$2100(SnapshotsService.java:115)\\n\\t\\tat org.elasticsearch.snapshots.SnapshotsService$7.onResponse(SnapshotsService.java:1472)\\n\\t\\tat org.elasticsearch.snapshots.SnapshotsService$7.onResponse(SnapshotsService.java:1469)\\n\\t\\tat org.elasticsearch.repositories.blobstore.BlobStoreRepository.doGetRepositoryData(BlobStoreRepository.java:1463)\\n\\t\\t... 6 more\\n\\tSuppressed: [REDACTED/REDACTED][[REDACTED][0]] IndexShardSnapshotFailedException[NoSuchFileException... [many MBs of the same snipped]