Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Data corruption on an actively written primary shard can lead to failing all replicas. #803

Closed
itiyamas opened this issue May 30, 2021 · 14 comments · Fixed by #4133
Closed
Assignees
Labels
blocked Identifies issues that are blocked bug Something isn't working distributed framework Priority-High

Comments

@itiyamas
Copy link
Contributor

itiyamas commented May 30, 2021

Describe the bug
When data corruption is detected on primary, the engine fails. This in turn results in failing primary shard and closing it. Closing the shard also causes closing and cancelling any ongoing replication operations on primary. One such operation is shard bulk action on replica. Cancelling this operation in turn checks the reason for failure and because the reason is shard unavailability, the primary marks the replica as stale.

Both failing the primary and failing the replica are sent as shard failure actions on master, which in turn serializes them locally only after they are queued on state update task queue.

As a result, a corruption on primary shard can result in corruption of replica shards in certain cases, especially on high throughput writes.

To Reproduce
Steps to reproduce the behavior:
Please check this commit for the test. You may need to run it 2-3 times to reproduce the behavior.

Expected behavior
Corruption on primary should not result in replica being lost and cluster turning red. Instead, replica should have taken over as primary and then a new replica should have been assigned- turning the cluster green after some time.

@itiyamas itiyamas added bug Something isn't working untriaged Beta labels May 30, 2021
@itiyamas
Copy link
Contributor Author

Relevant output of the test:

/Users/itiyamas/jdk-14.0.1.jdk/Contents/Home/bin/java -ea -Djava.locale.providers=SPI,COMPAT -Didea.test.cyclic.buffer.size=1048576 -javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=64119:/Applications/IntelliJ IDEA CE.app/Contents/bin -Dfile.encoding=UTF-8 -classpath /Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar:/Applications/IntelliJ IDEA CE.app/Contents/plugins/junit/lib/junit5-rt.jar:/Applications/IntelliJ IDEA CE.app/Contents/plugins/junit/lib/junit-rt.jar:/Users/itiyamas/OpenSearch/server/out/test/classes:/Users/itiyamas/OpenSearch/server/out/test/resources:/Users/itiyamas/OpenSearch/server/out/production/classes:/Users/itiyamas/OpenSearch/server/out/production/resources:/Users/itiyamas/OpenSearch/libs/x-content/out/production/classes:/Users/itiyamas/OpenSearch/test/framework/out/production/classes:/Users/itiyamas/OpenSearch/test/framework/out/production/resources:/Users/itiyamas/OpenSearch/libs/cli/out/production/classes:/Users/itiyamas/OpenSearch/libs/core/out/production/classes:/Users/itiyamas/OpenSearch/libs/secure-sm/out/production/classes:/Users/itiyamas/OpenSearch/libs/geo/out/production/classes:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-core/8.8.2/e039fa90ee900c475bbc0943989ed5dccc556d3e/lucene-core-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-analyzers-common/8.8.2/fa4b3c1d847f4bb91a3afaa990889dd164883d86/lucene-analyzers-common-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-backward-codecs/8.8.2/f4a0475a1404d7be87e911f5c85f8b913ec9770b/lucene-backward-codecs-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-grouping/8.8.2/207ac15287656b6ce61ea921fefec3070a8dd99d/lucene-grouping-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-highlighter/8.8.2/2674644580108335c08a25fd2e0f0c7dee010700/lucene-highlighter-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-join/8.8.2/6eb956741169fe8a5643fa39b14aef2fc067d865/lucene-join-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-memory/8.8.2/63b6e8e862c898d263b941ffa55fe6c85bcdf2df/lucene-memory-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-misc/8.8.2/a3a916c27c42ff6773c52d0be4045d4e5c00f7f8/lucene-misc-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-queries/8.8.2/1f324c3a6e1a7634bbada993b2c014d12d09cb83/lucene-queries-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-queryparser/8.8.2/67c6580f8a5187fdb7f9eb2404bdecc4635b3328/lucene-queryparser-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-sandbox/8.8.2/b674deb4957a64db5c8f4afb9127400f4e65c334/lucene-sandbox-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-spatial-extras/8.8.2/227672b3b88c877b2df27608850ca15d834ec35c/lucene-spatial-extras-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-spatial3d/8.8.2/eb77b665a3b3e9ca6b3bb82c3342b843d9729d0d/lucene-spatial3d-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-suggest/8.8.2/36fd5301e7143f0fcb0ba933f96148dc1e132b66/lucene-suggest-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/com.carrotsearch/hppc/0.8.1/ffc7ba8f289428b9508ab484b8001dea944ae603/hppc-0.8.1.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/joda-time/joda-time/2.10.4/8c10bb8815109067ce3c91a8e547b5a52e8a1c1a/joda-time-2.10.4.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/com.tdunning/t-digest/3.2/2ab94758b0276a8a26102adf8d528cf6d0567b9a/t-digest-3.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.hdrhistogram/HdrHistogram/2.1.9/e4631ce165eb400edecfa32e03d3f1be53dee754/HdrHistogram-2.1.9.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.locationtech.spatial4j/spatial4j/0.7/faa8ba85d503da4ab872d17ba8c00da0098ab2f2/spatial4j-0.7.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.locationtech.jts/jts-core/1.15.0/705981b7e25d05a76a3654e597dab6ba423eb79e/jts-core-1.15.0.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.logging.log4j/log4j-api/2.11.1/268f0fe4df3eefe052b57c87ec48517d64fb2a10/log4j-api-2.11.1.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.logging.log4j/log4j-core/2.11.1/592a48674c926b01a9a747c7831bcd82a9e6d6e4/log4j-core-2.11.1.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.elasticsearch/jna/5.5.0/ade077cbb2618a18bfc6c335413b2b7163d97601/jna-5.5.0.jar:/Users/itiyamas/OpenSearch/libs/plugin-classloader/out/production/classes:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.yaml/snakeyaml/1.26/a78a8747147d2c5807683e76ec2b633e95c14fe9/snakeyaml-1.26.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/com.fasterxml.jackson.core/jackson-core/2.11.4/593f7b18bab07a76767f181e2a2336135ce82cc4/jackson-core-2.11.4.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/com.fasterxml.jackson.dataformat/jackson-dataformat-smile/2.11.4/10c1faac0b0bd8545eff02599b48a149202de066/jackson-dataformat-smile-2.11.4.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/com.fasterxml.jackson.dataformat/jackson-dataformat-yaml/2.11.4/ba01014ab0228449be401975b1a7af2f3cdaf1d7/jackson-dataformat-yaml-2.11.4.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/com.fasterxml.jackson.dataformat/jackson-dataformat-cbor/2.11.4/67fa6a00bdc31029bf841ee97d993ef2bb530aa0/jackson-dataformat-cbor-2.11.4.jar:/Users/itiyamas/OpenSearch/libs/nio/out/production/classes:/Users/itiyamas/OpenSearch/client/sniffer/out/production/classes:/Users/itiyamas/OpenSearch/client/rest/out/production/classes:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/com.carrotsearch.randomizedtesting/randomizedtesting-runner/2.7.1/e917f4983144c3b969eb7d3648338ecde5e3ba89/randomizedtesting-runner-2.7.1.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/junit/junit/4.12/2973d150c0dc1fefe998f834810d68f278ea58ec/junit-4.12.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.hamcrest/hamcrest/2.1/9420ba32c29217b54eebd26ff7f9234d31c3fbb2/hamcrest-2.1.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-test-framework/8.8.2/a9b31c79190327e6c87a2987deed506d4969d153/lucene-test-framework-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-codecs/8.8.2/48b8faec209371df05fd47e319b4b1e382484347/lucene-codecs-8.8.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/commons-logging/commons-logging/1.1.3/f6f66e966c70a83ffbdb6f17a0919eaf7c8aca7f/commons-logging-1.1.3.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/commons-codec/commons-codec/1.13/3f18e1aa31031d89db6f01ba05d501258ce69d2c/commons-codec-1.13.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.elasticsearch/securemock/1.2/98201d4ad5ac93f6b415ae9172d52b5e7cda490e/securemock-1.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.elasticsearch/mocksocket/1.2/190ad3f42fd6ab7367bae0c0e14d1dc4e4a3c361/mocksocket-1.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/net.sf.jopt-simple/jopt-simple/5.0.2/98cafc6081d5632b61be2c9e60650b64ddbc637c/jopt-simple-5.0.2.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.httpcomponents/httpclient/4.5.13/e5f6cae5ca7ecaac1ec2827a9e2d65ae2869cada/httpclient-4.5.13.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.httpcomponents/httpcore/4.4.12/21ebaf6d532bc350ba95bd81938fa5f0e511c132/httpcore-4.4.12.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.httpcomponents/httpasyncclient/4.1.4/f3a3240681faae3fa46b573a4c7e50cec9db0d86/httpasyncclient-4.1.4.jar:/Users/itiyamas/.gradle/caches/modules-2/files-2.1/org.apache.httpcomponents/httpcore-nio/4.4.12/84cd29eca842f31db02987cfedea245af020198b/httpcore-nio-4.4.12.jar com.intellij.rt.junit.JUnitStarter -ideVersion5 -junit4 org.opensearch.index.store.CorruptedFileIT,testCorruptFileDuringIndexing
[2021-05-30T09:20:00,496][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] before test
[2021-05-30T09:20:00,512][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] [CorruptedFileIT#testCorruptFileDuringIndexing]: setting up test
[2021-05-30T09:20:00,582][INFO ][o.o.t.InternalTestCluster] [testCorruptFileDuringIndexing] Setup InternalTestCluster [TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster] with seed [D3C51DF549BF4775] using [0] dedicated masters, [2] (data) nodes and [1] coord only nodes (min_master_nodes are [auto-managed])
[2021-05-30T09:20:01,142][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] version[1.0.0], pid[53657], build[unknown/unknown/unknown], OS[Mac OS X/10.13.6/x86_64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/14.0.1/14.0.1+7]
[2021-05-30T09:20:01,143][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM home [/Users/itiyamas/jdk-14.0.1.jdk/Contents/Home]
[2021-05-30T09:20:01,147][DEPRECATION][o.o.d.n.Node             ] [testCorruptFileDuringIndexing] no-jdk distributions that do not bundle a JDK are deprecated and will be removed in a future release
[2021-05-30T09:20:01,148][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM arguments [-ea, -Djava.locale.providers=SPI,COMPAT, -Didea.test.cyclic.buffer.size=1048576, -javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=64119:/Applications/IntelliJ IDEA CE.app/Contents/bin, -Dfile.encoding=UTF-8]
[2021-05-30T09:20:01,169][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:01,171][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.index.MockEngineFactoryPlugin]
[2021-05-30T09:20:01,171][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.script.MockScriptService$TestPlugin]
[2021-05-30T09:20:01,171][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.InternalSettingsPlugin]
[2021-05-30T09:20:01,171][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockHttpTransport$TestPlugin]
[2021-05-30T09:20:01,172][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockIndexEventListener$TestPlugin]
[2021-05-30T09:20:01,172][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$AssertActionNamePlugin]
[2021-05-30T09:20:01,172][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$TestSeedPlugin]
[2021-05-30T09:20:01,172][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.TestGeoShapeFieldMapperPlugin]
[2021-05-30T09:20:01,172][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.store.MockFSIndexStore$TestPlugin]
[2021-05-30T09:20:01,172][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.transport.MockTransportService$TestPlugin]
[2021-05-30T09:20:01,172][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:01,233][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] using [4] data paths, mounts [[/ (/dev/disk1s1)]], net usable_space [363.8gb], net total_space [465.5gb], types [apfs]
[2021-05-30T09:20:01,234][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] heap size [4gb], compressed ordinary object pointers [true]
[2021-05-30T09:20:01,338][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] node name [node_t0], node ID [ZMqAFWPqSPanRHGlwB6rXw], cluster name [TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster], roles [master, remote_cluster_client, data, ingest]
[2021-05-30T09:20:04,486][INFO ][o.o.d.DiscoveryModule    ] [testCorruptFileDuringIndexing] using discovery type [zen] and seed hosts providers [settings, file]
[2021-05-30T09:20:05,085][WARN ][o.o.g.DanglingIndicesState] [testCorruptFileDuringIndexing] gateway.auto_import_dangling_indices is disabled, dangling indices will not be automatically detected or imported and must be managed manually
[2021-05-30T09:20:05,331][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] initialized
[2021-05-30T09:20:05,334][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] version[1.0.0], pid[53657], build[unknown/unknown/unknown], OS[Mac OS X/10.13.6/x86_64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/14.0.1/14.0.1+7]
[2021-05-30T09:20:05,334][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM home [/Users/itiyamas/jdk-14.0.1.jdk/Contents/Home]
[2021-05-30T09:20:05,334][DEPRECATION][o.o.d.n.Node             ] [testCorruptFileDuringIndexing] no-jdk distributions that do not bundle a JDK are deprecated and will be removed in a future release
[2021-05-30T09:20:05,335][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM arguments [-ea, -Djava.locale.providers=SPI,COMPAT, -Didea.test.cyclic.buffer.size=1048576, -javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=64119:/Applications/IntelliJ IDEA CE.app/Contents/bin, -Dfile.encoding=UTF-8]
[2021-05-30T09:20:05,336][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:05,336][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.index.MockEngineFactoryPlugin]
[2021-05-30T09:20:05,336][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.script.MockScriptService$TestPlugin]
[2021-05-30T09:20:05,336][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.InternalSettingsPlugin]
[2021-05-30T09:20:05,336][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockHttpTransport$TestPlugin]
[2021-05-30T09:20:05,337][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockIndexEventListener$TestPlugin]
[2021-05-30T09:20:05,337][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$AssertActionNamePlugin]
[2021-05-30T09:20:05,337][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$TestSeedPlugin]
[2021-05-30T09:20:05,337][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.TestGeoShapeFieldMapperPlugin]
[2021-05-30T09:20:05,337][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.store.MockFSIndexStore$TestPlugin]
[2021-05-30T09:20:05,337][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.transport.MockTransportService$TestPlugin]
[2021-05-30T09:20:05,337][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:05,355][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] using [4] data paths, mounts [[/ (/dev/disk1s1)]], net usable_space [363.8gb], net total_space [465.5gb], types [apfs]
[2021-05-30T09:20:05,355][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] heap size [4gb], compressed ordinary object pointers [true]
[2021-05-30T09:20:05,367][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] node name [node_t1], node ID [OvzlIeulSuyDzstoCsMgsw], cluster name [TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster], roles [master, remote_cluster_client, data, ingest]
[2021-05-30T09:20:05,399][INFO ][o.o.d.DiscoveryModule    ] [testCorruptFileDuringIndexing] using discovery type [zen] and seed hosts providers [settings, file]
[2021-05-30T09:20:05,488][WARN ][o.o.g.DanglingIndicesState] [testCorruptFileDuringIndexing] gateway.auto_import_dangling_indices is disabled, dangling indices will not be automatically detected or imported and must be managed manually
[2021-05-30T09:20:05,630][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] initialized
[2021-05-30T09:20:05,633][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] version[1.0.0], pid[53657], build[unknown/unknown/unknown], OS[Mac OS X/10.13.6/x86_64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/14.0.1/14.0.1+7]
[2021-05-30T09:20:05,633][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM home [/Users/itiyamas/jdk-14.0.1.jdk/Contents/Home]
[2021-05-30T09:20:05,634][DEPRECATION][o.o.d.n.Node             ] [testCorruptFileDuringIndexing] no-jdk distributions that do not bundle a JDK are deprecated and will be removed in a future release
[2021-05-30T09:20:05,634][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM arguments [-ea, -Djava.locale.providers=SPI,COMPAT, -Didea.test.cyclic.buffer.size=1048576, -javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=64119:/Applications/IntelliJ IDEA CE.app/Contents/bin, -Dfile.encoding=UTF-8]
[2021-05-30T09:20:05,635][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:05,635][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.index.MockEngineFactoryPlugin]
[2021-05-30T09:20:05,635][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.script.MockScriptService$TestPlugin]
[2021-05-30T09:20:05,635][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.InternalSettingsPlugin]
[2021-05-30T09:20:05,635][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockHttpTransport$TestPlugin]
[2021-05-30T09:20:05,635][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockIndexEventListener$TestPlugin]
[2021-05-30T09:20:05,635][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$AssertActionNamePlugin]
[2021-05-30T09:20:05,636][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$TestSeedPlugin]
[2021-05-30T09:20:05,636][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.TestGeoShapeFieldMapperPlugin]
[2021-05-30T09:20:05,636][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.store.MockFSIndexStore$TestPlugin]
[2021-05-30T09:20:05,636][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.transport.MockTransportService$TestPlugin]
[2021-05-30T09:20:05,636][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:05,669][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] using [4] data paths, mounts [[/ (/dev/disk1s1)]], net usable_space [363.8gb], net total_space [465.5gb], types [apfs]
[2021-05-30T09:20:05,669][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] heap size [4gb], compressed ordinary object pointers [true]
[2021-05-30T09:20:05,687][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] node name [node_t2], node ID [wecIupjPS9O5h3wX52oGkg], cluster name [TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster], roles []
[2021-05-30T09:20:05,744][INFO ][o.o.d.DiscoveryModule    ] [testCorruptFileDuringIndexing] using discovery type [zen] and seed hosts providers [settings, file]
[2021-05-30T09:20:05,860][WARN ][o.o.g.DanglingIndicesState] [testCorruptFileDuringIndexing] gateway.auto_import_dangling_indices is disabled, dangling indices will not be automatically detected or imported and must be managed manually
[2021-05-30T09:20:06,011][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] initialized
[2021-05-30T09:20:06,018][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#1]]] starting ...
[2021-05-30T09:20:06,022][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#2]]] starting ...
[2021-05-30T09:20:06,022][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#3]]] starting ...
[2021-05-30T09:20:06,173][INFO ][o.o.t.TransportService   ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#1]]] publish_address {127.0.0.1:64137}, bound_addresses {[::1]:64134}, {127.0.0.1:64137}
[2021-05-30T09:20:06,173][INFO ][o.o.t.TransportService   ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#2]]] publish_address {127.0.0.1:64135}, bound_addresses {[::1]:64133}, {127.0.0.1:64135}
[2021-05-30T09:20:06,174][INFO ][o.o.t.TransportService   ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#3]]] publish_address {127.0.0.1:64136}, bound_addresses {[::1]:64132}, {127.0.0.1:64136}
[2021-05-30T09:20:07,402][INFO ][o.o.c.c.Coordinator      ] [node_t1] setting initial configuration to VotingConfiguration{OvzlIeulSuyDzstoCsMgsw,ZMqAFWPqSPanRHGlwB6rXw}
[2021-05-30T09:20:08,043][INFO ][o.o.c.s.MasterService    ] [node_t1] elected-as-master ([2] nodes joined)[{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr} elect leader, {node_t0}{ZMqAFWPqSPanRHGlwB6rXw}{efha4d-ATBa2WSnhE_u1Nw}{127.0.0.1}{127.0.0.1:64137}{dimr} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 1, version: 1, delta: master node changed {previous [], current [{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}]}, added {{node_t0}{ZMqAFWPqSPanRHGlwB6rXw}{efha4d-ATBa2WSnhE_u1Nw}{127.0.0.1}{127.0.0.1:64137}{dimr}}
[2021-05-30T09:20:08,345][INFO ][o.o.c.c.CoordinationState] [node_t1] cluster UUID set to [rho477_IQ6iNVM_6MoI8Ng]
[2021-05-30T09:20:08,345][INFO ][o.o.c.c.CoordinationState] [node_t0] cluster UUID set to [rho477_IQ6iNVM_6MoI8Ng]
[2021-05-30T09:20:08,622][INFO ][o.o.c.s.ClusterApplierService] [node_t0] master node changed {previous [], current [{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}]}, added {{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}}, term: 1, version: 1, reason: ApplyCommitRequest{term=1, version=1, sourceNode={node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}}
[2021-05-30T09:20:08,636][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#1]]] started
[2021-05-30T09:20:08,642][INFO ][o.o.c.s.ClusterApplierService] [node_t1] master node changed {previous [], current [{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}]}, added {{node_t0}{ZMqAFWPqSPanRHGlwB6rXw}{efha4d-ATBa2WSnhE_u1Nw}{127.0.0.1}{127.0.0.1:64137}{dimr}}, term: 1, version: 1, reason: Publication{term=1, version=1}
[2021-05-30T09:20:08,658][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#2]]] started
[2021-05-30T09:20:08,680][INFO ][o.o.c.s.MasterService    ] [node_t1] node-join[{node_t2}{wecIupjPS9O5h3wX52oGkg}{FVSHxy-dT5uBWnWOD_bcTA}{127.0.0.1}{127.0.0.1:64136} join existing leader], term: 1, version: 2, delta: added {{node_t2}{wecIupjPS9O5h3wX52oGkg}{FVSHxy-dT5uBWnWOD_bcTA}{127.0.0.1}{127.0.0.1:64136}}
[2021-05-30T09:20:08,740][INFO ][o.o.c.s.ClusterApplierService] [node_t0] added {{node_t2}{wecIupjPS9O5h3wX52oGkg}{FVSHxy-dT5uBWnWOD_bcTA}{127.0.0.1}{127.0.0.1:64136}}, term: 1, version: 2, reason: ApplyCommitRequest{term=1, version=2, sourceNode={node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}}
[2021-05-30T09:20:08,740][INFO ][o.o.c.s.ClusterApplierService] [node_t2] master node changed {previous [], current [{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}]}, added {{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr},{node_t0}{ZMqAFWPqSPanRHGlwB6rXw}{efha4d-ATBa2WSnhE_u1Nw}{127.0.0.1}{127.0.0.1:64137}{dimr}}, term: 1, version: 2, reason: ApplyCommitRequest{term=1, version=2, sourceNode={node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}}
[2021-05-30T09:20:08,741][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#3]]] started
[2021-05-30T09:20:08,754][INFO ][o.o.c.s.ClusterApplierService] [node_t1] added {{node_t2}{wecIupjPS9O5h3wX52oGkg}{FVSHxy-dT5uBWnWOD_bcTA}{127.0.0.1}{127.0.0.1:64136}}, term: 1, version: 2, reason: Publication{term=1, version=2}
[2021-05-30T09:20:08,763][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:08,763][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:09,474][INFO ][o.o.g.GatewayService     ] [node_t1] recovered [0] indices into cluster_state
[2021-05-30T09:20:09,555][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:09,555][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:09,605][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:09,605][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:10,532][INFO ][o.o.c.m.MetadataIndexTemplateService] [node_t1] adding template [random_index_template] for index patterns [*]
[2021-05-30T09:20:10,813][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] [CorruptedFileIT#testCorruptFileDuringIndexing]: all set up test
[2021-05-30T09:20:10,814][INFO ][o.o.t.InternalTestCluster] [testCorruptFileDuringIndexing] increasing cluster size from 2 to 3
[2021-05-30T09:20:10,819][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] version[1.0.0], pid[53657], build[unknown/unknown/unknown], OS[Mac OS X/10.13.6/x86_64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/14.0.1/14.0.1+7]
[2021-05-30T09:20:10,819][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM home [/Users/itiyamas/jdk-14.0.1.jdk/Contents/Home]
[2021-05-30T09:20:10,820][DEPRECATION][o.o.d.n.Node             ] [testCorruptFileDuringIndexing] no-jdk distributions that do not bundle a JDK are deprecated and will be removed in a future release
[2021-05-30T09:20:10,820][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] JVM arguments [-ea, -Djava.locale.providers=SPI,COMPAT, -Didea.test.cyclic.buffer.size=1048576, -javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=64119:/Applications/IntelliJ IDEA CE.app/Contents/bin, -Dfile.encoding=UTF-8]
[2021-05-30T09:20:10,821][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:10,821][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.index.MockEngineFactoryPlugin]
[2021-05-30T09:20:10,821][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.script.MockScriptService$TestPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.InternalSettingsPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockHttpTransport$TestPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.MockIndexEventListener$TestPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$AssertActionNamePlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.OpenSearchIntegTestCase$TestSeedPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.TestGeoShapeFieldMapperPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.store.MockFSIndexStore$TestPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.test.transport.MockTransportService$TestPlugin]
[2021-05-30T09:20:10,822][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:10,841][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] using [4] data paths, mounts [[/ (/dev/disk1s1)]], net usable_space [363.8gb], net total_space [465.5gb], types [apfs]
[2021-05-30T09:20:10,841][INFO ][o.o.e.NodeEnvironment    ] [testCorruptFileDuringIndexing] heap size [4gb], compressed ordinary object pointers [true]
[2021-05-30T09:20:10,861][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] node name [node_t3], node ID [YNZFPQcdS26B1MS-Npd20A], cluster name [TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster], roles [master, remote_cluster_client, data, ingest]
[2021-05-30T09:20:10,968][INFO ][o.o.d.DiscoveryModule    ] [testCorruptFileDuringIndexing] using discovery type [zen] and seed hosts providers [settings, file]
[2021-05-30T09:20:11,037][WARN ][o.o.g.DanglingIndicesState] [testCorruptFileDuringIndexing] gateway.auto_import_dangling_indices is disabled, dangling indices will not be automatically detected or imported and must be managed manually
[2021-05-30T09:20:11,062][INFO ][o.o.n.Node               ] [testCorruptFileDuringIndexing] initialized
[2021-05-30T09:20:11,090][DEPRECATION][o.o.d.c.s.Settings       ] [node_t1] [discovery.zen.minimum_master_nodes] setting was deprecated in OpenSearch and will be removed in a future release! See the breaking changes documentation for the next major version.
[2021-05-30T09:20:11,091][DEPRECATION][o.o.d.c.s.Settings       ] [node_t1] [discovery.zen.minimum_master_nodes] setting was deprecated in OpenSearch and will be removed in a future release! See the breaking changes documentation for the next major version.
[2021-05-30T09:20:11,179][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#4]]] starting ...
[2021-05-30T09:20:11,186][INFO ][o.o.t.TransportService   ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#4]]] publish_address {127.0.0.1:64181}, bound_addresses {[::1]:64180}, {127.0.0.1:64181}
[2021-05-30T09:20:11,835][INFO ][o.o.c.s.MasterService    ] [node_t1] node-join[{node_t3}{YNZFPQcdS26B1MS-Npd20A}{0fI2zIpxSj-iC0Kj2S98cw}{127.0.0.1}{127.0.0.1:64181}{dimr} join existing leader], term: 1, version: 8, delta: added {{node_t3}{YNZFPQcdS26B1MS-Npd20A}{0fI2zIpxSj-iC0Kj2S98cw}{127.0.0.1}{127.0.0.1:64181}{dimr}}
[2021-05-30T09:20:11,870][INFO ][o.o.c.s.ClusterApplierService] [node_t2] added {{node_t3}{YNZFPQcdS26B1MS-Npd20A}{0fI2zIpxSj-iC0Kj2S98cw}{127.0.0.1}{127.0.0.1:64181}{dimr}}, term: 1, version: 8, reason: ApplyCommitRequest{term=1, version=8, sourceNode={node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}}
[2021-05-30T09:20:11,880][INFO ][o.o.c.s.ClusterApplierService] [node_t0] added {{node_t3}{YNZFPQcdS26B1MS-Npd20A}{0fI2zIpxSj-iC0Kj2S98cw}{127.0.0.1}{127.0.0.1:64181}{dimr}}, term: 1, version: 8, reason: ApplyCommitRequest{term=1, version=8, sourceNode={node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}}
[2021-05-30T09:20:12,214][INFO ][o.o.c.s.ClusterApplierService] [node_t3] master node changed {previous [], current [{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}]}, added {{node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr},{node_t2}{wecIupjPS9O5h3wX52oGkg}{FVSHxy-dT5uBWnWOD_bcTA}{127.0.0.1}{127.0.0.1:64136},{node_t0}{ZMqAFWPqSPanRHGlwB6rXw}{efha4d-ATBa2WSnhE_u1Nw}{127.0.0.1}{127.0.0.1:64137}{dimr}}, term: 1, version: 8, reason: ApplyCommitRequest{term=1, version=8, sourceNode={node_t1}{OvzlIeulSuyDzstoCsMgsw}{BDHYR-OWTFqFztS2D-LvIg}{127.0.0.1}{127.0.0.1:64135}{dimr}}
[2021-05-30T09:20:12,223][INFO ][o.o.n.Node               ] [[test_TEST-TEST_WORKER_VM=[--not-gradle--]-CLUSTER_SEED=[-3187108221953292427]-HASH=[F8B5DF0856F8]-cluster[T#4]]] started
[2021-05-30T09:20:12,223][INFO ][o.o.c.s.ClusterApplierService] [node_t1] added {{node_t3}{YNZFPQcdS26B1MS-Npd20A}{0fI2zIpxSj-iC0Kj2S98cw}{127.0.0.1}{127.0.0.1:64181}{dimr}}, term: 1, version: 8, reason: Publication{term=1, version=8}
[2021-05-30T09:20:12,230][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] --> cluster has [3] data nodes, corrupted primary will be overwritten
[2021-05-30T09:20:12,235][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] no modules loaded
[2021-05-30T09:20:12,236][INFO ][o.o.p.PluginsService     ] [testCorruptFileDuringIndexing] loaded plugin [org.opensearch.transport.nio.MockNioTransportPlugin]
[2021-05-30T09:20:13,462][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:13,474][INFO ][o.o.c.m.MetadataCreateIndexService] [node_t1] [test] creating index, cause [api], templates [random_index_template], shards [1]/[1]
[2021-05-30T09:20:14,116][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:14,179][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:15,409][WARN ][o.o.i.IndexSettings      ] [node_t0] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:15,480][WARN ][o.o.i.IndexSettings      ] [node_t0] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:16,057][INFO ][o.o.c.r.a.AllocationService] [node_t1] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[test][0]]]).
[2021-05-30T09:20:16,465][WARN ][o.o.i.IndexSettings      ] [node_t3] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:16,494][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:17,062][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] Index [1] docs async: [true] bulk: [false]
[2021-05-30T09:20:17,235][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:17,246][INFO ][o.o.c.m.MetadataMappingService] [node_t1] [test/Wi1zu6UjRx6JefElfgQ16g] create_mapping [type]
[2021-05-30T09:20:26,218][INFO ][o.o.t.CorruptionUtils    ] [testCorruptFileDuringIndexing] Corrupting file --  flipping at position 153 from 69 to 6a file: segments_4
[2021-05-30T09:20:26,219][INFO ][o.o.t.CorruptionUtils    ] [testCorruptFileDuringIndexing] Checksum before: [2425222970] after: [3359768840] checksum value after corruption: 2425222970] file: segments_4 length: 314
[2021-05-30T09:20:26,220][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] --> [test][0], node[OvzlIeulSuyDzstoCsMgsw], [P], s[STARTED], a[id=mFsTIe3PQx2-7bc6GTEKRQ] corrupted
[2021-05-30T09:20:26,220][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] -->  creating repository
[2021-05-30T09:20:26,228][INFO ][o.o.r.RepositoriesService] [node_t1] put repository [test-repo]
[2021-05-30T09:20:26,484][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] --> snapshot
[2021-05-30T09:20:26,487][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:26,884][INFO ][o.o.t.BackgroundIndexer  ] [testCorruptFileDuringIndexing] --> creating 5 indexing threads (auto start: [true], numOfDocs: [200])
[2021-05-30T09:20:26,887][INFO ][o.o.t.BackgroundIndexer  ] [[Thread-5]] **** starting indexing thread 4
[2021-05-30T09:20:26,888][INFO ][o.o.t.BackgroundIndexer  ] [[Thread-2]] **** starting indexing thread 1
[2021-05-30T09:20:26,887][INFO ][o.o.t.BackgroundIndexer  ] [[Thread-1]] **** starting indexing thread 0
[2021-05-30T09:20:26,887][INFO ][o.o.t.BackgroundIndexer  ] [[Thread-4]] **** starting indexing thread 3
[2021-05-30T09:20:26,888][INFO ][o.o.t.BackgroundIndexer  ] [[Thread-3]] **** starting indexing thread 2
[2021-05-30T09:20:26,919][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:26,932][INFO ][o.o.c.m.MetadataMappingService] [node_t1] [test/Wi1zu6UjRx6JefElfgQ16g] update_mapping [type]
[2021-05-30T09:20:27,240][WARN ][o.o.i.IndexSettings      ] [node_t1] [test] [index.merge.enabled] is set to false, this should only be used in tests and can cause serious problems in production environments
[2021-05-30T09:20:35,226][INFO ][o.o.s.SnapshotsService   ] [node_t1] snapshot [test-repo:test-snap/6bWw79oRRNKKfanK04ZsNA] started
[2021-05-30T09:20:35,264][WARN ][o.o.i.e.Engine           ] [node_t1] [test][0] failed engine [lucene commit failed]
org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=908df73a actual=8ee71f2a (resource=BufferedChecksumIndexInput(MockIndexInputWrapper(NIOFSIndexInput(path="/private/var/folders/wq/7rqhf85j75516cc7fxl77_hdkwz2y1/T/org.opensearch.index.store.CorruptedFileIT_6F82F7463B75BF73-001/tempDir-002/node_t1/d0/nodes/0/indices/Wi1zu6UjRx6JefElfgQ16g/0/index/segments_4"))))
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:419) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:448) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:434) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.CombinedDeletionPolicy.getDocCountOfCommit(CombinedDeletionPolicy.java:144) ~[classes/:?]
	at org.opensearch.index.engine.CombinedDeletionPolicy.onCommit(CombinedDeletionPolicy.java:116) ~[classes/:?]
	at org.apache.lucene.index.IndexFileDeleter.checkpoint(IndexFileDeleter.java:524) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:3823) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3779) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3729) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.InternalEngine.commitIndexWriter(InternalEngine.java:2537) [classes/:?]
	at org.opensearch.index.engine.InternalEngine.flush(InternalEngine.java:1868) [classes/:?]
	at org.opensearch.index.engine.InternalEngine.acquireLastIndexCommit(InternalEngine.java:2108) [classes/:?]
	at org.opensearch.index.shard.IndexShard.acquireLastIndexCommit(IndexShard.java:1191) [classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.snapshot(SnapshotShardsService.java:355) [classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.lambda$startNewShards$1(SnapshotShardsService.java:273) [classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) [classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
[2021-05-30T09:20:35,277][WARN ][o.o.i.e.Engine           ] [node_t1] [test][0] tried to fail engine but engine is already failed. ignoring. [corrupt file (source: [flush])]
org.opensearch.index.engine.FlushFailedEngineException: Flush failed
	at org.opensearch.index.engine.InternalEngine.flush(InternalEngine.java:1882) [classes/:?]
	at org.opensearch.index.engine.InternalEngine.acquireLastIndexCommit(InternalEngine.java:2108) [classes/:?]
	at org.opensearch.index.shard.IndexShard.acquireLastIndexCommit(IndexShard.java:1191) [classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.snapshot(SnapshotShardsService.java:355) [classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.lambda$startNewShards$1(SnapshotShardsService.java:273) [classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) [classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=908df73a actual=8ee71f2a (resource=BufferedChecksumIndexInput(MockIndexInputWrapper(NIOFSIndexInput(path="/private/var/folders/wq/7rqhf85j75516cc7fxl77_hdkwz2y1/T/org.opensearch.index.store.CorruptedFileIT_6F82F7463B75BF73-001/tempDir-002/node_t1/d0/nodes/0/indices/Wi1zu6UjRx6JefElfgQ16g/0/index/segments_4"))))
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:419) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:448) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:434) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.CombinedDeletionPolicy.getDocCountOfCommit(CombinedDeletionPolicy.java:144) ~[classes/:?]
	at org.opensearch.index.engine.CombinedDeletionPolicy.onCommit(CombinedDeletionPolicy.java:116) ~[classes/:?]
	at org.apache.lucene.index.IndexFileDeleter.checkpoint(IndexFileDeleter.java:524) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:3823) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3779) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3729) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.InternalEngine.commitIndexWriter(InternalEngine.java:2537) ~[classes/:?]
	at org.opensearch.index.engine.InternalEngine.flush(InternalEngine.java:1868) [classes/:?]
	... 8 more
[2021-05-30T09:20:35,279][WARN ][o.o.s.SnapshotShardsService] [node_t1] [[test][0]][test-repo:test-snap/6bWw79oRRNKKfanK04ZsNA] failed to snapshot shard
org.opensearch.index.engine.FlushFailedEngineException: Flush failed
	at org.opensearch.index.engine.InternalEngine.flush(InternalEngine.java:1882) ~[classes/:?]
	at org.opensearch.index.engine.InternalEngine.acquireLastIndexCommit(InternalEngine.java:2108) ~[classes/:?]
	at org.opensearch.index.shard.IndexShard.acquireLastIndexCommit(IndexShard.java:1191) ~[classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.snapshot(SnapshotShardsService.java:355) [classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.lambda$startNewShards$1(SnapshotShardsService.java:273) [classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) [classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=908df73a actual=8ee71f2a (resource=BufferedChecksumIndexInput(MockIndexInputWrapper(NIOFSIndexInput(path="/private/var/folders/wq/7rqhf85j75516cc7fxl77_hdkwz2y1/T/org.opensearch.index.store.CorruptedFileIT_6F82F7463B75BF73-001/tempDir-002/node_t1/d0/nodes/0/indices/Wi1zu6UjRx6JefElfgQ16g/0/index/segments_4"))))
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:419) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:448) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:434) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.CombinedDeletionPolicy.getDocCountOfCommit(CombinedDeletionPolicy.java:144) ~[classes/:?]
	at org.opensearch.index.engine.CombinedDeletionPolicy.onCommit(CombinedDeletionPolicy.java:116) ~[classes/:?]
	at org.apache.lucene.index.IndexFileDeleter.checkpoint(IndexFileDeleter.java:524) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:3823) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3779) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3729) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.InternalEngine.commitIndexWriter(InternalEngine.java:2537) ~[classes/:?]
	at org.opensearch.index.engine.InternalEngine.flush(InternalEngine.java:1868) ~[classes/:?]
	... 8 more
[2021-05-30T09:20:35,283][WARN ][o.o.i.c.IndicesClusterStateService] [node_t1] [test][0] marking and sending shard failed due to [shard failure, reason [lucene commit failed]]
org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=908df73a actual=8ee71f2a (resource=BufferedChecksumIndexInput(MockIndexInputWrapper(NIOFSIndexInput(path="/private/var/folders/wq/7rqhf85j75516cc7fxl77_hdkwz2y1/T/org.opensearch.index.store.CorruptedFileIT_6F82F7463B75BF73-001/tempDir-002/node_t1/d0/nodes/0/indices/Wi1zu6UjRx6JefElfgQ16g/0/index/segments_4"))))
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:419) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:448) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:434) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.CombinedDeletionPolicy.getDocCountOfCommit(CombinedDeletionPolicy.java:144) ~[classes/:?]
	at org.opensearch.index.engine.CombinedDeletionPolicy.onCommit(CombinedDeletionPolicy.java:116) ~[classes/:?]
	at org.apache.lucene.index.IndexFileDeleter.checkpoint(IndexFileDeleter.java:524) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:3823) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3779) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3729) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.InternalEngine.commitIndexWriter(InternalEngine.java:2537) ~[classes/:?]
	at org.opensearch.index.engine.InternalEngine.flush(InternalEngine.java:1868) ~[classes/:?]
	at org.opensearch.index.engine.InternalEngine.acquireLastIndexCommit(InternalEngine.java:2108) ~[classes/:?]
	at org.opensearch.index.shard.IndexShard.acquireLastIndexCommit(IndexShard.java:1191) ~[classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.snapshot(SnapshotShardsService.java:355) ~[classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.lambda$startNewShards$1(SnapshotShardsService.java:273) ~[classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) [classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
[2021-05-30T09:20:35,291][WARN ][o.o.c.r.a.AllocationService] [node_t1] failing shard [failed shard, shard [test][0], node[OvzlIeulSuyDzstoCsMgsw], [P], s[STARTED], a[id=mFsTIe3PQx2-7bc6GTEKRQ], message [shard failure, reason [lucene commit failed]], failure [CorruptIndexException[checksum failed (hardware problem?) : expected=908df73a actual=8ee71f2a (resource=BufferedChecksumIndexInput(MockIndexInputWrapper(NIOFSIndexInput(path="/private/var/folders/wq/7rqhf85j75516cc7fxl77_hdkwz2y1/T/org.opensearch.index.store.CorruptedFileIT_6F82F7463B75BF73-001/tempDir-002/node_t1/d0/nodes/0/indices/Wi1zu6UjRx6JefElfgQ16g/0/index/segments_4"))))]], markAsStale [true]]
org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=908df73a actual=8ee71f2a (resource=BufferedChecksumIndexInput(MockIndexInputWrapper(NIOFSIndexInput(path="/private/var/folders/wq/7rqhf85j75516cc7fxl77_hdkwz2y1/T/org.opensearch.index.store.CorruptedFileIT_6F82F7463B75BF73-001/tempDir-002/node_t1/d0/nodes/0/indices/Wi1zu6UjRx6JefElfgQ16g/0/index/segments_4"))))
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:419) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:448) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:434) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:291) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.CombinedDeletionPolicy.getDocCountOfCommit(CombinedDeletionPolicy.java:144) ~[classes/:?]
	at org.opensearch.index.engine.CombinedDeletionPolicy.onCommit(CombinedDeletionPolicy.java:116) ~[classes/:?]
	at org.apache.lucene.index.IndexFileDeleter.checkpoint(IndexFileDeleter.java:524) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.finishCommit(IndexWriter.java:3823) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3779) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3729) ~[lucene-core-8.8.2.jar:8.8.2 a92a05e195b775b30ca410bc0a26e8e79e7b3bfb - mdrob - 2021-04-06 16:33:27]
	at org.opensearch.index.engine.InternalEngine.commitIndexWriter(InternalEngine.java:2537) ~[classes/:?]
	at org.opensearch.index.engine.InternalEngine.flush(InternalEngine.java:1868) ~[classes/:?]
	at org.opensearch.index.engine.InternalEngine.acquireLastIndexCommit(InternalEngine.java:2108) ~[classes/:?]
	at org.opensearch.index.shard.IndexShard.acquireLastIndexCommit(IndexShard.java:1191) ~[classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.snapshot(SnapshotShardsService.java:355) ~[classes/:?]
	at org.opensearch.snapshots.SnapshotShardsService.lambda$startNewShards$1(SnapshotShardsService.java:273) ~[classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) ~[classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
[2021-05-30T09:20:35,294][WARN ][o.o.c.r.a.AllocationService] [node_t1] failing shard [failed shard, shard [test][0], node[ZMqAFWPqSPanRHGlwB6rXw], [R], s[STARTED], a[id=6VnoJTbVSDKM7EAKbdlqNg], message [failed to perform indices:data/write/bulk[s] on replica [test][0], node[ZMqAFWPqSPanRHGlwB6rXw], [R], s[STARTED], a[id=6VnoJTbVSDKM7EAKbdlqNg]], failure [IndexShardClosedException[CurrentState[CLOSED] Primary closed.]], markAsStale [true]]
org.opensearch.index.shard.IndexShardClosedException: CurrentState[CLOSED] Primary closed.
	at org.opensearch.action.support.replication.PendingReplicationActions.lambda$cancelActions$0(PendingReplicationActions.java:125) ~[classes/:?]
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) ~[?:?]
	at java.util.concurrent.ConcurrentHashMap$KeySpliterator.forEachRemaining(ConcurrentHashMap.java:3573) ~[?:?]
	at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658) ~[?:?]
	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:274) ~[?:?]
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1624) ~[?:?]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[?:?]
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[?:?]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497) ~[?:?]
	at org.opensearch.action.support.replication.PendingReplicationActions.lambda$cancelActions$1(PendingReplicationActions.java:125) ~[classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) ~[classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
[2021-05-30T09:20:35,302][INFO ][o.o.c.r.a.AllocationService] [node_t1] Cluster health status changed from [GREEN] to [RED] (reason: [shards failed [[test][0], [test][0]]]).
[2021-05-30T09:20:36,379][INFO ][o.o.s.SnapshotsService   ] [node_t1] snapshot [test-repo:test-snap/6bWw79oRRNKKfanK04ZsNA] completed with state [PARTIAL]
[2021-05-30T09:20:36,382][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] --> snapshot terminated with state PARTIAL
[2021-05-30T09:21:31,347][WARN ][o.o.t.BackgroundIndexer  ] [[Thread-3]] **** failed indexing thread 2 on doc id 26
org.opensearch.action.UnavailableShardsException: [test][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[test][0]] containing [index {[test][_doc][jC0lvHkBUZSfq7VjSxlo], source[{"test":"value26","text":" 푌랞럽푁꼭귣잲킾뾵 \uD83C\uDC25\uD83C\uDC1E\uD83C\uDC2B\uD83C\uDC09\uD83C\uDC1F\uD83C\uDC2B\uD83C\uDC1D\uD83C\uDC15\uD83C\uDC1A\uD83C\uDC04 ՅԻվխՂր՗ՈԼ ퟟퟴퟐ퟽ퟛ퟽ힾퟹힽퟗ ᧨᧻ \uD83C\uDD5C\uD83C\uDDE9\uD83C\uDD5D ąŮ \uD802\uDF14","id":26}]}]]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.retryBecauseUnavailable(TransportReplicationAction.java:913) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.doRun(TransportReplicationAction.java:758) ~[classes/:?]
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase$2.onTimeout(TransportReplicationAction.java:873) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:348) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:265) ~[classes/:?]
	at org.opensearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:614) ~[classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) ~[classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]
	at java.lang.Thread.run(Thread.java:832) ~[?:?]
[2021-05-30T09:21:33,373][WARN ][o.o.t.BackgroundIndexer  ] [[Thread-4]] **** failed indexing thread 3 on doc id 34
org.opensearch.action.UnavailableShardsException: [test][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[test][0]] containing [index {[test][_doc][lC0lvHkBUZSfq7VjUxlH], source[{"test":"value34","text":" \uD834\uDE0D\uD834\uDE0E ⟰⟿ \uD800\uDFC1\uD800\uDFDC\uD800\uDFD6\uD800\uDFBC\uD800\uDFC8\uD800\uDFA1\uD800\uDFB9 ➔❺❤❫➢➓✝➏➳ \uD804\uDCAE\uD804\uDCBF\uD804\uDCCE\uD804\uDC98 ⅊ℱℐ ꩾ \uDB94\uDF93\uDB8B\uDE72\uDB90\uDDB7\uDB9F\uDE9B\uDBAB\uDE1D\uDB81\uDED4\uDB9B\uDFBA\uDB93\uDF67 ⷏ⷊⶑⶂⷎⷃ ֐׹א֯מֿתת׀י \uD800\uDF2B ︬︨︨︫︬︤︦ \uD802\uDE7F\uD802\uDE7E\uD802\uDE68\uD802\uDE7D\uD802\uDE60\uD802\uDE6B\uD802\uDE6D\uD802\uDE6E\uD802\uDE68 ⷪⷸⷶⷯⷡⷭⷮⷻⷦⷨ Ⴥჿ჻ქხႤ ᎹᎠ ⭬ Ἡ \uD80C\uDD24","id":34}]}]]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.retryBecauseUnavailable(TransportReplicationAction.java:913) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.doRun(TransportReplicationAction.java:758) ~[classes/:?]
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase$2.onTimeout(TransportReplicationAction.java:873) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:348) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:265) ~[classes/:?]
	at org.opensearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:614) ~[classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) ~[classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]
	at java.lang.Thread.run(Thread.java:832) ~[?:?]
[2021-05-30T09:21:33,374][WARN ][o.o.t.BackgroundIndexer  ] [[Thread-1]] **** failed indexing thread 0 on doc id 27
org.opensearch.action.UnavailableShardsException: [test][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[test][0]] containing [index {[test][_doc][jS0lvHkBUZSfq7VjUxkl], source[{"test":"value27","text":" ⥏⥱⤼⤾⥩⥻⤔⥹⤴⤛ ᥰᥲ᥵᥾᥷᥮ᥩ \uD801\uDC1B\uD801\uDC02\uD801\uDC45\uD801\uDC12\uD801\uDC2F\uD801\uDC23 ꨔꨆ ꠦꠤꠏꠡꠐ \uD83C\uDEF9\uD83C\uDEC7\uD83C\uDE14 ۾۶ڝ۩ٱ۞ ⁶ₓⁱ₃₃ₗ ᮿᮔᮦᮢᮍᮦᮞᮔ ɵɭ ⸻⹦⹜⸑⸒⸥⸖⸜⸀ ࿫࿯༵࿌ཊ࿮࿾࿇࿺ཞ ᜉᜆ᜖᜛᜙᜚ᜅ ��￸� ꣼꣢꣮ꣲ꣩꣫ꣽ꣦ꣴꣾ ǾƣDŽƹ ᭪ᬋ","id":27}]}]]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.retryBecauseUnavailable(TransportReplicationAction.java:913) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.doRun(TransportReplicationAction.java:758) ~[classes/:?]
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase$2.onTimeout(TransportReplicationAction.java:873) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:348) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:265) ~[classes/:?]
	at org.opensearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:614) ~[classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) ~[classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]
	at java.lang.Thread.run(Thread.java:832) ~[?:?]
[2021-05-30T09:21:33,378][WARN ][o.o.t.BackgroundIndexer  ] [[Thread-2]] **** failed indexing thread 1 on doc id 28
org.opensearch.action.UnavailableShardsException: [test][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[test][0]] containing [index {[test][_doc][ji0lvHkBUZSfq7VjUxkt], source[{"test":"value28","text":" 牡頕苒鰽滜舢 ㆙㆖㆘㆞ ও২া৳৆য়৹ ʕɽ \uDB40\uDD40\uDB40\uDD2F\uDB40\uDD97\uDB40\uDDC4\uDB40\uDD7C\uDB40\uDD0C\uDB40\uDD93\uDB40\uDD79 ᵴ","id":28}]}]]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.retryBecauseUnavailable(TransportReplicationAction.java:913) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase.doRun(TransportReplicationAction.java:758) ~[classes/:?]
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50) ~[classes/:?]
	at org.opensearch.action.support.replication.TransportReplicationAction$ReroutePhase$2.onTimeout(TransportReplicationAction.java:873) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:348) ~[classes/:?]
	at org.opensearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:265) ~[classes/:?]
	at org.opensearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:614) ~[classes/:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:697) ~[classes/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]
	at java.lang.Thread.run(Thread.java:832) ~[?:?]
[2021-05-30T09:21:36,389][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] --> Checking RED status
[2021-05-30T09:21:36,391][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] --> Confirmed RED status
[2021-05-30T09:21:36,392][INFO ][o.o.i.s.CorruptedFileIT  ] [testCorruptFileDuringIndexing] [CorruptedFileIT#testCorruptFileDuringIndexing]: cleaning up after test

@nknize
Copy link
Collaborator

nknize commented Jun 1, 2021

You may need to run it 2-3 times to reproduce the behavior.

Is there not a reproducible -Dtests.seed?

@itiyamas
Copy link
Contributor Author

itiyamas commented Jun 2, 2021

The issue is reproduced only when a pending replica write is in progress when the corruption happens. The way I achieved it is a bit hacky, I added a few sleeps and parallelized writes to increase the probability of this happening. This is why sometimes you may need to run it multiple times to run into the issue.

@dblock
Copy link
Member

dblock commented Jun 4, 2021

Without looking at the code, maybe corrupted data can be cooked ahead of time, or stored data forcefully corrupted by writing some random bytes for the test to be reliable?

@itiyamas
Copy link
Contributor Author

itiyamas commented Jun 4, 2021

That's what I did- I have forcefully corrupted the data. The issue is to reproduce the race where the writes are pending on the shard at exactly the same time when the primary fails - again deterministically reproducing the race is not impossible, just that adding sleep was just faster. :)

@minalsha minalsha added distributed framework Priority-High v1.1.0 Issues, PRs, related to the 1.1.0 release labels Aug 10, 2021
@adnapibar adnapibar self-assigned this Aug 24, 2021
@anasalkouz anasalkouz removed the v1.1.0 Issues, PRs, related to the 1.1.0 release label Sep 7, 2021
@adnapibar
Copy link
Contributor

@itiyamas I was unable to reproduce the issue in a deterministic manner. Do you have any specific idea on how to do so?

@adnapibar adnapibar added the blocked Identifies issues that are blocked label Sep 14, 2021
@adnapibar adnapibar removed their assignment Sep 21, 2021
@anasalkouz
Copy link
Member

Hi @itiyamas, I am closing the issue for now, since there is no clear steps to reproduce it. Feel free to re-open if you have more details to add.

@itiyamas
Copy link
Contributor Author

itiyamas commented Jan 13, 2022

@anasalkouz Not sure if this got fixed as a side effect of some other change. I was able to trace through the code and identify what was causing the bug. Are those code paths not relevant anymore? Did you check that? The ones that are shared in the description.

@itiyama
Copy link

itiyama commented Jul 30, 2022

@anasalkouz Please re-open this issue. This is an ongoing issue in one of our clusters. If you know of a fix, I would like to backport it to older versions as well. Also, if you are unable to repro this, I would like to pursue this myself.

@andrross
Copy link
Member

andrross commented Aug 1, 2022

I'm looking into this now

@andrross andrross reopened this Aug 1, 2022
@andrross andrross self-assigned this Aug 1, 2022
@andrross
Copy link
Member

andrross commented Aug 2, 2022

I have cherry picked the modified integration test into the latest on the main branch and have verified that this bug is fairly reliably reproducible. It usually takes me between 3 and 6 attempts to see a failure, never taking more than 10 minutes of running to fail. I'm going to keep digging to see if I can come up with any ideas about how to fix this.

@itiyama
Copy link

itiyama commented Aug 3, 2022

One of the ways that I can think of solving this:

  1. Primary should never fail the replica when primary itself is cancelling replication operations for its own failure. Currently it has no way to distinguish errors originating from replica vs locally originating from primary itself.
  2. Code wise: When primary cancels the replication action, there should be a way to distinguish it from the errors originating from replica(maybe add source in the exception message- primary/replica and corresponding term). Once we have a way to distinguish the failures, it can be handled gracefully in onFailure or in failShardIfNeeded by checking the source of this failure.

What do you think?

@andrross
Copy link
Member

andrross commented Aug 3, 2022

Thanks @itiyama! Yes, the approach you've describe is exactly what I'm looking into.

@andrross
Copy link
Member

andrross commented Aug 4, 2022

@itiyama Please take a look at my draft PR. I still need to make the test deterministic, and some of the various sleeps are still in place. The approach I took to create a specific exception type for primary closure and then branch the logic based on that doesn't feel like the most general-purpose way to solve this, but it does appear to fix the issue. Please let me know what you think. I am worried about unintended side effects here (the worst case being that a replica shard that should have been failed is not failed and then promoted to primary and data is lost, but going with a very targeted fix would seem to make that unlikely).

ritty27 pushed a commit to ritty27/OpenSearch that referenced this issue May 12, 2024
…t#803)

* Bump org.owasp.dependencycheck from 9.0.7 to 9.0.8

Bumps org.owasp.dependencycheck from 9.0.7 to 9.0.8.

---
updated-dependencies:
- dependency-name: org.owasp.dependencycheck
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update changelog

Signed-off-by: dependabot[bot] <support@github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocked Identifies issues that are blocked bug Something isn't working distributed framework Priority-High
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants