[BUG] org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testReplicaAlreadyAtCheckpoint is flaky #11254

Closed
jed326 opened this issue Nov 17, 2023 · 1 comment
Labels
bug, flaky-test, untriaged

Comments

@jed326
Collaborator

jed326 commented Nov 17, 2023

Describe the bug
The test SegmentReplicationUsingRemoteStoreIT.testReplicaAlreadyAtCheckpoint is flaky: it failed with the assertion error below, then succeeded on retry within the same Gradle Check run.

See: #11235 (comment)

Lap 16, 2023 7:34:59 PM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
WARNING: Uncaught exception in thread: Thread[#1692,opensearch[node_t4][generic][T#2],5,TGRP-SegmentReplicationUsingRemoteStoreIT]
java.lang.AssertionError:  inconsistent generation 
	at __randomizedtesting.SeedInfo.seed([138297ACFE85FB99]:0)
	at org.opensearch.index.translog.transfer.TranslogCheckpointTransferSnapshot$Builder.build(TranslogCheckpointTransferSnapshot.java:180)
	at org.opensearch.index.translog.RemoteFsTranslog.upload(RemoteFsTranslog.java:338)
	at org.opensearch.index.translog.RemoteFsTranslog.prepareAndUpload(RemoteFsTranslog.java:310)
	at org.opensearch.index.translog.RemoteFsTranslog.sync(RemoteFsTranslog.java:365)
	at org.opensearch.index.translog.InternalTranslogManager.syncTranslog(InternalTranslogManager.java:196)
	at org.opensearch.index.engine.InternalEngine.syncTranslog(InternalEngine.java:610)
	at org.opensearch.index.shard.IndexShard.postActivatePrimaryMode(IndexShard.java:3449)
	at org.opensearch.index.shard.IndexShard.lambda$updateShardState$4(IndexShard.java:727)
	at org.opensearch.index.shard.IndexShard$5.onResponse(IndexShard.java:4052)
	at org.opensearch.index.shard.IndexShard$5.onResponse(IndexShard.java:4022)
	at org.opensearch.index.shard.IndexShard.lambda$asyncBlockOperations$37(IndexShard.java:3973)
	at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82)
	at org.opensearch.index.shard.IndexShardOperationPermits$1.doRun(IndexShardOperationPermits.java:157)
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:908)
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)

REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testReplicaAlreadyAtCheckpoint" -Dtests.seed=138297ACFE85FB99 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=lt -Dtests.timezone=Europe/Warsaw -Druntime.java=21
REPRODUCE WITH: ./gradlew ':server:internalClusterTest' --tests "org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT" -Dtests.seed=138297ACFE85FB99 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en -Dtests.timezone=Etc/UTC -Druntime.java=21
NOTE: test params are: codec=Asserting(Lucene95): {index_uuid=Lucene90, id=PostingsFormat(name=Asserting), type=PostingsFormat(name=Asserting), body=Lucene90}, docValues:{dv=DocValuesFormat(name=Asserting)}, maxPointsInLeafNode=400, maxMBSortInHeap=7.339228541543765, sim=Asserting(RandomSimilarity(queryNorm=false): {body=DFR GL2}), locale=lt, timezone=Europe/Warsaw
NOTE: Linux 5.15.0-1048-aws amd64/Eclipse Adoptium 21.0.1 (64-bit)/cpus=32,threads=1,free=397764792,total=1046478848
NOTE: All tests run in this JVM: [IndicesRequestIT, ValidateIndicesAliasesRequestIT, ForceMergeBlocksIT, SearchProgressActionListenerIT, ClusterInfoServiceIT, RemoveSettingsCommandIT, UpdateShardAllocationSettingsIT, StableClusterManagerDisruptionIT, GetActionIT, DynamicMappingIT, IndexActionIT, RandomExceptionCircuitBreakerIT, InternalSettingsIT, IngestProcessorNotInstalledOnAllNodesIT, PrimaryTermValidationIT, SegmentReplicationUsingRemoteStoreIT]
Lap 16, 2023 7:53:24 PM org.apache.lucene.store.MemorySegmentIndexInputProvider <init>
INFO: Using MemorySegmentIndexInput with Java 21; to disable start with -Dorg.apache.lucene.store.MMapDirectory.enableMemorySegments=false
jed326 added the bug, untriaged, and flaky-test labels Nov 17, 2023
@mch2
Member

mch2 commented Nov 17, 2023

Closing as duplicate of #11255

mch2 closed this as completed Nov 17, 2023