[Backport to 2.x] [Segment Replication] - Update replicas to commit SegmentInfos instead of relying on segments_N from primary shards. #4450

Merged

Conversation

dreamer-89
Member

Manual backport of #4402 to 2.x

@dreamer-89 dreamer-89 requested review from a team and reta as code owners September 7, 2022 18:42
@dreamer-89
Member Author

Something looks missing: this backport PR contains only one file change.

@dreamer-89 dreamer-89 force-pushed the mch2_replicaCommits_2x branch from 469228a to 62049fd Compare September 7, 2022 18:52
@dreamer-89 dreamer-89 requested a review from mch2 September 7, 2022 18:53
@github-actions
Contributor

github-actions bot commented Sep 7, 2022

Gradle Check (Jenkins) Run Completed with:

@dreamer-89
Member Author

Java compilation error :(

> Task :rest-api-spec:validatePom

        TranslogManager translogManager = engine.translogManager();
> Task :test:framework:compileJava
                                                ^
  symbol:   method translogManager()
  location: variable engine of type Engine

> Task :server:processTestResources
> Task :server:dependencyLicenses
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 error

> Task :test:framework:compileJava

> Task :test:framework:compileJava FAILED
> Task :server:filepermissions

@github-actions
Contributor

github-actions bot commented Sep 7, 2022

Gradle Check (Jenkins) Run Completed with:

@dreamer-89
Member Author

Java spotless failure.

@dreamer-89
Member Author

FAILURE: Build completed with 2 failures.

1: Task failed with an exception.
-----------
* What went wrong:
Execution failed for task ':distribution:bwc:maintenance:buildBwcLinuxTar'.
> Building 1.3.5 didn't generate expected file /var/jenkins/workspace/gradle-check/search/distribution/bwc/maintenance/build/bwc/checkout-1.3/distribution/archives/linux-tar/build/distributions/opensearch-min-1.3.5-SNAPSHOT-linux-x64.tar.gz

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
==============================================================================

2: Task failed with an exception.
-----------
* What went wrong:
org.gradle.api.GradleException: Reaper process failed. Check log at /var/jenkins/workspace/gradle-check/search/.gradle/reaper/build-13277/error.log for details
> Reaper process failed. Check log at /var/jenkins/workspace/gradle-check/search/.gradle/reaper/build-13277/error.log for details

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
==============================================================================

* Get more help at https://help.gradle.org/

BUILD FAILED in 16m 29s
2432 actionable tasks: 2418 executed, 2 from cache, 12 up-to-date
Gradle Check Failed!

@dreamer-89 dreamer-89 mentioned this pull request Sep 7, 2022
6 tasks
mch2 and others added 4 commits September 7, 2022 15:15
…uting metadata snapshot on primary shards. (opensearch-project#4366)

* Segment Replication - Fix NoSuchFileException errors caused when computing metadata snapshot on primary shards.

This change fixes errors that occur when computing metadata snapshots on primary shards from the latest in-memory SegmentInfos. The error occurs when a segments_N file referenced by the in-memory infos is deleted by a concurrent commit. The segment files themselves are incref'd by IndexWriter.incRefDeleter, but the commit file (segments_N) is not. This change resolves the problem by ignoring the segments_N file when computing metadata for CopyState and sending only incref'd segment files to replicas.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Fix spotless.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Update StoreTests.testCleanupAndPreserveLatestCommitPoint to assert additional segments are deleted.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Rename snapshot to metadataMap in CheckpointInfoResponse.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Refactor segmentReplicationDiff method to compute off two maps instead of MetadataSnapshots.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Fix spotless.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Revert catchall in SegmentReplicationSourceService.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Revert log lvl change.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Fix SegmentReplicationTargetTests

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Cleanup unused logger.

Signed-off-by: Marc Handalian <handalm@amazon.com>

Signed-off-by: Marc Handalian <handalm@amazon.com>
Co-authored-by: Suraj Singh <surajrider@gmail.com>
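
The idea in the commit message above (leave segments_N out of the metadata sent to replicas, then compute the replication diff from two plain maps) can be illustrated with a minimal sketch. It is not the OpenSearch implementation: `CopyStateSketch`, `metadataWithoutCommitFile`, and `filesToFetch` are hypothetical names, and the maps of file name to checksum string are an assumption standing in for the real store metadata.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch only: these names are illustrative, not OpenSearch APIs.
final class CopyStateSketch {

    // Assumption: commit point files are recognised by the "segments_" prefix.
    static boolean isCommitFile(String fileName) {
        return fileName.startsWith("segments_");
    }

    // Returns a copy of the file-name -> checksum map with every segments_N
    // entry dropped, so only the segment files themselves are advertised.
    static Map<String, String> metadataWithoutCommitFile(Map<String, String> files) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> entry : files.entrySet()) {
            if (!isCommitFile(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    // Computes the diff from two maps: a file must be fetched when the replica
    // does not have it or holds a copy with a different checksum.
    static Set<String> filesToFetch(Map<String, String> primary, Map<String, String> replica) {
        Set<String> missingOrDifferent = new TreeSet<>();
        for (Map.Entry<String, String> entry : primary.entrySet()) {
            if (!entry.getValue().equals(replica.get(entry.getKey()))) {
                missingOrDifferent.add(entry.getKey());
            }
        }
        return missingOrDifferent;
    }
}
```

Because the commit file never appears in the primary's map, a concurrent commit that deletes segments_N on the primary cannot cause the replica to request a file that no longer exists.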
…d of relying on segments_N from primary shards. (opensearch-project#4402)

* Segment Replication - Update replicas to commit SegmentInfos instead of relying on segments_N from primary shards.

This change updates replicas to commit SegmentInfos before the shard is closed, on receiving a new commit point from a primary, and when a new primary is detected. This change also makes the public commitSegmentInfos on NRTEngine obsolete, refactoring IndexShard to simply call reset on the engine.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Remove noise & extra log statement.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* PR feedback.

Signed-off-by: Marc Handalian <handalm@amazon.com>

Signed-off-by: Marc Handalian <handalm@amazon.com>
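
The three commit triggers named in the commit message above (shard close, a new commit point from the primary, and detection of a new primary) can be modelled with a small state machine. This is a sketch under stated assumptions, not the NRTReplicationEngine code: `ReplicaCommitSketch` and all of its methods are hypothetical, and it assumes that a changed SegmentInfos generation is what signals a new commit point on the primary.

```java
// Hypothetical sketch only: names and structure are illustrative, not OpenSearch APIs.
final class ReplicaCommitSketch {
    private long inMemoryGeneration;           // generation of the latest infos copied from the primary
    private long lastCommittedGeneration = -1; // -1 means nothing has been committed locally yet
    private int commitCount;
    private boolean closed;

    ReplicaCommitSketch(long initialGeneration) {
        this.inMemoryGeneration = initialGeneration;
    }

    // Called when new segment infos arrive from the primary.
    // Assumption: a different generation means the primary wrote a new commit point.
    void updateSegments(long incomingGeneration) {
        if (closed) {
            throw new IllegalStateException("engine is closed");
        }
        boolean newCommitPoint = incomingGeneration != inMemoryGeneration;
        inMemoryGeneration = incomingGeneration;
        if (newCommitPoint) {
            commitLocally();
        }
    }

    // Called when the replica learns that a new primary has taken over.
    void onNewPrimaryDetected() {
        commitLocally();
    }

    // Commits once before closing; repeated calls are no-ops.
    void close() {
        if (!closed) {
            commitLocally();
            closed = true;
        }
    }

    // Stands in for writing the replica's own commit file from its in-memory infos.
    private void commitLocally() {
        lastCommittedGeneration = inMemoryGeneration;
        commitCount++;
    }

    long lastCommittedGeneration() {
        return lastCommittedGeneration;
    }

    int commitCount() {
        return commitCount;
    }
}
```

In this model the replica always writes its own commit from the infos it holds in memory, so it never needs the primary's segments_N file on disk.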
…anslog()

Signed-off-by: Suraj Singh <surajrider@gmail.com>
Signed-off-by: Suraj Singh <surajrider@gmail.com>
@dreamer-89 dreamer-89 force-pushed the mch2_replicaCommits_2x branch from 407dc92 to 8a18787 Compare September 7, 2022 22:15
Member

@mch2 mch2 left a comment

thanks for the backport @dreamer-89

@codecov-commenter

Codecov Report

Merging #4450 (8a18787) into 2.x (ceb0e17) will decrease coverage by 0.04%.
The diff coverage is 96.00%.

@@             Coverage Diff              @@
##                2.x    #4450      +/-   ##
============================================
- Coverage     70.67%   70.62%   -0.05%     
+ Complexity    57179    57153      -26     
============================================
  Files          4585     4585              
  Lines        274506   274507       +1     
  Branches      40227    40229       +2     
============================================
- Hits         193996   193878     -118     
- Misses        64278    64389     +111     
- Partials      16232    16240       +8     
| Impacted Files | Coverage Δ |
|---|---|
| ...va/org/opensearch/index/engine/EngineTestCase.java | 84.89% <66.66%> (-1.97%) ⬇️ |
| .../opensearch/index/engine/NRTReplicationEngine.java | 76.92% <100.00%> (+1.08%) ⬆️ |
| ...arch/index/engine/NRTReplicationReaderManager.java | 87.50% <100.00%> (+0.54%) ⬆️ |
| ...in/java/org/opensearch/index/shard/IndexShard.java | 70.52% <100.00%> (-0.37%) ⬇️ |
| ...g/opensearch/indices/recovery/MultiFileWriter.java | 84.00% <100.00%> (+0.16%) ⬆️ |
| ...org/opensearch/index/shard/IndexShardTestCase.java | 93.67% <100.00%> (-1.03%) ⬇️ |
| ...java/org/opensearch/client/indices/DataStream.java | 0.00% <0.00%> (-76.09%) ⬇️ |
| .../opensearch/client/indices/CloseIndexResponse.java | 17.50% <0.00%> (-73.75%) ⬇️ |
| ...h/action/ingest/SimulateDocumentVerboseResult.java | 60.71% <0.00%> (-39.29%) ⬇️ |
| ...ion/admin/cluster/node/info/PluginsAndModules.java | 53.12% <0.00%> (-34.38%) ⬇️ |
| ... and 511 more | |

@dreamer-89 dreamer-89 merged commit 4170d37 into opensearch-project:2.x Sep 7, 2022
3 participants