Optimize checksum creation for remote cluster state #16046

Merged: 11 commits, Oct 1, 2024

himshikha
Contributor

Description

This change parallelizes checksum creation for cluster state components, with the aim of reducing remote state publication latency.
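
For readers unfamiliar with the idea, here is a minimal sketch of what computing per-component checksums in parallel can look like. It is not the code in this PR: the class name `ParallelChecksumSketch`, the helper names, the component keys, the use of `CRC32`, and the fixed thread pool are all assumptions made for illustration. Each component is assumed to be already serialized to bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.CRC32;

// Hypothetical illustration only; names and structure are not taken from OpenSearch.
class ParallelChecksumSketch {

    // Checksum of one serialized component. CRC32 is an assumption of this sketch.
    static long checksumOf(byte[] serialized) {
        CRC32 crc = new CRC32();
        crc.update(serialized, 0, serialized.length);
        return crc.getValue();
    }

    // Submits one checksum task per component, then waits for all of them.
    static Map<String, Long> computeChecksums(Map<String, byte[]> components, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<Long>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, byte[]> entry : components.entrySet()) {
                byte[] bytes = entry.getValue();
                Callable<Long> task = () -> checksumOf(bytes);
                pending.put(entry.getKey(), pool.submit(task));
            }
            // Collect results in submission order; get() blocks until each task finishes.
            Map<String, Long> checksums = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Long>> entry : pending.entrySet()) {
                checksums.put(entry.getKey(), entry.getValue().get());
            }
            return checksums;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Component names below are placeholders, not real cluster state keys.
        Map<String, byte[]> components = new LinkedHashMap<>();
        components.put("componentA", "first payload".getBytes(StandardCharsets.UTF_8));
        components.put("componentB", "second payload".getBytes(StandardCharsets.UTF_8));
        components.put("componentC", "third payload".getBytes(StandardCharsets.UTF_8));

        Map<String, Long> checksums = computeChecksums(components, 3);
        checksums.forEach((name, value) -> System.out.println(name + " -> " + value));
    }
}
```

Because each component's checksum is independent of the others, the tasks need no shared mutable state, so the wall-clock cost approaches that of the slowest component rather than the sum of all of them. Whether that shows up as lower publication latency depends on component sizes and available threads.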

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Himshikha Gupta added 2 commits September 23, 2024 18:55
Signed-off-by: Himshikha Gupta <himshikh@amazon.com>
Signed-off-by: Himshikha Gupta <himshikh@amazon.com>
Contributor

❌ Gradle check result for 1b5edaa: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Collaborator

@Bukhtawar Bukhtawar left a comment

Thanks @himshikha, would you be able to share the perf numbers for this change?

Signed-off-by: Himshikha Gupta <himshikh@amazon.com>
Contributor

❌ Gradle check result for 542d709: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Himshikha Gupta <himshikh@amazon.com>
Signed-off-by: Himshikha Gupta <himshikh@amazon.com>
Contributor

❌ Gradle check result for 93f9a43: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@himshikha
Contributor Author

❌ Gradle check result for 93f9a43: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Flaky test #14324

Contributor

github-actions bot commented Oct 1, 2024

❌ Gradle check result for f14898c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@himshikha
Contributor Author

❌ Gradle check result for f14898c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Flaky test #15944, #15812

Signed-off-by: Himshikha Gupta <himshikh@amazon.com>
Contributor

github-actions bot commented Oct 1, 2024

❕ Gradle check result for 989e4fe: UNSTABLE

  • TEST FAILURES:
      3 org.opensearch.index.shard.RemoteIndexShardTests.classMethod
      1 org.opensearch.search.SearchTimeoutIT.testSimpleTimeout {p0={"search.concurrent_segment_search.enabled":"true"}}
      1 org.opensearch.index.shard.RemoteIndexShardTests.testSegmentReplication_With_EngineClosedConcurrently

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@Bukhtawar Bukhtawar merged commit a767e92 into opensearch-project:main Oct 1, 2024
34 checks passed
@Bukhtawar Bukhtawar added the backport 2.x Backport to 2.x branch label Oct 1, 2024
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-16046-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 a767e92f3eeaf57c066deb9ad075d40ed00f4a58
# Push it to GitHub
git push --set-upstream origin backport/backport-16046-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-16046-to-2.x.

himshikha added a commit to himshikha/OpenSearch that referenced this pull request Oct 1, 2024
…ct#16046)

* Support parallelisation in remote publication checksum computation

Signed-off-by: Himshikha Gupta <himshikh@amazon.com>
@himshikha
Contributor Author

Backport PR #16150

hainenber pushed a commit to hainenber/OpenSearch that referenced this pull request Oct 1, 2024
…ct#16046)

* Support parallelisation in remote publication checksum computation

Signed-off-by: Himshikha Gupta <himshikh@amazon.com>
Bukhtawar pushed a commit that referenced this pull request Oct 1, 2024
* Support parallelisation in remote publication checksum computation

Signed-off-by: Himshikha Gupta <himshikh@amazon.com>
@sandeshkr419
Contributor

Thanks @himshikha for this change.

@Bukhtawar @himshikha - I think changes like this should have a changelog so that they can be called out in performance gains / optimizations as part of release notes.

dk2k pushed a commit to dk2k/OpenSearch that referenced this pull request Oct 16, 2024
…ct#16046)

* Support parallelisation in remote publication checksum computation

Signed-off-by: Himshikha Gupta <himshikh@amazon.com>
dk2k pushed a commit to dk2k/OpenSearch that referenced this pull request Oct 17, 2024
…ct#16046)

* Support parallelisation in remote publication checksum computation

Signed-off-by: Himshikha Gupta <himshikh@amazon.com>
dk2k pushed a commit to dk2k/OpenSearch that referenced this pull request Oct 21, 2024
…ct#16046)

* Support parallelisation in remote publication checksum computation

Signed-off-by: Himshikha Gupta <himshikh@amazon.com>
3 participants