Populate RecoveryState details for shallow snapshot restore #15353

Merged

@ltaragi ltaragi commented Aug 22, 2024

Description

  • In a regular active recovery, the _cat/recovery call reports recovery stats like these:
index | shard | time | type | stage | source_host | source_node | target_host | target_node | repository | snapshot | files | files_recovered | files_percent | files_total | bytes | bytes_recovered | bytes_percent | bytes_total | translog_ops | translog_ops_recovered | translog_ops_percent
movies | 0 | 117ms | empty_store | done | n/a | n/a | 172.18.0.4 | odfe-node1 | n/a | n/a | 0 | 0 | 0.0% | 0 | 0 | 0 | 0.0% | 0 | 0 | 0 | 100.0%
movies | 0 | 382ms | peer | done | 172.18.0.4 | odfe-node1 | 172.18.0.3 | odfe-node2 | n/a | n/a | 1 | 1 |  100.0% | 1 | 208 | 208 | 100.0% | 208 | 1 | 1 | 100.0%
  • Values such as bytes_recovered and bytes_total come from the ReplicationLuceneIndex object (returned by state.getIndex()) of the RecoveryState for the shard being recovered:
public Table buildRecoveryTable(RestRequest request, RecoveryResponse response) {
    ...
    for (String index : response.shardRecoveryStates().keySet()) {
        ...
        for (RecoveryState state : shardRecoveryStates) {
            t.startRow();
            t.addCell(index);
            t.addCell(state.getShardId().id());
            ...
            t.addCell(state.getIndex().totalRecoverFiles());
            t.addCell(state.getIndex().recoveredFileCount());
            ...
            t.endRow();
        }
    }
    return t;
}
  • ReplicationLuceneIndex collects this data through addFileDetail() and addRecoveredBytesToFile(), whenever these are called during the restore flow.
  • When a shallow snapshot is restored, these functions are never called, so the stats are not populated.
  • This change adds the file and byte counts, and their recovered percentages, to the RecoveryState of shards being restored from a shallow snapshot.
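
The bookkeeping described above can be illustrated with a minimal sketch. RecoveryIndexSketch is a hypothetical, simplified stand-in for ReplicationLuceneIndex: it borrows only the two method names mentioned in the description, with simplified signatures. The restore() helper, the trackStats flag, and the file name _0.cfs are assumptions made for illustration; this is not the code changed in this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Simplified, hypothetical stand-in for ReplicationLuceneIndex.
 * It mirrors only the two calls named above; it is not the OpenSearch class.
 */
final class RecoveryIndexSketch {

    /** Per-file bookkeeping: expected length and bytes recovered so far. */
    private static final class FileDetail {
        final long length;
        long recovered;

        FileDetail(long length) {
            this.length = length;
        }
    }

    private final Map<String, FileDetail> files = new LinkedHashMap<>();

    /** Registers a file that the restore is expected to bring over. */
    void addFileDetail(String name, long length) {
        files.put(name, new FileDetail(length));
    }

    /** Records progress for a file that was registered earlier. */
    void addRecoveredBytesToFile(String name, long bytes) {
        FileDetail detail = files.get(name);
        if (detail == null) {
            throw new IllegalArgumentException("file was never registered: " + name);
        }
        detail.recovered += bytes;
    }

    int totalFiles() {
        return files.size();
    }

    int recoveredFiles() {
        int count = 0;
        for (FileDetail detail : files.values()) {
            if (detail.recovered >= detail.length) {
                count++;
            }
        }
        return count;
    }

    long totalBytes() {
        long sum = 0;
        for (FileDetail detail : files.values()) {
            sum += detail.length;
        }
        return sum;
    }

    long recoveredBytes() {
        long sum = 0;
        for (FileDetail detail : files.values()) {
            sum += detail.recovered;
        }
        return sum;
    }

    /** Percentage of bytes recovered; 0 when nothing has been registered. */
    float bytesPercent() {
        long total = totalBytes();
        return total == 0 ? 0.0f : 100.0f * recoveredBytes() / total;
    }

    /**
     * Hypothetical restore loop. With trackStats == false it behaves like the
     * path described above where the two calls are never made.
     */
    static RecoveryIndexSketch restore(Map<String, Long> remoteFiles, boolean trackStats) {
        RecoveryIndexSketch index = new RecoveryIndexSketch();
        for (Map.Entry<String, Long> file : remoteFiles.entrySet()) {
            if (trackStats) {
                index.addFileDetail(file.getKey(), file.getValue());
            }
            // The actual file copy would happen here (omitted in this sketch).
            if (trackStats) {
                index.addRecoveredBytesToFile(file.getKey(), file.getValue());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, Long> remoteFiles = new LinkedHashMap<>();
        remoteFiles.put("_0.cfs", 208L); // illustrative file name and size

        RecoveryIndexSketch untracked = restore(remoteFiles, false);
        RecoveryIndexSketch tracked = restore(remoteFiles, true);

        System.out.println("untracked: " + untracked.recoveredFiles() + "/" + untracked.totalFiles()
            + " files, " + untracked.bytesPercent() + "% bytes");
        System.out.println("tracked:   " + tracked.recoveredFiles() + "/" + tracked.totalFiles()
            + " files, " + tracked.bytesPercent() + "% bytes");
    }
}
```

In this sketch the untracked restore leaves every counter at zero even though the files were copied, which is the symptom described for shallow snapshot restores. Making both calls around each copy is enough for the counters and percentages to be filled in.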

Related Issues

Resolves #15434

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@ltaragi ltaragi self-assigned this Aug 22, 2024
@ltaragi ltaragi force-pushed the shallow-snapshot-recovery-stats branch from f8e731b to e6bcc79 Compare August 22, 2024 14:18
❌ Gradle check result for f8e731b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

❌ Gradle check result for e6bcc79: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

ltaragi commented Aug 22, 2024

❌ Gradle check result for f8e731b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

❌ Gradle check result for e6bcc79: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Flaky test #14294

@ltaragi ltaragi force-pushed the shallow-snapshot-recovery-stats branch 2 times, most recently from c482892 to 7bccf91 Compare August 23, 2024 05:42
❌ Gradle check result for c482892: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

❌ Gradle check result for 7bccf91: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@ltaragi ltaragi force-pushed the shallow-snapshot-recovery-stats branch from 7bccf91 to 49141d8 Compare August 26, 2024 14:33
❌ Gradle check result for 49141d8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

❕ Gradle check result for fcaa3a4: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

codecov bot commented Aug 27, 2024

Codecov Report

Attention: Patch coverage is 11.11111% with 8 lines in your changes missing coverage. Please review.

Project coverage is 71.84%. Comparing base (acee2ae) to head (668dfec).
Report is 27 commits behind head on main.

Files with missing lines Patch % Lines
...in/java/org/opensearch/index/shard/IndexShard.java 11.11% 7 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #15353      +/-   ##
============================================
- Coverage     71.87%   71.84%   -0.03%     
- Complexity    63318    63402      +84     
============================================
  Files          5231     5244      +13     
  Lines        296521   296797     +276     
  Branches      42832    42852      +20     
============================================
+ Hits         213113   213230     +117     
- Misses        65948    66128     +180     
+ Partials      17460    17439      -21     


@ltaragi ltaragi added enhancement Enhancement or improvement to existing feature or request Storage:Snapshots labels Aug 27, 2024
Signed-off-by: Lakshya Taragi <lakshya.taragi@gmail.com>
Signed-off-by: Lakshya Taragi <lakshya.taragi@gmail.com>
Signed-off-by: Lakshya Taragi <lakshya.taragi@gmail.com>
@ltaragi ltaragi force-pushed the shallow-snapshot-recovery-stats branch from fcaa3a4 to 668dfec Compare August 29, 2024 06:13
✅ Gradle check result for 668dfec: SUCCESS

@sachinpkale
Member

The changes are covered by an integration test. Codecov does not consider ITs, which is why the check is failing.

@sachinpkale sachinpkale merged commit 3726c52 into opensearch-project:main Aug 29, 2024
33 of 34 checks passed
@sachinpkale sachinpkale added the backport 2.x Backport to 2.x branch label Sep 1, 2024
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-15353-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 3726c52b31e8504e7fcf9cdc1b52a0a404d6c944
# Push it to GitHub
git push --set-upstream origin backport/backport-15353-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-15353-to-2.x.

@sachinpkale sachinpkale added backport 2.x Backport to 2.x branch and removed backport 2.x Backport to 2.x branch labels Sep 2, 2024
sachinpkale pushed a commit to sachinpkale/OpenSearch that referenced this pull request Sep 2, 2024
…ch-project#15353)

---------
Signed-off-by: Lakshya Taragi <lakshya.taragi@gmail.com>
sachinpkale pushed a commit to sachinpkale/OpenSearch that referenced this pull request Sep 2, 2024
…ch-project#15353)

---------
Signed-off-by: Lakshya Taragi <lakshya.taragi@gmail.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
linuxpi pushed a commit that referenced this pull request Sep 2, 2024
…15566)

---------
Signed-off-by: Lakshya Taragi <lakshya.taragi@gmail.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>

Co-authored-by: Lakshya Taragi <157457166+ltaragi@users.noreply.github.com>
akolarkunnu pushed a commit to akolarkunnu/OpenSearch that referenced this pull request Sep 10, 2024
…ch-project#15353)

---------
Signed-off-by: Lakshya Taragi <lakshya.taragi@gmail.com>
Labels
backport 2.x Backport to 2.x branch backport-failed enhancement Enhancement or improvement to existing feature or request skip-changelog Storage:Snapshots v2.17.0
Projects
Status: ✅ Done
3 participants