-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use SegmentReplicationTarget test to validate non-active on-disk files are reused for replication #11786
Conversation
❌ Gradle check result for 8b69e04: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Compatibility status:Checks if related components are compatible with change 95a866a Incompatible componentsSkipped componentsCompatible componentsCompatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/performance-analyzer.git] |
…s are reused for replication Signed-off-by: Suraj Singh <surajrider@gmail.com>
Signed-off-by: Suraj Singh <surajrider@gmail.com>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #11786 +/- ##
============================================
- Coverage 71.31% 71.18% -0.14%
+ Complexity 59221 59149 -72
============================================
Files 4911 4911
Lines 278667 278667
Branches 40528 40528
============================================
- Hits 198735 198363 -372
- Misses 63375 63776 +401
+ Partials 16557 16528 -29 ☔ View full report in Codecov by Sentry. |
…s are reused for replication (#11786) * Use SegmentReplicationTarget test to validate non-active on-disk files are reused for replication Signed-off-by: Suraj Singh <surajrider@gmail.com> * Fix original shard level unit test Signed-off-by: Suraj Singh <surajrider@gmail.com> --------- Signed-off-by: Suraj Singh <surajrider@gmail.com> (cherry picked from commit fe98aad) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…s are reused for replication (#11786) * Use SegmentReplicationTarget test to validate non-active on-disk files are reused for replication Signed-off-by: Suraj Singh <surajrider@gmail.com> * Fix original shard level unit test Signed-off-by: Suraj Singh <surajrider@gmail.com> --------- Signed-off-by: Suraj Singh <surajrider@gmail.com> (cherry picked from commit fe98aad) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…s are reused for replication (#11786) * Use SegmentReplicationTarget test to validate non-active on-disk files are reused for replication Signed-off-by: Suraj Singh <surajrider@gmail.com> * Fix original shard level unit test Signed-off-by: Suraj Singh <surajrider@gmail.com> --------- Signed-off-by: Suraj Singh <surajrider@gmail.com> (cherry picked from commit fe98aad) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…s are reused for replication (#11786) (#11806) * Use SegmentReplicationTarget test to validate non-active on-disk files are reused for replication * Fix original shard level unit test --------- (cherry picked from commit fe98aad) Signed-off-by: Suraj Singh <surajrider@gmail.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…s are reused for replication (opensearch-project#11786) * Use SegmentReplicationTarget test to validate non-active on-disk files are reused for replication Signed-off-by: Suraj Singh <surajrider@gmail.com> * Fix original shard level unit test Signed-off-by: Suraj Singh <surajrider@gmail.com> --------- Signed-off-by: Suraj Singh <surajrider@gmail.com>
…s are reused for replication (opensearch-project#11786) * Use SegmentReplicationTarget test to validate non-active on-disk files are reused for replication Signed-off-by: Suraj Singh <surajrider@gmail.com> * Fix original shard level unit test Signed-off-by: Suraj Singh <surajrider@gmail.com> --------- Signed-off-by: Suraj Singh <surajrider@gmail.com> Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Description
This change updates
testSegRepSucceedsOnPreviousCopiedFiles
shard level test to wait for file copy operations to complete in first round of segment replication before proceeding to next round without which it is possible to have concurrent access on segment files. 1) FIle copy operations from first round 2) File checksum validation from next round.Issues Resolves
Resolves #10885
Background
With #9630, we introduced optimization to avoid re-downloading files already on disk but not referenced by active reader. With #10725, we made RemoteStoreReplicationSource compliant with ReplicationSource by making download logic asynchronous. I suspect this resulted scenario where file copy remains in-progress from first round of segment replication which breaks checksum validations on next round.
Error trace for failure builds
testSegRepSucceedsOnPreviousCopiedFiles steps
Step 1. Index docs
Step 2. Use custom RemoteStoreReplicationSource where exception is post
getSegmentFiles
. With background files are been copied, we halted first round of segment replication pre-maturely. Below code shows the custom remote store replication source used during first round of segrep. We throw exception post getsegmentFiles method call.Issue: At step 2, since download (getSegmentFiles) is async in nature, throwing exception right after getSegmentFiles is not right. We need mechanism that tells files were downloaded completely.
Check List
- [ ] Public documentation issue/PR createdBy submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.