Fix weighted shard routing state across search requests #6004
Signed-off-by: Anshu Agarwal <anshukag@amazon.com>
final ShardShuffler shuffler;
final ShardShuffler shufflerForWeightedRouting;
Do we need a second shuffler, or would it be enough to create another method that prevents the shuffle for onlyNodeSelectorActiveInitializingShardsIt?
The issue is with this line: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/cluster/routing/IndexShardRoutingTable.java#L587. It advances the shuffler one extra time. For example, in a three-node cluster with weights (1,1,0), the second request hits the same node as the first, because the shuffler moves by 2 during the first request.
We need onlyNodeSelectorActiveInitializingShardsIt to get the shards that belong to nodes with weight zero, so introducing a variant of that method without the shuffler would require code duplication.
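The effect described above can be reproduced with a small, self-contained simulation. This is only a sketch under stated assumptions: `Rotation` and `route` are hypothetical names invented for illustration, not OpenSearch classes, and the model assumes that the first eligible node is picked by taking a shared, incrementing seed modulo the number of nodes with non-zero weight.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class DoubleAdvanceDemo {
    // Hypothetical stand-in for a shared, incrementing shuffler seed (not the OpenSearch class).
    static final class Rotation {
        private final AtomicInteger seed = new AtomicInteger();

        int nextSeed() {
            return seed.getAndIncrement();
        }
    }

    /** Simulates `requests` searches and returns the node tried first for each one. */
    static List<String> route(List<String> eligibleNodes, int requests, boolean extraAdvance) {
        Rotation rotation = new Rotation();
        List<String> chosen = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            int seed = rotation.nextSeed();
            chosen.add(eligibleNodes.get(Math.floorMod(seed, eligibleNodes.size())));
            if (extraAdvance) {
                // Models the second, unintended advance that happens within the same request.
                rotation.nextSeed();
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Weights (1,1,0): only node1 and node2 are eligible; node3 has weight zero.
        List<String> eligible = List.of("node1", "node2");
        System.out.println("advance by 2 per request: " + route(eligible, 4, true));
        System.out.println("advance by 1 per request: " + route(eligible, 4, false));
    }
}
```

With two eligible nodes, a seed that grows by 2 per request always has the same parity, so every request in the first run lands on node1, while the second run alternates between node1 and node2.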
Let's add a code comment explaining this.
Codecov Report
@@             Coverage Diff              @@
##               main    #6004      +/-   ##
============================================
+ Coverage     70.75%   70.81%    +0.05%
- Complexity    58720    58738      +18
============================================
  Files          4771     4771
  Lines        280818   280819       +1
  Branches      40568    40568
============================================
+ Hits         198704   198860     +156
+ Misses        65824    65630     -194
- Partials      16290    16329      +39
Force-pushed from a5b2125 to 69d20ba.
"This PR fixes issue due to which state is not maintained across weighted shard routing search requests."
Can you provide more detail about what is changing here (or link an issue that explains the problem in detail)? "State is not maintained" is the cause, but what is the actual issue, i.e. what behavior change will a user see as a result of this fix?
CHANGELOG.md
@@ -69,6 +69,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- Fix 'org.apache.hc.core5.http.ParseException: Invalid protocol version' under JDK 16+ ([#4827](https://github.com/opensearch-project/OpenSearch/pull/4827))
- Fixed compression support for h2c protocol ([#4944](https://github.com/opensearch-project/OpenSearch/pull/4944))
- Support OpenSSL Provider with default Netty allocator ([#5460](https://github.com/opensearch-project/OpenSearch/pull/5460))
- Fix weighted shard routing state across search requests([#6004](https://github.com/opensearch-project/OpenSearch/pull/6004))
Do you intend to backport this to 2.x? If so, you'll need to put this in the [Unreleased 2.x] section.
Yes, I moved the entry to the 2.x section. Thanks for pointing this out.
Created an issue with the details: #6056
The CHANGELOG entry should move to the 2.x section.
 * Test to validate that shard routing state is maintained across requests, requests are assigned to nodes
 * according to assigned routing weights
 */
public void testWeightedRoutingShardStateWithDifferentWeights() {
I wonder how we missed this regression, since we had ITs covering the same scenario. If we did not, let's please add one.
Yes, we have added a test now.
* Fix maintaining state across search requests with weighted shard routing Signed-off-by: Anshu Agarwal <anshukag@amazon.com> (cherry picked from commit cd860ec) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Fix maintaining state across search requests with weighted shard routing (cherry picked from commit cd860ec) Signed-off-by: Anshu Agarwal <anshukag@amazon.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Andrew Ross <andrross@amazon.com>
Signed-off-by: Anshu Agarwal anshukag@amazon.com
Description
This PR fixes an issue where state is not maintained across search requests that use weighted shard routing. The shuffler is advanced twice within a single call, which causes the issue. The PR adds logic to prevent the unwanted shuffler movement.
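One way to picture the approach is to give the weight-zero lookup its own seed, so that it no longer disturbs the rotation over the weighted nodes. The sketch below is illustrative only and is not the code from this PR: `SeparateSeedSketch`, `primarySeed`, `zeroWeightSeed`, `rotate` and `order` are hypothetical names, and which of the two shufflers in the diff serves which lookup is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class SeparateSeedSketch {
    // Hypothetical counters; they loosely mirror the idea of keeping two independent shufflers.
    private final AtomicInteger primarySeed = new AtomicInteger();
    private final AtomicInteger zeroWeightSeed = new AtomicInteger();

    /** Returns a new list with the items rotated left by seed modulo size. */
    private static <T> List<T> rotate(List<T> items, int seed) {
        if (items.isEmpty()) {
            return new ArrayList<>();
        }
        int offset = Math.floorMod(seed, items.size());
        List<T> rotated = new ArrayList<>(items.subList(offset, items.size()));
        rotated.addAll(items.subList(0, offset));
        return rotated;
    }

    /** Orders nodes for one request: weighted nodes first, weight-zero nodes last as a fallback. */
    List<String> order(List<String> weightedNodes, List<String> zeroWeightNodes) {
        List<String> result = rotate(weightedNodes, primarySeed.getAndIncrement());
        // The fallback lookup advances its own counter, so the primary seed moves once per request.
        result.addAll(rotate(zeroWeightNodes, zeroWeightSeed.getAndIncrement()));
        return result;
    }

    public static void main(String[] args) {
        SeparateSeedSketch routing = new SeparateSeedSketch();
        for (int i = 0; i < 3; i++) {
            System.out.println(routing.order(List.of("node1", "node2"), List.of("node3")));
        }
    }
}
```

Because the primary seed now advances exactly once per request, consecutive requests alternate between node1 and node2 as the first choice, and node3 stays at the end of the list.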
Issues Resolved
#6056
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.