[fix](local shuffle) Fix unbalanced data distribution #44137
clang-tidy review says "All clean, LGTM! 👍"
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
What problem does this PR solve?

`LOCAL_EXCHANGE_OPERATOR (BUCKET_HASH_SHUFFLE)` currently uses the wrong bucket count to distribute rows, which results in an unbalanced distribution. This PR fixes it. In the profiles below, the total and average `RowsProduced` are unchanged, but the maximum drops from 10.93M to 2.74M and the minimum rises from 0 to 2.69M.

Before:

```
LOCAL_EXCHANGE_OPERATOR (BUCKET_HASH_SHUFFLE) (id=-10):
   - RowsProduced: sum 32.636547M (32636547), avg 2.719712M (2719712), max 10.932368M (10932368), min 0
```

After:

```
LOCAL_EXCHANGE_OPERATOR (BUCKET_HASH_SHUFFLE) (id=-10):
   - RowsProduced: sum 32.636547M (32636547), avg 2.719712M (2719712), max 2.736918M (2736918), min 2.6938M (2693800)
```
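The description does not include the changed code, so the following is only a toy model of this kind of skew and not the Doris implementation. It assumes rows are routed by `hash % modulus` and that the resulting bucket index is folded onto a local instance with `bucket % num_instances`. The function `rows_per_instance`, its parameters, and that mapping are all hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, NOT Doris code: counts how many rows each local
// instance receives when a row is routed by `hash % modulus` and the
// bucket index is then mapped to an instance by `bucket % num_instances`
// (an assumed mapping, used here only to illustrate the effect).
std::vector<std::size_t> rows_per_instance(const std::vector<std::uint64_t>& hashes,
                                           std::size_t modulus,
                                           std::size_t num_instances) {
    assert(modulus > 0 && num_instances > 0);
    std::vector<std::size_t> counts(num_instances, 0);
    for (std::uint64_t hash : hashes) {
        // If `modulus` is smaller than the real bucket count, only the
        // first `modulus` bucket indices can ever be produced.
        std::size_t bucket = static_cast<std::size_t>(hash % modulus);
        counts[bucket % num_instances] += 1;
    }
    return counts;
}
```

With 120,000 sequential values standing in for hash output and 12 instances (roughly the sum/avg ratio in the profile above), a modulus of 3 sends 40,000 rows to each of the first three instances and none to the other nine, which mirrors the `min 0` in the "Before" profile. A modulus of 12 gives every instance exactly 10,000 rows.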
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merges this PR)