[AMORO-1035][Flink] Support customizing mixed-format table source parallelism #1973
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #1973      +/-   ##
============================================
- Coverage     32.64%   30.76%   -1.88%
+ Complexity     4482     3896     -586
============================================
  Files           599      553      -46
  Lines         50321    45613    -4708
  Branches       6691     6178     -513
============================================
- Hits          16426    14035    -2391
+ Misses        32578    30489    -2089
+ Partials       1317     1089     -228
Flags with carried forward coverage won't be shown. |
LGTM.
Can you show how the Flink web UI renders the job after the change? As far as I know, Flink SQL does not yet support custom source parallelism. In Flink SQL, every operator takes the parallelism of the previous operator as its own, so after I customize source.parallelism, I think the subsequent tasks will no longer run with the job's default parallelism. |
@huyuanfeng2018 Thanks for your comment. Actually, we can modify the source operator's parallelism through the Table API. In the [image]
select * from
log_data /*+ OPTIONS('arctic.read.mode'='log', 'properties.group.id'='......', 'parallelism'='5') */
-- The job parallelism is 10. Specify the source parallelism with a SQL hint. |
What confuses me is how the subsequent task changed back to 10 after the scan parallelism was changed. Let me give you an example. From the perspective of flink-table-planner, often a |
The above example has a window aggregate operator, which cannot be chained with the preceding source and calc operators, so it runs with a parallelism of 10 while the source and calc operators run with a parallelism of 5. This PR changes the parallelism not only of the source operator but also of the operators in its chain. |
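The chaining behaviour described in this reply can be illustrated with a toy model. This is a plain-Java sketch, not Flink or Amoro code: the `Op` class, the `chainedToPrevious` flag, and `assignParallelism` are hypothetical names, and the sketch simply assumes that an operator chained to its predecessor inherits that predecessor's parallelism, while an operator that cannot be chained falls back to the job parallelism.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChainParallelismSketch {

    // Hypothetical stand-in for an operator in the job graph; not a Flink class.
    static final class Op {
        final String name;
        final boolean chainedToPrevious;

        Op(String name, boolean chainedToPrevious) {
            this.name = name;
            this.chainedToPrevious = chainedToPrevious;
        }
    }

    // Assumption: the first operator is the source and uses the hinted parallelism;
    // chained operators inherit from their predecessor; the rest use the job default.
    static List<Integer> assignParallelism(List<Op> ops, int sourceParallelism, int jobParallelism) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < ops.size(); i++) {
            if (i == 0) {
                result.add(sourceParallelism);
            } else if (ops.get(i).chainedToPrevious) {
                result.add(result.get(i - 1));
            } else {
                result.add(jobParallelism);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Op> ops = Arrays.asList(
                new Op("source", false),
                new Op("calc", true),               // chained with the source
                new Op("window aggregate", false)); // cannot be chained
        // Source hint 'parallelism'='5', job parallelism 10, as in the discussion.
        System.out.println(assignParallelism(ops, 5, 10)); // prints [5, 5, 10]
    }
}
```

With the values from the discussion (source hint 5, job parallelism 10), the model gives 5 for the source and calc operators and 10 for the window aggregate, matching the behaviour described above.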
17ef90c to 646cf0a
LGTM.
3d9ebba to 2822226
Fixed the conflicts after the spotless tool was added to the Flink module. |
2822226 to 8187bde
8187bde to b534d21
...k/flink-common/src/main/java/com/netease/arctic/flink/table/descriptors/ArcticValidator.java
LGTM |
LGTM.
Why are the changes needed?
Support customizing the parallelism of the filestore/logstore source via Flink SQL.
Close #1035.
Brief change log
How was this patch tested?
Add some test cases that check the changes thoroughly including negative and positive cases if possible
Add screenshots for manual tests if appropriate
Run test locally before making a pull request
Documentation