[AMORO-1883] Perform file filtering when adding files to PartitionEvaluator instead of during splitting tasks #1886

Merged
zhoujinsong merged 16 commits into apache:master from fix-1883 on Aug 28, 2023

Conversation

wangtaohz
Contributor

@wangtaohz wangtaohz commented Aug 24, 2023

Why are the changes needed?

fix #1883

Brief change log

  • filter files when adding files to PartitionEvaluator and AbstractPartitionPlan
  • split tasks based on rewriteFiles and rewritePosFiles instead of fragment files and segment files
  • remove taskNeedExecute after splitting tasks
  • remove addPartitionProperties from PartitionEvaluator, add it to the constructor
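
The first two items above can be illustrated with a minimal sketch. It is not the code from this patch: `EarlyFilterSketch`, `FileInfo`, `PartitionEvaluatorSketch`, the size-threshold rule, and the greedy packing are all hypothetical stand-ins, and only the name `rewriteFiles` is borrowed from the change log. The point it shows is that a file is accepted or rejected when it is added, so task splitting only ever sees files that need rewriting.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these classes do not mirror Amoro's real API.
final class EarlyFilterSketch {

  /** Minimal stand-in for a data file: just a path and a size in bytes. */
  static final class FileInfo {
    final String path;
    final long sizeBytes;

    FileInfo(String path, long sizeBytes) {
      this.path = path;
      this.sizeBytes = sizeBytes;
    }
  }

  /** Evaluator that filters each file at the moment it is added. */
  static final class PartitionEvaluatorSketch {
    // Assumed rule for illustration: only files smaller than this are rewritten.
    private final long rewriteThresholdBytes;
    private final List<FileInfo> rewriteFiles = new ArrayList<>();

    PartitionEvaluatorSketch(long rewriteThresholdBytes) {
      this.rewriteThresholdBytes = rewriteThresholdBytes;
    }

    /** Keeps the file only if it needs rewriting; returns whether it was kept. */
    boolean addFile(FileInfo file) {
      if (file.sizeBytes >= rewriteThresholdBytes) {
        return false; // filtered out here, so task splitting never sees it
      }
      rewriteFiles.add(file);
      return true;
    }

    /**
     * Greedily packs the already-filtered files into tasks, starting a new task
     * when the next file would push the current one past maxTaskBytes.
     * A single file larger than maxTaskBytes still gets a task of its own.
     */
    List<List<FileInfo>> splitTasks(long maxTaskBytes) {
      List<List<FileInfo>> tasks = new ArrayList<>();
      List<FileInfo> current = new ArrayList<>();
      long currentBytes = 0;
      for (FileInfo file : rewriteFiles) {
        if (!current.isEmpty() && currentBytes + file.sizeBytes > maxTaskBytes) {
          tasks.add(current);
          current = new ArrayList<>();
          currentBytes = 0;
        }
        current.add(file);
        currentBytes += file.sizeBytes;
      }
      if (!current.isEmpty()) {
        tasks.add(current);
      }
      return tasks;
    }
  }

  public static void main(String[] args) {
    PartitionEvaluatorSketch evaluator = new PartitionEvaluatorSketch(100L);
    evaluator.addFile(new FileInfo("a", 10L));
    evaluator.addFile(new FileInfo("b", 200L)); // rejected at add time
    evaluator.addFile(new FileInfo("c", 40L));
    evaluator.addFile(new FileInfo("d", 60L));
    System.out.println("tasks=" + evaluator.splitTasks(60L).size());
  }
}
```

Because rejection happens in `addFile`, the splitting step needs no second pass to decide whether a task is worth executing, which is the kind of simplification the third item in the change log describes.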

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@github-actions github-actions bot added the module:ams-dashboard Ams dashboard module label Aug 24, 2023
@wangtaohz wangtaohz changed the title from "[AMORO-1883]" to "[AMORO-1883] Perform file filtering when adding files to PartitionEvaluator instead of during splitting tasks" Aug 24, 2023
@codecov

codecov bot commented Aug 24, 2023

Codecov Report

Patch coverage: 80.12% and project coverage change: -0.05% ⚠️

Comparison is base (674208c) 50.83% compared to head (9d306d4) 50.78%.

Additional details and impacted files
@@             Coverage Diff              @@
##             master    #1886      +/-   ##
============================================
- Coverage     50.83%   50.78%   -0.05%     
+ Complexity     3803     3802       -1     
============================================
  Files           479      479              
  Lines         25648    25645       -3     
  Branches       2611     2613       +2     
============================================
- Hits          13038    13024      -14     
- Misses        11443    11454      +11     
  Partials       1167     1167              
Flag     Coverage Δ
core     51.21% <80.12%> (-0.06%) ⬇️
trino    48.67% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed                                            Coverage Δ
...va/com/netease/arctic/utils/TablePropertyUtil.java    50.68% <0.00%> (-9.00%) ⬇️
...rver/optimizing/plan/CommonPartitionEvaluator.java    79.41% <68.75%> (-4.35%) ⬇️
...ver/optimizing/plan/MixedIcebergPartitionPlan.java    77.77% <83.33%> (+0.27%) ⬆️
...ic/server/optimizing/plan/OptimizingEvaluator.java    81.39% <83.33%> (-0.33%) ⬇️
.../server/optimizing/plan/AbstractPartitionPlan.java    92.92% <92.85%> (-2.26%) ⬇️
...server/optimizing/plan/MixedHivePartitionPlan.java    92.06% <100.00%> (+3.92%) ⬆️

... and 3 files with indirect coverage changes


Contributor

@zhoujinsong zhoujinsong left a comment

I left some comments, PTAL.

@XBaith
Contributor

XBaith commented Aug 24, 2023

Refer to issue #1866

@github-actions github-actions bot added the module:core Core module label Aug 25, 2023
Contributor

@zhongqishang zhongqishang left a comment

LGTM.

Contributor

@zhoujinsong zhoujinsong left a comment

LGTM.

@zhoujinsong zhoujinsong merged commit 9d9817e into apache:master Aug 28, 2023
@wangtaohz wangtaohz deleted the fix-1883 branch August 28, 2023 01:56
ShawHee pushed a commit to ShawHee/arctic that referenced this pull request Dec 29, 2023
…aluator` instead of during splitting tasks (apache#1886)

* optimizing plan: filter when add files

* fix some comment

* cache reachHiveRefreshInterval

* add docs

* rename to reservedDeleteFiles

* add return false

* move reachFullInterval to constructor

* set properties first

* add some check code

* add docs for PartitionEvaluator

* fix add properties first

* remove addPartitionProperties from PartitionEvaluator, add it to constructor

* remove all files if optimizing is not enabled

* fix reservedDeleteFiles
Labels
module:ams-dashboard Ams dashboard module module:core Core module
Development

Successfully merging this pull request may close these issues.

[Improvement]: Perform file filtering as early as possible when during optimizing plan process
4 participants