[AMORO-2257] Automatically create tags on snapshots hour-level for iceberg Table Formats #2411

Merged
merged 20 commits into from
Jan 31, 2024

huyuanfeng2018
Contributor

Automatically create tags on snapshots hour-level for iceberg Table Formats

Why are the changes needed?

Close #2257.

Brief change log

  • Add hour level (HOURLY) in TagConfiguration.Period

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@github-actions github-actions bot added type:docs Improvements or additions to documentation module:ams-dashboard Ams dashboard module labels Dec 6, 2023
@CLAassistant

CLAassistant commented Dec 6, 2023

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
3 out of 4 committers have signed the CLA.

✅ wangtaohz
✅ huyuanfeng2018
✅ zhoujinsong
❌ huyuanfeng


huyuanfeng seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.


codecov bot commented Dec 6, 2023

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (a0aa8ed) 32.26% compared to head (308d34d) 32.24%.
Report is 1 commits behind head on master.

| Files | Patch % | Lines |
|---|---|---|
| .../netease/arctic/server/table/TagConfiguration.java | 80.00% | 2 Missing and 1 partial ⚠️ |
| ...erver/table/executor/TagsAutoCreatingExecutor.java | 0.00% | 2 Missing ⚠️ |
| ...imizing/maintainer/AutoCreateIcebergTagAction.java | 95.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #2411      +/-   ##
============================================
- Coverage     32.26%   32.24%   -0.02%     
+ Complexity     4384     4379       -5     
============================================
  Files           589      589              
  Lines         49867    49874       +7     
  Branches       6615     6616       +1     
============================================
- Hits          16088    16081       -7     
- Misses        32510    32522      +12     
- Partials       1269     1271       +2     
| Flag | Coverage Δ |
|---|---|
| core | 30.30% <83.78%> (-0.03%) ⬇️ |
| trino | 50.87% <ø> (ø) |

Flags with carried forward coverage won't be shown. Click here to find out more.


@huyuanfeng2018 huyuanfeng2018 changed the title [AMORO-2257] Automatically create tags on snapshots hour-level for iceberg Table Formats [WIP] [AMORO-2257] Automatically create tags on snapshots hour-level for iceberg Table Formats Dec 6, 2023
@huyuanfeng2018 huyuanfeng2018 requested review from wangtaohz and zhoujinsong and removed request for wangtaohz December 7, 2023 12:27
@huyuanfeng2018 huyuanfeng2018 changed the title [WIP] [AMORO-2257] Automatically create tags on snapshots hour-level for iceberg Table Formats [AMORO-2257] Automatically create tags on snapshots hour-level for iceberg Table Formats Dec 7, 2023
@huyuanfeng2018 huyuanfeng2018 changed the title [AMORO-2257] Automatically create tags on snapshots hour-level for iceberg Table Formats [WIP] [AMORO-2257] Automatically create tags on snapshots hour-level for iceberg Table Formats Dec 7, 2023
@huyuanfeng2018 huyuanfeng2018 marked this pull request as draft December 7, 2023 15:09
@github-actions github-actions bot added the module:core Core module label Dec 11, 2023
@huyuanfeng2018 huyuanfeng2018 changed the title [WIP] [AMORO-2257] Automatically create tags on snapshots hour-level for iceberg Table Formats [AMORO-2257] Automatically create tags on snapshots hour-level for iceberg Table Formats Dec 11, 2023
@huyuanfeng2018 huyuanfeng2018 marked this pull request as ready for review December 11, 2023 12:15
@wangtaohz
Contributor

It seems that we forgot to implement this method in TagsAutoCreatingExecutor.

  @Override
  public void handleConfigChanged(TableRuntime tableRuntime, TableConfiguration originalConfig) {
    scheduleIfNecessary(tableRuntime, getStartDelay());
  }

Can you add this method to this PR? @huyuanfeng2018

@huyuanfeng2018
Contributor Author

> It seems that we forgot to implement this method in TagsAutoCreatingExecutor.
>
>     @Override
>     public void handleConfigChanged(TableRuntime tableRuntime, TableConfiguration originalConfig) {
>       scheduleIfNecessary(tableRuntime, getStartDelay());
>     }
>
> Can you add this method to this PR? @huyuanfeng2018

ok

Contributor

@wangtaohz wangtaohz left a comment

@huyuanfeng2018 I left some comments, please take a look.

@huyuanfeng2018
Contributor Author

> @huyuanfeng2018 I left some comments, please take a look.

done.

Contributor

@wangtaohz wangtaohz left a comment

Thanks for your contribution! @huyuanfeng2018

The code looks fine to me; only a few details need to be modified.

Contributor

@wangtaohz wangtaohz left a comment

LGTM!

Contributor

@zhoujinsong zhoujinsong left a comment

@huyuanfeng2018 @wangtaohz
I left some comments, please take another look.

| tag.auto-create.trigger.period | daily | Period of creating tags, support `daily`,`hourly` now |
| tag.auto-create.trigger.offset.minutes | 0 | The minutes by which the tag is created after midnight (00:00) |
| tag.auto-create.trigger.max-delay.minutes | 60 | The maximum delay time for creating a tag |
| tag.auto-create.tag-format | 'tag-'yyyyMMdd for daily and 'tag-'yyyyMMddHH for hourly periods | The format of the <br/>name for daily tag <br> |
Contributor

The description here seems to be misleading.
IMO, we can set the format for both daily and hourly auto-tag functions with this configuration.

However, I'm curious whether this configuration needs a different value for each trigger period. For example, will auto-tagging still function properly in daily mode if it is set to 'tag-'yyyyMMddHH, or in hourly mode if it is set to 'tag-'yyyyMMdd?
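To make the question concrete, here is a small sketch of what the two patterns produce when applied with `java.time.format.DateTimeFormatter`. It only illustrates the formatting itself: the class and method names are hypothetical, and whether Amoro formats tag names exactly this way, or validates the pattern against the trigger period, is an assumption that this thread does not confirm.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Amoro: formats a tag time with a given pattern.
class TagFormatDemo {

  static String tagName(String pattern, LocalDateTime tagTime) {
    // Text inside single quotes ('tag-') is emitted literally by DateTimeFormatter.
    return DateTimeFormatter.ofPattern(pattern).format(tagTime);
  }

  public static void main(String[] args) {
    LocalDateTime midnight = LocalDateTime.of(2024, 1, 30, 0, 0);
    LocalDateTime nineAm = LocalDateTime.of(2024, 1, 30, 9, 0);
    LocalDateTime tenAm = LocalDateTime.of(2024, 1, 30, 10, 0);

    // Daily tag time formatted with the hourly pattern: still unique per day,
    // just with a trailing hour field.
    System.out.println(tagName("'tag-'yyyyMMddHH", midnight)); // tag-2024013000

    // Hourly tag times formatted with the daily pattern: two different hours
    // map to the same name.
    System.out.println(tagName("'tag-'yyyyMMdd", nineAm)); // tag-20240130
    System.out.println(tagName("'tag-'yyyyMMdd", tenAm)); // tag-20240130
  }
}
```

Under these assumptions, the hourly pattern in daily mode looks harmless, while the daily pattern in hourly mode would yield the same name for every hour of a day, so how duplicate names are handled is the case worth checking.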

throw new IllegalArgumentException(
"unsupported trigger period " + tagConfig.getTriggerPeriod());
}
LocalDateTime tagTime =
Contributor

@zhoujinsong zhoujinsong Dec 29, 2023

Hi, when I reviewed the code here, I found the three time-related variables difficult to follow. After carefully reviewing the code again, I have summarized their names and meanings:

  • Check time: the timestamp at which the action starts.
  • Tag time: the timestamp that the last tag represents.
  • Trigger time: the timestamp at which the last tag should be triggered.

Currently, the three variables are calculated as follows:

  • checkTime = now()
  • triggerTime = (checkTime - triggerOffset) % period + triggerOffset
  • tagTime = triggerTime - triggerOffset - period

Substituting, we get:

  • tagTime = (checkTime - triggerOffset) % period - period
  • triggerTime = tagTime + period + triggerOffset

However, the tag time here does not correspond to the time of the tag. It is easier for us to generate the corresponding tag name based on this tag time, but it does not accurately represent the data within the tag.

We should calculate the three variables like this:

  • checkTime = now()
  • tagTime = (checkTime - triggerOffset) % period
  • triggerTime = tagTime + triggerOffset

Then the value of the tag time is more accurate and the calculation methods are much simpler.

So I think we can make the code much easier to understand by:

  • Putting the three variables together and calculating them at the beginning, with comments that explain their meaning and calculation rules.
  • Calculating the tag time first and then deriving the trigger time from it.
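The proposed calculation can be sketched as follows. This is a minimal illustration rather than the code in this PR: it reads `%` in the formulas above as "truncate to the start of the period", assumes the period is a `ChronoUnit` of `HOURS` or `DAYS`, and uses a hypothetical `TagTimes` class.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical holder for the two derived timestamps; not part of Amoro.
class TagTimes {
  final LocalDateTime tagTime;
  final LocalDateTime triggerTime;

  TagTimes(LocalDateTime tagTime, LocalDateTime triggerTime) {
    this.tagTime = tagTime;
    this.triggerTime = triggerTime;
  }

  // checkTime: when the action starts.
  // period: ChronoUnit.HOURS or ChronoUnit.DAYS (assumed; larger units are
  // not supported by LocalDateTime.truncatedTo).
  // triggerOffset: delay after the period boundary before the tag is created.
  static TagTimes of(LocalDateTime checkTime, ChronoUnit period, Duration triggerOffset) {
    // Tag time first: shift back by the offset, then truncate to the period start.
    LocalDateTime tagTime = checkTime.minus(triggerOffset).truncatedTo(period);
    // Trigger time is derived from the tag time.
    LocalDateTime triggerTime = tagTime.plus(triggerOffset);
    return new TagTimes(tagTime, triggerTime);
  }

  public static void main(String[] args) {
    TagTimes hourly =
        TagTimes.of(LocalDateTime.of(2024, 1, 30, 10, 25), ChronoUnit.HOURS, Duration.ofMinutes(10));
    System.out.println(hourly.tagTime); // 2024-01-30T10:00
    System.out.println(hourly.triggerTime); // 2024-01-30T10:10
  }
}
```

With a 10-minute offset, a check at 10:05 falls before the 10:10 trigger, so the same sketch yields a tag time of 09:00 and a trigger time of 09:10.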

@github-actions github-actions bot removed the module:ams-dashboard Ams dashboard module label Jan 2, 2024
Contributor

@zhoujinsong zhoujinsong left a comment

LGTM.
Thanks a lot for your great work!

@zhoujinsong zhoujinsong merged commit c020c1c into apache:master Jan 31, 2024
5 of 7 checks passed
Labels
module:core Core module type:docs Improvements or additions to documentation
Development

Successfully merging this pull request may close these issues.

[Subtask]: Automatically create tags on snapshots hour-level for iceberg Table Formats
4 participants