[AMORO-3192] Add the "clean-orphan-file.interval-time-minutes" parameter #3193

Merged

Conversation

lintingbin
Contributor

Why are the changes needed?

Close #3192.

Brief change log

  • Add a parameter to replace the INTERVAL constant, which was previously fixed at 24 hours in the OrphanFilesCleaningExecutor class.
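A minimal sketch of the idea in the change log, not the code from this patch: the interval is read from configuration and falls back to the previous 24 hours when unset. The key name is taken from the PR title; the class name, the `resolveInterval` helper, and the use of a plain `Map` as the property source are assumptions made for illustration.

```java
import java.time.Duration;
import java.util.Map;

public class OrphanCleanIntervalSketch {
  // Previous behaviour described in the change log: a fixed 24-hour interval.
  static final Duration DEFAULT_INTERVAL = Duration.ofHours(24);

  // Key name from the PR title; how the real executor looks it up is not shown here.
  static final String INTERVAL_KEY = "clean-orphan-file.interval-time-minutes";

  // Hypothetical helper: returns the configured interval, or the old default when unset.
  static Duration resolveInterval(Map<String, String> properties) {
    String raw = properties.get(INTERVAL_KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return DEFAULT_INTERVAL;
    }
    long minutes = Long.parseLong(raw.trim());
    if (minutes <= 0) {
      throw new IllegalArgumentException(INTERVAL_KEY + " must be positive, got " + minutes);
    }
    return Duration.ofMinutes(minutes);
  }

  public static void main(String[] args) {
    // Unset: falls back to the 24-hour default.
    System.out.println(resolveInterval(Map.of()));
    // 10080 minutes: one run per week.
    System.out.println(resolveInterval(Map.of(INTERVAL_KEY, "10080")));
  }
}
```

Falling back to the old constant keeps existing deployments unchanged unless they opt in to a different interval.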

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? docs

@github-actions github-actions bot added the type:docs, module:ams-server, and module:common labels Sep 10, 2024
Contributor

@tcodehuber tcodehuber left a comment

Thanks for your work, but I think the default check interval should be less than the default min existing time.

@lintingbin
Contributor Author

> Thanks for your work, but I think the default check interval should be less than the default min existing time.

Why is this necessary? The "min existing time" ensures that useful files are not mistakenly deleted, so setting it to two to three days is usually sufficient. However, cleaning orphan files is a performance-intensive task, so it's generally appropriate to configure the cleanup to occur once a week or once a month.
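To make the distinction concrete, here is a rough sketch of the two checks as independent predicates. It is an illustration only: the class and method names are hypothetical, and the weekly interval and two-day window are the example values from the comment above, not project defaults.

```java
import java.time.Duration;
import java.time.Instant;

public class OrphanCleanTimingSketch {
  // Scheduling check: has the interval elapsed since the last cleanup run?
  static boolean isRunDue(Instant lastRun, Instant now, Duration interval) {
    return !now.isBefore(lastRun.plus(interval));
  }

  // Safety check: has the file existed longer than the "min existing time" window?
  static boolean isOldEnough(Instant lastModified, Instant now, Duration minExistingTime) {
    return lastModified.isBefore(now.minus(minExistingTime));
  }

  public static void main(String[] args) {
    Instant now = Instant.parse("2024-09-11T00:00:00Z");
    Duration interval = Duration.ofDays(7);        // run once a week
    Duration minExistingTime = Duration.ofDays(2); // never touch files newer than two days

    System.out.println(isRunDue(now.minus(Duration.ofDays(7)), now, interval));
    System.out.println(isOldEnough(now.minus(Duration.ofDays(3)), now, minExistingTime));
    System.out.println(isOldEnough(now.minus(Duration.ofHours(1)), now, minExistingTime));
  }
}
```

Neither predicate takes the other's setting as input, so a weekly interval combined with a two-day window is a valid combination: each run simply skips any file that is still inside the window.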

@zhoujinsong
Contributor

@lintingbin Thanks for the contribution.

The scheduling interval for the current expiration task is usually set as an AMS configuration like data-expiration.interval, which applies to all tables. I think this meets the requirements, as we generally do not set different execution intervals for different tables. What do you think?

BTW, we are adding unit support (such as 3d) to time-duration configuration values, and we can apply that to the new configuration as well.
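As a rough illustration of what a duration value with a unit could look like when parsed, here is a small sketch. The accepted units (`s`, `min`, `h`, `d`) and the `parse` helper are assumptions made for this example; the actual format and parsing code used by AMS may differ.

```java
import java.time.Duration;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DurationValueSketch {
  // Hypothetical format: a non-negative integer followed by one unit suffix.
  private static final Pattern FORMAT = Pattern.compile("(\\d+)\\s*(s|min|h|d)");

  static Duration parse(String text) {
    Matcher matcher = FORMAT.matcher(text.trim().toLowerCase(Locale.ROOT));
    if (!matcher.matches()) {
      throw new IllegalArgumentException("Unsupported duration value: " + text);
    }
    long amount = Long.parseLong(matcher.group(1));
    switch (matcher.group(2)) {
      case "s":
        return Duration.ofSeconds(amount);
      case "min":
        return Duration.ofMinutes(amount);
      case "h":
        return Duration.ofHours(amount);
      default: // "d", the only remaining unit the pattern accepts
        return Duration.ofDays(amount);
    }
  }

  public static void main(String[] args) {
    System.out.println(parse("3d"));     // three days
    System.out.println(parse("90 min")); // ninety minutes
  }
}
```

Putting the unit in the value means the key no longer needs a suffix such as `-minutes`, and weekly or monthly intervals stay readable.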

@zhoujinsong
Contributor

> Why is this necessary? The "min existing time" ensures that useful files are not mistakenly deleted, so setting it to two to three days is usually sufficient. However, cleaning orphan files is a performance-intensive task, so it's generally appropriate to configure the cleanup to occur once a week or once a month.

I agree with @lintingbin, there is no connection between the values of these two configurations.

@zhongqishang zhongqishang changed the title from Add the "clean-orphan-file.interval-time-minutes" parameter to [AMORO-3192] Add the "clean-orphan-file.interval-time-minutes" parameter Sep 11, 2024
@lintingbin
Contributor Author

> @lintingbin Thanks for the contribution.
>
> The scheduling interval for the current expiration task is usually set as an AMS configuration like data-expiration.interval, which applies to all tables. I think this meets the requirements, as we generally do not set different execution intervals for different tables. What do you think?
>
> BTW, we are adding unit support (such as 3d) to time-duration configuration values, and we can apply that to the new configuration as well.

The modifications have been completed according to the suggestions.

@github-actions github-actions bot added the type:build label and removed the type:docs label Sep 11, 2024
Contributor

@zhoujinsong zhoujinsong left a comment

LGTM.

@zhoujinsong zhoujinsong merged commit d4ea2fa into apache:master Sep 11, 2024
4 checks passed
Development

Successfully merging this pull request may close these issues.

[Improvement]: Add the "clean-orphan-file.interval-time-minutes" parameter.
3 participants