ci-automation/garbage_collect.sh: Add min age, remove orphan directories #1608

Merged: t-lo merged 3 commits into main on Jan 29, 2024

@t-lo t-lo commented Jan 26, 2024

This change improves the build cache garbage collector to remove orphaned artifact directories - i.e. directories to which no version tag exists in the scripts repo.
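
As a rough illustration of the orphan check described above, the sketch below compares directory names against a list of known tags. `list_orphan_dirs` is a hypothetical helper and not a function from `garbage_collect.sh`; it assumes one artifact directory per version directly under the cache root, named exactly like the version tag, which may not match the real cache layout.

```bash
# Sketch only: list_orphan_dirs is a hypothetical helper, not part of
# ci-automation/garbage_collect.sh.
# Assumption: every artifact directory sits directly under <cache_root>
# and is named exactly like its version tag.
#
# Reads the known tags (one per line) on stdin and prints the names of
# directories that have no matching tag.
list_orphan_dirs() {
    local cache_root="$1" tags dir name

    tags="$(cat)"
    for dir in "${cache_root}"/*/; do
        # Skip the unexpanded glob when the cache root has no directories.
        [ -d "${dir}" ] || continue
        name="$(basename "${dir}")"
        # -x matches whole lines only, -F treats the name as a literal string.
        if ! printf '%s\n' "${tags}" | grep -qxF -- "${name}"; then
            echo "${name}"
        fi
    done
}
```

It could be fed from the scripts repo with something like `git tag -l | list_orphan_dirs /path/to/cache`, where the cache path is a placeholder.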

SDK containers built by GitHub Actions (using update_sdk_container) are ignored by this change because these are handled in a separate garbage collection script.

Also, a new command line parameter has been added to remove artifacts older than the specified number of days (defaulting to 14):

  • If neither number of builds nor max age is specified, the script defaults to 50 builds to keep, and a max age of 14 days. The max age overrides the number of builds to keep, so more than 50 builds may be kept.
  • If only the number of builds to keep is specified, the max age is set to "0" (i.e. today).
  • If both are specified, max age again overrides number of builds to keep.
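
The interaction between the two parameters can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `select_purge_candidates` is a hypothetical helper, it expects `YYYY-MM-DD | tag` lines sorted newest first, and it takes the age limit as a precomputed cutoff date rather than a number of days.

```bash
# Sketch only: select_purge_candidates is a hypothetical helper, not part
# of ci-automation/garbage_collect.sh.
#
# Reads "YYYY-MM-DD | tag" lines on stdin, newest first, and prints the
# tags that qualify for removal: those beyond the first <keep> entries
# that are also dated before <cutoff> (a YYYY-MM-DD date).
select_purge_candidates() {
    local keep="$1" cutoff="$2"
    local count=0 date sep tag

    while read -r date sep tag; do
        count=$((count + 1))
        # The newest <keep> builds are always retained.
        if [ "${count}" -le "${keep}" ]; then
            continue
        fi
        # ISO dates sort correctly as plain strings, so a string
        # comparison is enough to tell "older than the cutoff".
        if [ "${date}" \< "${cutoff}" ]; then
            echo "${tag}"
        fi
    done
    return 0
}
```

With GNU `date`, a 14-day cutoff could be computed as `date -d "14 days ago" +%Y-%m-%d`. Because a build must fail both checks before it is listed, builds newer than the cutoff survive even when they exceed the keep count, which is why more than 50 builds may be kept.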

Lastly, the change updates ci-automation/garbage_collect_github_ci_sdk.sh to also feature a min_age parameter and passes the parameter from `garbage_collect.sh` to `ci-automation/garbage_collect_github_ci_sdk.sh`.

How to use

bash -c "source ci-automation/garbage_collect.sh ; DRY_RUN=y garbage_collect ; "
bash -c "source ci-automation/garbage_collect.sh ; DRY_RUN=y garbage_collect 45; "
bash -c "source ci-automation/garbage_collect.sh ; DRY_RUN=y garbage_collect 1 12; "

bash -c "source ci-automation/garbage_collect_github_ci_sdk.sh ; DRY_RUN=y garbage_collect_github_ci 1 236; "
bash -c "source ci-automation/garbage_collect_github_ci_sdk.sh ; DRY_RUN=y garbage_collect_github_ci 10 200; "

Testing done

Ran the above 5 commands and attached logs:

Follow-up tasks

  • Integrate max age parameter with jenkins-os

This change improves the build cache garbage collector to remove
orphaned artifact directories - i.e. directories to which no version tag
exists in the scripts repo.

SDK containers built by GitHub Actions (using update_sdk_container) are
ignored by this change because these are handled in a separate garbage
collection script.

Also, a new command line parameter has been added to remove artifacts
older than the specified number of days (defaulting to 14):
    - If neither number of builds nor max age is specified, the script
      defaults to 50 builds to keep, and a max age of 14 days.
      The max age overrides the number of builds to keep, so more than
      50 builds may be kept.
    - If only the number of builds to keep is specified, the max age is
      set to "0" (i.e. today).
    - If both are specified, max age again overrides number of builds to
      keep.

Signed-off-by: Thilo Fromm <thilofromm@microsoft.com>
@t-lo t-lo requested a review from a team January 26, 2024 14:07
This change adds a min_age parameter to the github CI SDK garbage
collector. The parameter specifies a minimum age (in days) for artifacts
to be garbage collected. NOTE that this can result in more artifacts
being kept than specified via the "keep" parameter if artifacts are
younger than min_age.

The change also has garbage_collect.sh pass the min_age parameter to
garbage_collect_github_ci_sdk.sh.

Signed-off-by: Thilo Fromm <thilofromm@microsoft.com>
@@ -269,6 +271,6 @@ function _garbage_collect_impl() {
 echo

 source ci-automation/garbage_collect_github_ci_sdk.sh
-garbage_collect_github_ci
+garbage_collect_github_ci 1 "${min_age_days}"
Member

Just one SDK build to keep? Asking to make sure, especially since the default was 20. I do realize that probably more will be kept until they are older than 14 days or so.

Member Author

@t-lo t-lo Jan 29, 2024

The "1" just means that at least one is kept; for everything else the 14-day limit will apply. We don't build the SDK from GitHub actions that often so I thought that would be sensible.

local versions_detected="$(git tag -l --sort=-committerdate \
--format="%(creatordate:format:%Y-%m-%d) | %(refname:strip=2)" \
| grep -E '.*\| (main|alpha|beta|stable|lts)-[0-9]+\.[0-9]+\.[0-9]+-.*' \
| grep -vE '(-pro)$')"
Member

Why exclude the -pro build tag? Is it some artifact from the past?

Member Author

Yes, to prevent deleting the release tags of the -pro versions.

Member

This is a bit too broad, as it will prevent GC of any dev build whose tag ends with -pro (so a build tag like weekly-updates-like-a-pro won't be GC'd), but OTOH we tend not to put -pro into our build tags, so I suppose it's fine.

Member Author

Good point, the match could be more precise.
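
One way the match could be tightened is sketched below. `filter_dev_versions` is a hypothetical helper, and the sketch assumes that -pro release tags have the exact form `<channel>-<major>.<minor>.<patch>-pro`, which this thread implies but does not spell out. The version numbers in the usage example are illustrative only.

```bash
# Sketch only: filter_dev_versions is a hypothetical helper, not part of
# ci-automation/garbage_collect.sh.
# Assumption: a -pro release tag is "<channel>-<major>.<minor>.<patch>-pro"
# with nothing else after the version number.
#
# Reads one version per line on stdin and prints the dev versions.
filter_dev_versions() {
    local ver='(main|alpha|beta|stable|lts)-[0-9]+\.[0-9]+\.[0-9]+'

    # Keep versions that carry a suffix after the version number, then
    # drop only those whose entire suffix is "-pro".
    grep -E "^${ver}-.*\$" | grep -vE "^${ver}-pro\$"
}
```

For example, `printf '%s\n' stable-3815.2.0-pro main-3850.0.0-weekly-updates-like-a-pro | filter_dev_versions` would drop the first entry and keep the second, so a dev build that merely ends in -pro remains eligible for garbage collection.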

else
# make sure we only accept dev versions
# User-provided version list, make sure we only accept dev versions
purge_versions="$(echo "${purge_versions}" | sed 's/ /\n/g' \
| grep -E '(main|alpha|beta|stable|lts)-[0-9]+\.[0-9]+\.[0-9]+\-.*' \
| grep -vE '(-pro)$')"
Member

Same here for excluding -pro from purges.

Member Author

We would need to rework tag removal and exclude any pattern that matches an official release version. Note that versions that look like official releases are skipped entirely - the garbage collector only cares about dev builds and nightlies. (Releases are supposed to be cleaned up manually, after pushing release binaries to the official mirrors.)

Co-authored-by: Krzesimir Nowak <knowak@microsoft.com>
@t-lo t-lo merged commit 4f10dd9 into main Jan 29, 2024
1 check failed
@t-lo t-lo deleted the t-lo/garbage-collect-by-date-remove-orphans branch January 29, 2024 14:20