Cirrus: Improve caching effectiveness #553

Merged
merged 1 commit into from
Jan 16, 2023
21 changes: 13 additions & 8 deletions .cirrus.yml
@@ -50,16 +50,15 @@ build_task:
# all PRs & branches will share caches with other PRs and branches
# for a given $DEST_BRANCH and vX value. Adjust vX if cache schema
# changes.
fingerprint_key: "cargo_v2_${DEST_BRANCH}_amd64"
fingerprint_script: echo -e "cargo_v3_${DEST_BRANCH}_amd64\n---\n$(<Cargo.lock)\n---\n$(<Cargo.toml)"
# Required to be set explicitly since fingerprint_key is also set
reupload_on_changes: true
targets_cache: &targets_cache
# Similar to cargo_cache, but holds the actual compiled artifacts. This must
# be scoped similar to bin_cache to avoid binary pollution across cache
# contexts. For example, two PRs that happen to coincidentally change
# and use cache. Adjust vX if cache schema changes.
# Similar to cargo_cache, but holds the actual compiled dependent artifacts.
# This should be scoped to a hash of the dependency-metadata lock file.
# Cirrus-CI will automatically use separate caches for PRs and branches.
folder: "$CARGO_TARGET_DIR"
fingerprint_key: "targets_v2_${CIRRUS_TAG}${DEST_BRANCH}${CIRRUS_PR}_amd64" # Cache only within same tag, branch, or PR (branch will be 'pull/#')
fingerprint_script: echo -e "targets_v3_${CIRRUS_TAG}${DEST_BRANCH}${CIRRUS_PR}_amd64\n---\n$(<Cargo.lock)\n---\n$(<Cargo.toml)"
reupload_on_changes: true
bin_cache: &bin_cache
# This simply prevents rebuilding bin/netavark for every subsequent task.
@@ -70,6 +69,7 @@ build_task:
reupload_on_changes: true
setup_script: &setup "$SCRIPT_BASE/setup.sh"
main_script: &main "$SCRIPT_BASE/runner.sh $CIRRUS_TASK_NAME"
cache_grooming_script: &groom bash "$SCRIPT_BASE/cache_groom.sh"
upload_caches: [ "cargo", "targets", "bin" ]


@@ -82,11 +82,15 @@ build_aarch64_task:
architecture: arm64 # CAUTION: This has to be "arm64", not "aarch64"
cargo_cache: &cargo_cache_aarch64
folder: "$CARGO_HOME"
fingerprint_key: "cargo_v2_${DEST_BRANCH}_aarch64"
# N/B: Should exactly match (except for arch) line from build_task (above).
# (No, there isn't an easy way to not duplicate most of this :()
fingerprint_script: echo -e "cargo_v3_${DEST_BRANCH}_aarch64\n---\n$(<Cargo.lock)\n---\n$(<Cargo.toml)"
reupload_on_changes: true
targets_cache: &targets_cache_aarch64
folder: "$CARGO_TARGET_DIR"
fingerprint_key: "targets_v2_${CIRRUS_TAG}${DEST_BRANCH}${CIRRUS_PR}_aarch64" # Cache only within same tag, branch, or PR (branch will be 'pull/#')
# N/B: Should exactly match (except for arch) line from build_task (above).
# (No, there isn't an easy way to not duplicate most of this :()
fingerprint_script: echo -e "targets_v3_${CIRRUS_TAG}${DEST_BRANCH}${CIRRUS_PR}_aarch64\n---\n$(<Cargo.lock)\n---\n$(<Cargo.toml)"
reupload_on_changes: true
bin_cache: &bin_cache_aarch64
# This simply prevents rebuilding bin/netavark for every subsequent task.
@@ -95,6 +99,7 @@ build_aarch64_task:
reupload_on_changes: true
setup_script: *setup
main_script: *main
cache_grooming_script: *groom
upload_caches: [ "cargo", "targets", "bin" ]
# Downstream CI needs the aarch64 binaries from this CI system.
# However, we don't want to confuse architectures.
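The `fingerprint_script` entries above print a versioned prefix followed by the contents of `Cargo.lock` and `Cargo.toml`, and Cirrus-CI hashes that output to form the cache key. A minimal sketch of the idea follows. The helper name `cache_fingerprint`, the branch name `main`, the demo file contents, and the use of `sha256sum` as the hash are all assumptions made for illustration, not details taken from this PR or from Cirrus-CI internals.

```shell
# Hypothetical helper: reproduce the text a fingerprint_script would print,
# then hash it. sha256sum is only a stand-in for whatever Cirrus-CI uses.
cache_fingerprint() {
    local prefix="$1" lockfile="$2" manifest="$3"
    printf '%s\n---\n%s\n---\n%s\n' \
        "$prefix" "$(cat "$lockfile")" "$(cat "$manifest")" \
        | sha256sum | cut -d' ' -f1
}

# Throwaway files with made-up contents, purely for demonstration.
demo_dir="$(mktemp -d)"
printf 'name = "demo"\n' > "$demo_dir/Cargo.toml"
printf 'version = 3\n' > "$demo_dir/Cargo.lock"

key_before="$(cache_fingerprint cargo_v3_main_amd64 "$demo_dir/Cargo.lock" "$demo_dir/Cargo.toml")"
key_same="$(cache_fingerprint cargo_v3_main_amd64 "$demo_dir/Cargo.lock" "$demo_dir/Cargo.toml")"

# Changing the lock file changes the key, so a fresh cache would be used.
printf 'version = 4\n' > "$demo_dir/Cargo.lock"
key_after="$(cache_fingerprint cargo_v3_main_amd64 "$demo_dir/Cargo.lock" "$demo_dir/Cargo.toml")"

rm -rf "$demo_dir"
echo "unchanged inputs: $key_before / $key_same"
echo "changed lockfile: $key_after"
```

Identical inputs yield the same key, while any edit to the dependency metadata (or a bump of the `vX` prefix) yields a different one.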
77 changes: 77 additions & 0 deletions contrib/cirrus/cache_groom.sh
@@ -0,0 +1,77 @@
#!/bin/bash
#
# This script is intended to be run from Cirrus-CI to prepare the
# rust targets cache for re-use during subsequent runs. This mainly
# involves removing files and directories which change frequently
# but are cheap/quick to regenerate - i.e. prevent "cache-flapping".
# Any other use of this script is not supported and may cause harm.

set -eo pipefail

source "$(dirname "${BASH_SOURCE[0]}")/lib.sh"

if [[ "$CIRRUS_CI" != true ]]; then
die "Script is not intended for use outside of Cirrus-CI"
fi

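# CIRRUS_BRANCH and DEST_BRANCH are compared below; if both were unset,
# the comparison would succeed and could trigger an unintended clobber.
req_env_vars CARGO_HOME CARGO_TARGET_DIR CIRRUS_BUILD_ID CIRRUS_BRANCH DEST_BRANCH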

# Giant-meat-cleaver HACK: It's possible (with a long-running cache key) for
# the targets and/or cargo cache to grow without bound (gigabytes). Ref:
# https://github.com/rust-lang/cargo/issues/5026
# There isn't a good way to deal with this or account for outdated content
# in some intelligent way w/o trolling through config and code files. So,
# any time the Cirrus-CI build ID is evenly divisible by some number (chosen
# arbitrarily), clobber the whole thing and make the next run entirely
# re-populate the cache. This is ugly, but maybe the best option available :(
if [[ "$CIRRUS_BRANCH" == "$DEST_BRANCH" ]] && ((CIRRUS_BUILD_ID%15==0)); then
msg "It's a cache-clobber build, yay! This build has been randomly selected for"
msg "a forced cache-wipe! Congratulations! This means the next build will be"
msg "slow, and nobody will know whom to blame! Lucky you! Hurray!"
msg "(This is necessary to prevent branch-level cache from infinitely growing)"
cd $CARGO_TARGET_DIR
# Could use `cargo clean` for this, but it's easier to just clobber everything.
rm -rf ./* ./.??*
# In case somebody goes poking around, leave a calling-card hopefully leading
# them back to this script. I don't know of a better way to handle this :S
touch CACHE_WAS_CLOBBERED

cd $CARGO_HOME
rm -rf ./* ./.??*
touch CACHE_WAS_CLOBBERED
exit 0
fi
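
The clobber condition above can be isolated into a small predicate for experimentation. This is only a sketch: the function name `should_clobber`, its argument order, and the sample build IDs and branch names are hypothetical, and the interval of 15 is simply the value used in the script.

```shell
# Hypothetical helper mirroring the condition above: clobber only for builds
# on the destination branch whose build ID is a multiple of the interval.
should_clobber() {
    local build_id="$1" branch="$2" dest_branch="$3" interval="${4:-15}"
    [ "$branch" = "$dest_branch" ] && [ $((build_id % interval)) -eq 0 ]
}

# Made-up build IDs; real Cirrus-CI build IDs are much larger numbers.
for id in 29 30 31; do
    if should_clobber "$id" main main; then
        echo "build $id: clobber cache"
    else
        echo "build $id: keep cache"
    fi
done
```

Only build 30 is reported as a clobber build here. Builds of pull requests never match, because their branch differs from the destination branch.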

# The following applies to both PRs and branch-level cache. It attempts to remove
# things which are non-essential and/or may change frequently. It stops short of
# trolling through config & code files to determine what is relevant or not.
# Ref: https://doc.rust-lang.org/nightly/cargo/guide/build-cache.html
# https://github.com/Swatinem/rust-cache/tree/master/src
cd $CARGO_TARGET_DIR
for targetname in $(find ./ -mindepth 1 -maxdepth 1 -type d); do
msg "Grooming $CARGO_TARGET_DIR/$targetname..."
cd $CARGO_TARGET_DIR/$targetname
# Any top-level hidden files or directories
showrun rm -rf ./.??*
# Example targets
showrun rm -rf ./target/debug/examples
# Documentation
showrun rm -rf ./target/doc
# Internal to rust build process
showrun rm -rf ./target/debug/deps ./target/debug/incremental ./target/debug/build
done

# The following only applies to dependent packages (crates). It follows the recommendations in
# https://doc.rust-lang.org/nightly/cargo/guide/cargo-home.html#caching-the-cargo-home-in-ci
# and probably shouldn't be extended beyond what's documented. This cache plays a major
# role in build-time reduction, but must also be prevented from causing "cache-flapping".
cd $CARGO_HOME
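for dirname in $(find ./ -mindepth 1 -maxdepth 2 -type d); do
case "$dirname" in
./bin) ;& # same steps as next item
./registry) ;& # parent of kept directories, must not be removed wholesale
./registry/index) ;&
./registry/cache) ;&
./git) ;& # parent of a kept directory
./git/db) continue ;; # Keep
*) rm -rf "$dirname" ;; # Remove
esac
done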
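
The keep-list pruning of `$CARGO_HOME` can be tried safely against a throwaway directory. The sketch below is an illustration under stated assumptions: `groom_cargo_home` is a hypothetical name, the directory layout is invented, and directory names are assumed to contain no whitespace. It lists `./registry` and `./git` explicitly, because a depth-1 directory that falls through to the default branch would be removed along with the children that are meant to be kept.

```shell
# Hypothetical helper: prune a cargo-home-like tree down to a keep-list.
# The body runs in a subshell so the caller's working directory is untouched.
groom_cargo_home() (
    cd "$1" || exit 1
    for dirname in $(find . -mindepth 1 -maxdepth 2 -type d); do
        case "$dirname" in
            ./bin|./registry|./registry/index|./registry/cache|./git|./git/db)
                continue ;;          # keep (parents listed so children survive)
            *) rm -rf "$dirname" ;;  # remove everything else
        esac
    done
)

# Invented layout for demonstration only.
demo_home="$(mktemp -d)"
mkdir -p "$demo_home/bin" \
         "$demo_home/registry/index" "$demo_home/registry/cache" \
         "$demo_home/registry/src" \
         "$demo_home/git/db" "$demo_home/git/checkouts"

groom_cargo_home "$demo_home"

# Show what survived the grooming pass.
find "$demo_home" -mindepth 1 -type d | sort
```

After the call, `registry/src` and `git/checkouts` are gone while `bin`, `registry/index`, `registry/cache`, and `git/db` remain.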
6 changes: 6 additions & 0 deletions contrib/cirrus/lib.sh
@@ -32,6 +32,12 @@ else # set default values - see make_cienv() below
# VM Images are built with this setup
CARGO_HOME="${CARGO_HOME:-/var/cache/cargo}"
source $CARGO_HOME/env

# Make caching more effective - disable incremental compilation,
# so that the Rust compiler doesn't waste time creating the
# additional artifacts required for incremental builds.
# Ref: https://github.com/marketplace/actions/rust-cache#cache-details
CARGO_INCREMENTAL=0
fi

# END Global export of all variables
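The `CARGO_INCREMENTAL=0` assignment only reaches `cargo` if it is exported. The "Global export of all variables" comment suggests the surrounding section handles that; the sketch below assumes it is done with `set -a` / `set +a`, which is not visible in this hunk. The variable `DEMO_NOT_EXPORTED` is hypothetical and exists only for contrast.

```shell
# Assumption: the enclosing section of lib.sh is bracketed by set -a / set +a.
set -a                  # every assignment from here on is exported automatically
CARGO_INCREMENTAL=0
set +a                  # back to normal, unexported assignments

DEMO_NOT_EXPORTED=1     # hypothetical variable, assigned outside the export window

# A child process sees only the exported variable.
child_view="$(sh -c 'echo "${CARGO_INCREMENTAL:-unset} ${DEMO_NOT_EXPORTED:-unset}"')"
echo "$child_view"      # prints "0 unset"
```

If the section were not exporting automatically, an explicit `export CARGO_INCREMENTAL=0` would be needed for the setting to affect child processes.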