docker layer caching support #26
Comments
CodeBuild does not currently have native support for Docker layer caching, though we are aware of the use case for it. In the meantime, have you tried using |
I'll try that and see how it goes. Thanks |
@deevus I'm curious if you had any luck with |
@ewolfe I tried it and it doesn't seem very effective. The time it takes to load/save negates any benefits, at least in my case. Here are the scripts I have written. If you try them out, perhaps you can find a way to make them work in your favour. It saves all the generated images after the build, since the docker host is empty (I assume) when the build runs. Apologies for the lack of comments.
cache-load.sh
#!/bin/bash
set -e
echo 'Loading docker cache...'
mkdir -p "$IMAGE_CACHE_PATH"
# List the cached image archives in a temp file
DOCKER_IMAGES_CACHE=$(mktemp)
find "$IMAGE_CACHE_PATH" -name '*.tar.gz' > "$DOCKER_IMAGES_CACHE"
# Load each archive; remove any archive that fails to load
while read -r file; do
  echo "$file"
  if ! docker load -i "$file"; then
    echo "Error loading docker image $file. Removing..."
    rm "$file"
  fi
done < "$DOCKER_IMAGES_CACHE"
rm $DOCKER_IMAGES_CACHE
cache-save.sh
#!/bin/bash
set -e
mkdir -p "$IMAGE_CACHE_PATH"
# IDs of all images currently on the docker host
DOCKER_IMAGES_NEW=$(mktemp)
docker images -q --no-trunc | awk -F':' '{print $2}' | sort > "$DOCKER_IMAGES_NEW"
# IDs of the images already saved in the cache directory
DOCKER_IMAGES_CACHE=$(mktemp)
find "$IMAGE_CACHE_PATH" -name '*.tar.gz' -printf '%f\n' | awk -F. '{print $1}' | sort > "$DOCKER_IMAGES_CACHE"
DOCKER_IMAGES_DELETE=$(mktemp)
DOCKER_IMAGES_SAVE=$(mktemp)
# Cached but no longer on the host: delete. On the host but not cached: save.
comm -13 "$DOCKER_IMAGES_NEW" "$DOCKER_IMAGES_CACHE" > "$DOCKER_IMAGES_DELETE"
comm -23 "$DOCKER_IMAGES_NEW" "$DOCKER_IMAGES_CACHE" > "$DOCKER_IMAGES_SAVE"
if [ "$(wc -l < "$DOCKER_IMAGES_DELETE")" -gt 0 ]; then
  echo Deleting docker images that are no longer current
  < "$DOCKER_IMAGES_DELETE" xargs -I % sh -c "echo Deleting extraneous image % && rm '$IMAGE_CACHE_PATH/%.tar.gz'"
  echo
fi
if [ "$(wc -l < "$DOCKER_IMAGES_SAVE")" -gt 0 ]; then
  echo Saving missing images to docker cache
  < "$DOCKER_IMAGES_SAVE" xargs -I % sh -c "echo Saving image % && docker save % | gzip -c > '$IMAGE_CACHE_PATH/%.tar.gz'"
  echo
fi
rm $DOCKER_IMAGES_NEW $DOCKER_IMAGES_CACHE $DOCKER_IMAGES_DELETE $DOCKER_IMAGES_SAVE
I don't know if I'm missing something here, but a couple of the intermediate containers still build from scratch anyway, which is what I was originally trying to avoid. EDIT: You need to set |
Do you run these on |
Yes that's correct |
I also need to cache the layers between build but my attempts have been so far unsuccessful. (tried to cache I have a question for you @jvusich, by
Do you mean that this is somewhere on the codebuild roadmap? :) |
As @jvusich mentioned, we are aware that we do not support this use case natively (only through the custom workarounds mentioned in this issue). We've also heard about this use case from other customers. Our roadmaps are decided primarily based on customer requests and use cases. So effectively it's on our radar, but we cannot comment on when it will be addressed. |
Thanks @deevus! Your handy shell script made this easier. I had to make a small change to properly cache all the layers in the build: I put my version of your script in these gists here: I also put these lines in my buildspec.yml:
|
@jabalsad How well does it work with that change? If my original script was missing a bunch of layers I would expect a decent improvement with your changes |
It speeds up the actual build significantly, however the The real reason I'm looking for the caching functionality is actually so that noop changes don't create a new image in ECR unnecessarily. |
This worked for me:
version: 0.2
phases:
  pre_build:
    commands:
      - docker version
      - $(aws ecr get-login --no-include-email)
      - docker pull $CONTAINER_REPOSITORY_URL:$REF_NAME || true
  build:
    commands:
      - docker build --cache-from $CONTAINER_REPOSITORY_URL:$REF_NAME --tag $CONTAINER_REPOSITORY_URL:$REF_NAME --tag $CONTAINER_REPOSITORY_URL:ref-$CODEBUILD_RESOLVED_SOURCE_VERSION .
  post_build:
    commands:
      - docker push $CONTAINER_REPOSITORY_URL
|
@monken This worked perfectly for me, my build time reduced from |
@monken After a couple of hours of trying, I found your solution just perfect. I managed to decrease build time by 60%! |
I think that the method mentioned by @monken (pull and cache-from) does not work with multi-stage builds because the pulled image does not have all the stages, but only the last one. |
@monken I can't get this to work. It keeps invalidating the cache, even at the base image 😢 |
@tvb double check your docker pull command is working, I noticed mine was failing due to not having the required IAM roles yet the build continued because the pull command is in the pre-build stage. |
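The failure mode described above (a `docker pull ... || true` that silently hides an IAM or login error) can be made visible without failing the build. The following is a minimal sketch, not taken from the thread; the function name `pull_cache_image` is hypothetical.

```shell
#!/usr/bin/env bash
# Pull an image to seed the layer cache, but report failures instead of hiding them.
# Usage: pull_cache_image <image-ref>   (hypothetical helper, not part of any tool)
pull_cache_image() {
  local image="$1"
  if docker pull "$image"; then
    echo "cache image pulled: $image"
  else
    # A missing tag is expected on the very first build; a permissions error is not.
    echo "WARNING: could not pull $image; the build will start with a cold cache" >&2
  fi
  # Always succeed so that a cold cache never fails the build.
  return 0
}
```

Calling `pull_cache_image "$CONTAINER_REPOSITORY_URL:$REF_NAME"` in `pre_build` keeps the build going either way, while the warning in the log makes it easier to notice that the cache was never seeded.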
@monken Worked Perfectly.. 👍 |
For anyone looking for a simple solution that works with multi-stage builds, I made a pretty simple build script that was able to meet my requirements. I was looking for a solution that would:
The basic process is this:
Here's the basic script:
#!/usr/bin/env bash
readonly repo=${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${REPO_NAME}
# Attempt to pull existing builder image
if docker pull ${repo}:builder; then
# Update builder image
docker build -t ${repo}:builder --target build --cache-from ${repo}:builder .
else
# Create new builder image
docker build -t ${repo}:builder --target build .
fi
# Attempt to pull latest target image
docker pull ${repo}:latest || true
# Build and push target image
docker build -t ${repo}:latest --cache-from ${repo}:builder --cache-from ${repo}:latest .
docker push ${repo}:latest
# Push builder image
docker push ${repo}:builder
The conditional logic is mainly there for clarity. The entire caching pattern can be simplified as:
docker pull ${repo}:builder || true
docker build -t ${repo}:builder --target build --cache-from ${repo}:builder .
docker pull ${repo}:latest || true
docker build -t ${repo}:latest --cache-from ${repo}:builder --cache-from ${repo}:latest .
docker push ${repo}:latest
docker push ${repo}:builder
This solution has been working well for me, and dramatically reduced our build times. It works with multiple concurrent builds and if any of the |
@monken just curious, is $REF_NAME pulling only the specific version/tag of that container in ECR, or are you pulling all the intermediate containers? if pulling all intermediates, can you describe how that works as that sounds good, but not sure it will work for my use case. |
@judahb it's only pulling the last container image (including all layers). It's more likely that you will have matching layers with the latest image than any image that's older. So there is probably not a huge gain in pulling all previous images. |
I also pull docker pull ${repo}:$(git rev-parse HEAD) || docker pull ${repo}:$(git rev-parse HEAD~1) || true |
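The fallback chain above generalises to "try a list of tags and use the first one that exists". A minimal sketch under that assumption follows; `pull_first_available` is a hypothetical helper name, and the usage lines are illustrative only.

```shell
#!/usr/bin/env bash
# Try each candidate tag in order and stop at the first one that pulls.
# Prints the tag that was pulled; prints nothing if none was available.
# Usage: pull_first_available <repo> <tag>...   (hypothetical helper)
pull_first_available() {
  local repo="$1"
  shift
  local tag
  for tag in "$@"; do
    if docker pull "${repo}:${tag}" >/dev/null 2>&1; then
      echo "$tag"
      return 0
    fi
  done
  # No candidate was found: not an error, the build just starts cold.
  return 0
}

# Illustrative usage: prefer the current commit, then its parent, then latest.
#   cache_tag=$(pull_first_available "$repo" "$(git rev-parse HEAD)" "$(git rev-parse HEAD~1)" latest)
#   [ -n "$cache_tag" ] && echo "seeding cache from ${repo}:${cache_tag}"
```

Returning the tag lets the caller pass exactly that image to `--cache-from`, rather than guessing which pull succeeded.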
For those using
In our case |
Yes indeed.... waiting over 30 minutes with This makes CodeBuild quite unusable... |
We have added support for local caching, including Docker layers. Documentation: https://docs.aws.amazon.com/codebuild/latest/userguide/build-caching.html#caching-local |
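For readers who manage projects from the command line, the local Docker layer cache can be switched on with the AWS CLI. This is a sketch under assumptions: the project name is a placeholder, the wrapper function is hypothetical, and, as far as I know, this cache mode needs a Linux environment running in privileged mode, so check the linked documentation before relying on it.

```shell
#!/usr/bin/env bash
# Enable the local Docker layer cache on an existing CodeBuild project.
# Usage: enable_local_docker_cache <project-name>   (hypothetical wrapper)
enable_local_docker_cache() {
  local project="$1"
  # JSON form of the --cache structure: cache type LOCAL, one cache mode.
  aws codebuild update-project \
    --name "$project" \
    --cache '{"type":"LOCAL","modes":["LOCAL_DOCKER_LAYER_CACHE"]}'
}

# Illustrative usage with a placeholder project name:
#   enable_local_docker_cache my-project
```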
Sorry, it's not clear how the new feature is working.
Can anyone explain the reason? How can I make CodeBuild use the cache even if I only run changes through the pipeline after a week?
My limited experience so far is that it caches for a very short period of time. If I start repeat builds within a few minutes of each other it seems to use the cache most of the time, but any longer than that and it usually doesn’t hit the cache at all. |
@dev-walker As explained in the documentation, the build cache is kept on local storage for maximum performance. When there are long intervals with no builds running, that underlying storage may be retired. Your first few builds in the morning may need to re-warm the cache if you ran very few builds overnight. |
Can someone explain how to use @monken's script step by step? Should I use it during the creation of the image or as the buildspec on CodeBuild? (Which means my image should have docker inside?) Sorry, I am far from being a DevOps guy. I am using a custom Windows image, pushed to AWS ECR. |
@josephvusich or @subinataws can we get documentation about local cache busting? Is it possible? Are there any plans to make it possible? Any recommended workaround? I would love a longer-lived cache, as I mentioned previously, but on very rare occasions I need to bust the local Docker layer cache to get the build passing. |
@deleugpn - you can override the cache setting (or for that matter any project setting) when invoking the startBuild. So if you don't want to use local cache for a particular build run, select "no cache". Replied to you on slack as well. |
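From the AWS CLI, the per-build override described above could look like the sketch below. The wrapper name and project name are placeholders; `--cache-override` is the `start-build` option that replaces the project's cache setting for a single build.

```shell
#!/usr/bin/env bash
# Start one build with caching disabled, leaving the project settings untouched.
# Usage: start_build_without_cache <project-name>   (hypothetical wrapper)
start_build_without_cache() {
  local project="$1"
  aws codebuild start-build \
    --project-name "$project" \
    --cache-override "type=NO_CACHE"
}

# Illustrative usage with a placeholder project name:
#   start_build_without_cache my-project
```

Subsequent builds started without the override go back to the project's configured cache.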
I'm hosting my custom built CodeBuild image on ECR and running off a base image hosted on ECR. The slow network transfer rate is what led me to caching. Local Caching seems to still be a blackbox. It's great when it hits, but when it misses, it's questionable why exactly it missed. I have tried to get more insight into the PROVISIONING stage to no avail. What exactly is going on with caching in terms of expiry and what it is caching? Could we have more visibility into the cache? |
Today, I tried to implement We use |
@gabrielenosso did you ever figure out how to adapt this to a Windows image? |
@StarpTech Have you reached a solution? |
docker 1.18.09 allowed BuildKit's automatic pull for caches. Blog post. Is there any plan to update from 1.24 to 1.25.x? It seems that would help with this issue. |
What's the point of local cache if 15 minutes is the maximum life span? Serious question! |
Is there a recommended way to cache Docker layers for longer than the life span of a CodeBuild build? |
Docker registries now allow layer caching. Unfortunately, this is not supported by ECR yet. aws/containers-roadmap#876 |
None of the examples in this thread worked for me, but I managed to find a solution. The key is to enable the BuildKit inline cache. A cut down example:
The first time this runs it will still rebuild entirely but it will place an inline cache in the finished image. Then every future build will use the inline cache and be much faster. It doesn't seem to like some parallel stages, i.e. you build in one stage and only copy out the final binary. It makes sense why they wouldn't be stored in the inline cache so you either need to store intermediary images as shown earlier in the thread or make your Dockerfile more linear. |
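The inline-cache approach described above can be sketched roughly as follows. This is not the commenter's original example, which did not survive in the thread; the function name and image reference are hypothetical. `DOCKER_BUILDKIT=1` and the `BUILDKIT_INLINE_CACHE=1` build argument are the standard BuildKit switches for this.

```shell
#!/usr/bin/env bash
# Build with BuildKit, embed cache metadata in the image, and reuse it next time.
# Usage: build_with_inline_cache <image-ref> [context]   (hypothetical helper)
build_with_inline_cache() {
  local image="$1"
  local context="${2:-.}"
  # Explicit pull: possibly redundant under BuildKit, but reports in this
  # thread differ on whether it can be dropped, so it is kept here.
  docker pull "$image" || true
  # BUILDKIT_INLINE_CACHE=1 writes the cache metadata into the pushed image.
  DOCKER_BUILDKIT=1 docker build \
    --build-arg BUILDKIT_INLINE_CACHE=1 \
    --cache-from "$image" \
    --tag "$image" \
    "$context"
  docker push "$image"
}
```

As noted above, the first run is still a full build; only images pushed with the inline cache enabled can seed later builds.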
Given that you're using - echo Pulling image for cache...
- docker pull $REPO_URI:$IMAGE_TAG || true
More info: |
I found removing this doesn't work. After ~20 mins once the CodeBuild cache expires it does a full rebuild again and there's no indication of anything getting pulled down. I guess you could grep docker images and only pull if needed but you run the risk of slowing down frequent builds with an increasingly stale cache. |
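The "only pull if needed" idea mentioned above could be sketched like this, using `docker image inspect` to test for a local copy instead of grepping `docker images`. The helper name is hypothetical, and the staleness risk described above still applies: a local copy is used even when the registry has a newer one.

```shell
#!/usr/bin/env bash
# Pull the cache image only when no local copy exists.
# Usage: pull_if_missing <image-ref>   (hypothetical helper)
pull_if_missing() {
  local image="$1"
  if docker image inspect "$image" >/dev/null 2>&1; then
    # Local cache hit: skip the network transfer (the copy may be stale).
    echo "using local copy of $image"
  else
    # Cold host: fall back to the registry, tolerating a missing tag.
    docker pull "$image" || true
  fi
}
```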
Based on this bugreport and this source I came up with the below solution.
|
I documented what worked for me here; it is very close to @RonaldTechnative's solution. Basically, I did not have to install buildx, I just had to create a builder |
Since an example for the registry cache storage backend is missing, I am providing mine. It is similar to the
Installing |
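A registry cache backend build, in general terms, looks something like the sketch below. It is not the commenter's configuration, which is not preserved here; the function name and both image references are hypothetical. It assumes a registry that accepts BuildKit cache manifests (the thread notes that ECR did not at the time) and, as far as I know, a `docker-container` builder rather than the default `docker` driver.

```shell
#!/usr/bin/env bash
# Build with buildx, storing and reading layer cache under a separate registry ref.
# Usage: build_with_registry_cache <image-ref> <cache-ref> [context]   (hypothetical)
build_with_registry_cache() {
  local image="$1"
  local cache_ref="$2"
  local context="${3:-.}"
  # Create and select a builder; on an ephemeral build host this runs once per build.
  docker buildx create --use --driver docker-container >/dev/null
  # mode=max exports cache for all stages, not only the final image's layers.
  docker buildx build \
    --cache-from "type=registry,ref=${cache_ref}" \
    --cache-to "type=registry,ref=${cache_ref},mode=max" \
    --tag "$image" \
    --push \
    "$context"
}
```

Keeping the cache under its own ref (for example a `:cache` tag) means intermediate stages of a multi-stage build can be cached without pushing a separate builder image.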
Thanks @Janosch |
How can I enable caching of docker layers between builds? Caching is one of the biggest benefits of multi-stage builds but in CodeBuild it runs every step every time.