[v0.13] fail to export image when using rewrite-timestamp=true #4793
Comments
Please attach a (minimal and yet self-contained) reproducer
Admittedly I cannot reproduce this in a minimal self-contained environment. Closing the issue until I figure out what's up.
We have a very similar issue with v0.13.1 with command:
Stack trace:
@emalihin did you find a solution for your problem?
I've rolled back to 0.13.0-beta1 for now. Are both of your
I see. In this specific example
Actually, I noticed this started happening after the upgrade from
My guess is that the cache is somehow "poisoned": once a cache hits the issue, I keep seeing it until I manually delete the cache. I also build multiple images in parallel; some images hit the issue and others do not, but I have not been able to find any pattern in the behavior.
Might be a regression in #4663 (v0.13.0-rc2) ?
In my testing of
I'm testing v0.13.0-rc1 and currently I'm not seeing the issue (but previously I noticed that it could happen after a while, so I'm not 100% sure that rc1 doesn't have this issue).
I didn't check that. Our CI generates different cache tags if we change the BuildKit version, so I'm not able to test this scenario.
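The commenter does not show how their CI derives a cache tag from the BuildKit version, so the following is only a minimal sketch of one way to do it. The function name cache_ref, the "buildcache-" prefix, and the example registry are hypothetical; the version string is passed in as an argument rather than queried from the builder.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: build a registry cache ref whose tag embeds the
# BuildKit version, so a version change starts from a fresh cache.
cache_ref() {
  local repo="$1" version="$2"
  # Image tags may only contain [A-Za-z0-9_.-]; replace anything else with '-'.
  local safe_version="${version//[^A-Za-z0-9_.-]/-}"
  printf '%s:buildcache-%s\n' "$repo" "$safe_version"
}
```

For example, cache_ref registry.example.com/app v0.13.1 yields a ref that could be used as the ref= value of --cache-to and --cache-from. A side effect of this scheme, as noted above, is that a cache written by one BuildKit version is never read by another, which makes cross-version cache problems hard to test.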
This PR may fix your issue too
Happy to test it as soon as a build is available. Thank you.
I found a way to repro it on
Example Dockerfile:
FROM debian:bookworm as base
FROM debian:bookworm-slim
RUN echo "foo"
Might need to reopen this or file a new issue?
Thanks for reporting, but I can't repro the issue with BuildKit v0.13.2 on Ubuntu 24.04
Here's a script to more precisely repro what I've seen (using a
#!/usr/bin/env bash
set -xeuo pipefail
cd "$(mktemp -d)"
cat >Dockerfile <<'EOF'
FROM debian:bookworm-slim AS base
FROM debian:bookworm
EOF
OUTPUT_IMAGE="<IMAGE FROM PRIVATE REGISTRY>"
CACHE_IMAGE="<DIFFERENT IMAGE FROM PRIVATE REGISTRY>"
docker buildx build \
--pull \
--output "name=$OUTPUT_IMAGE,push=true,rewrite-timestamp=true,type=image" \
--cache-to "ref=$CACHE_IMAGE,image-manifest=true,mode=max,oci-mediatypes=true,type=registry" \
--platform linux/arm64/v8,linux/amd64 \
.
That yields this error:
This can probably be reduced to a more minimal repro, but this was the first config that reproduced the error as I gradually added configuration from my full build pipeline. My build pipeline had a 100% failure rate when running on my AWS CI builders, but appears to fail inconsistently (race condition?) when I build on my MacBook. Either way, removing the unused stage from the Dockerfile eliminates the error.
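Since the reported trigger is a named stage that nothing else references, a rough check for such stages can help spot affected Dockerfiles. This is a heuristic sketch, not a fix for the underlying bug: the function name unused_stages is hypothetical, and it ignores line continuations, ARG substitution in stage names, and stages selected with --target.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print named stages that no later instruction references.
# The final stage is never reported, since it is the default build target.
unused_stages() {
  awk '
    {
      # Record every stage referenced through --from=<name> (COPY or RUN --mount).
      rest = $0
      while (match(rest, /--from=[A-Za-z0-9_.-]+/)) {
        used[tolower(substr(rest, RSTART + 7, RLENGTH - 7))] = 1
        rest = substr(rest, RSTART + RLENGTH)
      }
    }
    toupper($1) == "FROM" {
      nfrom++
      i = 2
      while (i <= NF && $i ~ /^--/) i++      # skip flags such as --platform
      used[tolower($i)] = 1                  # "FROM <stage>" counts as a reference
      if (toupper($(i + 1)) == "AS" && i + 2 <= NF) {
        name[++n] = tolower($(i + 2))
        index_of[n] = nfrom
      }
    }
    END {
      for (k = 1; k <= n; k++)
        if (!(name[k] in used) && index_of[k] != nfrom) print name[k]
    }
  ' "$1"
}
```

Run against the two-line Dockerfile from the script above, unused_stages Dockerfile reports the stage named base, which is the stage whose removal made the error go away.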
Could that be related to containerd/containerd#10187 @AkihiroSuda ?
Minimal repro:
(Needs
I also see an occasional panic in the daemon log
Any movement or changes to this @AkihiroSuda?
Not yet, PR is welcome
Hello,
I've been running BuildKit 0.13.0-beta1 for a while as a docker-container driver to get caching ECR registry integration and timestamp rewriting for reproducibility. This worked well for a couple of months.
Today I tried upgrading BuildKit to the stable 0.13.0 and 0.13.1 releases with the same docker-container setup, and started encountering these errors:
For some reason jobs that use this cached layer fail to export it.
Once I delete the cache from ECR - 1st build passes, and subsequent builds fail.
Everything works again once I revert to using BuildKit 0.13.0-beta1, with new and existing caches.
This is the command I use:
Edit: similar to this issue, but I'm not doing COPY --link