The cache export step hangs #537

Open
Xplouder opened this issue Feb 8, 2021 · 11 comments

@Xplouder

Xplouder commented Feb 8, 2021

Hi,

First of all, sorry if this is a double post, but since the other reports I found are a bit old and have no recent activity, I decided to try to sum it all up here:

It looks like the "preparing build cache for the export" step is hanging, which points to some kind of bug here:
[image]
In my latest attempts, it sat there for more than 1 hour with no CPU usage, just stuck. This means that, apart from the inline cache type, the other cache types are unusable.

How to reproduce:

  • Dockerfile with a multi-stage build
  • cache-to with the local or registry type, with mode default/max
  • the first export works well; however, on the second run (maybe due to cache-from) it hangs indefinitely

notes:

  • I also used a multi-platform build but I don't think it's related
  • tested in a GitLab CI pipeline

Samples

Here are the build commands that I used, with some parts redacted:

local type:

    - docker buildx build
      --cache-from=type=local,src=docker_cache/
      --cache-to=type=local,dest=docker_cache/
      --ssh default=...
      --output type=image,name=registry.dev:foo,push=true
      --platform=linux/amd64,linux/arm,linux/arm64
      .

registry type:

    - docker buildx build
      --cache-from=type=registry,ref=registry.dev/cache
      --cache-to=type=registry,ref=registry.dev/cache
      --ssh default=...
      --output type=image,name=registry:foo,push=true
      --platform=linux/amd64,linux/arm,linux/arm64
      .
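
For anyone trying to turn the steps above into a runnable reproducer, here is a minimal sketch of a deadline wrapper that makes an indefinite hang fail fast instead of blocking the CI job. It assumes GNU coreutils `timeout` is available on the runner; the helper name `run_with_deadline` and the 1800-second limit in the usage comment are hypothetical choices, not part of buildx.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command with a deadline so that a hung
# cache export is reported as a failure rather than stalling the job.
run_with_deadline() {
  local seconds="$1"
  shift
  local status=0
  # timeout(1) from GNU coreutils exits with 124 when the deadline is hit.
  timeout "${seconds}" "$@" || status=$?
  if [ "${status}" -eq 124 ]; then
    echo "command exceeded ${seconds}s; it probably hung" >&2
  fi
  return "${status}"
}

# Example usage (assumed limit of 1800 seconds; flags taken from the
# local-type sample above). Run it twice in a row:
#
#   run_with_deadline 1800 docker buildx build \
#     --cache-from=type=local,src=docker_cache/ \
#     --cache-to=type=local,dest=docker_cache/ \
#     .
```

Running the same command twice matches the steps above: the first run populates the cache, and the second run is the one reported to hang. An exit status of 124 means the deadline was hit.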

Other reports that might be related:

@tonistiigi
Member

please post a runnable reproducer

@umonaca

umonaca commented Mar 17, 2021

Same here. There seem to be a lot of similar issues here.
I have been stuck with output=type=local,dest=path as well.
I can reproduce the issue, but I don't know how to make a minimal reproducer. All I know is that it gets stuck at "copying file" after the image is built. BTW, it is a multi-platform build: the arm64 image is exported successfully, but armv7 always gets stuck in the copying-to-output stage, after the image is built successfully.

@awakecoding

I have been trying to figure out for the entire day why a simple docker buildx filesystem export hangs in GitHub Actions, while it works just fine locally in WSL2. Maybe this is the same issue? https://twitter.com/awakecoding/status/1430252223771054084

@tonistiigi
Member

opened grpc/grpc-go#4722

@tonistiigi
Member

If someone can make a reproducer using --cache-to that fails, on a reproducible system, in a way similar to what @awakecoding saw with -o type=local, I could look into whether it is the same problem. I still don't quite understand what the difference between local and tar output is if it breaks at the gRPC level. It could be that local transfers files individually, whereas neither type=tar nor --cache-to does. @bendavies

@hectorj-klaxoon

hectorj-klaxoon commented Jan 27, 2022

I have a similar issue.
It doesn't happen all the time, and I'm not sure what triggers it so I can't give a reproducer.
The build fully uses the --cache-from (all steps are marked as CACHED), which points to the same registry, image, and tag as --cache-to, so I don't think anything actually needs to be pushed.

Deleting the --cache-to tag from the registry allows the next build to succeed.

Sorry, I don't have much more information.

@worldspawn

I saw this immediately after adding mode=max. I'm caching to/from a registry.

@arikmaor

arikmaor commented Aug 3, 2022

Any fix?

@bbednarek

@tonistiigi I created a reproducible example in the https://github.com/bbednarek/multiple-docker-build repo, along with the workaround that we adopted (OCI layout).

I am basically building a Docker image in 2 steps, using 2 different Dockerfiles: Docker.builder and Docker.
You can find 2 different workflow files that use 2 different ways to build the final Docker image (you can also run them manually and override the default target platform):

On top of that, you can find a Makefile that contains 3 self-descriptive jobs to run it locally:

  • buildx-docker -> make clean buildx-docker took around 50 min to complete
  • buildx-docker-oci -> make clean buildx-docker-oci took around 30 min to complete
  • build-docker -> make clean build-docker took around 5 min to complete

@tonistiigi
Member

> I saw this immediately after adding mode=max. I'm caching to/from registry,

This issue is only about type=local. It was traced to #537 (comment)

@jjhuff

jjhuff commented Jan 18, 2024

@tonistiigi Unfortunately, grpc/grpc-go#4722 was closed as stale. At this point, our cache exports are so slow that we might as well not use Docker caching at all.
