v0.11.0 images pull had ErrImagePull error #211

Open
Mossaka opened this issue Feb 13, 2024 · 8 comments

Comments

@Mossaka
Member

Mossaka commented Feb 13, 2024

  Type     Reason            Age                     From               Message
  ----     ------            ----                    ----               -------
  Warning  FailedScheduling  7m25s                   default-scheduler  0/3 nodes are available: 3 node(s) had untolerated taint {node.kubernetes.io/not-ready: }. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..
  Normal   Scheduled         7m23s                   default-scheduler  Successfully assigned default/wasm-lunatic-764c4f46d4-5b4zt to k3d-wasm-cluster-agent-1
  Normal   Pulling           5m51s (x4 over 7m22s)   kubelet            Pulling image "ghcr.io/deislabs/containerd-wasm-shims/examples/lunatic-submillisecond:v0.11.0"
  Warning  Failed            5m51s (x4 over 7m22s)   kubelet            Failed to pull image "ghcr.io/deislabs/containerd-wasm-shims/examples/lunatic-submillisecond:v0.11.0": rpc error: code = NotFound desc = failed to pull and unpack image "ghcr.io/deislabs/containerd-wasm-shims/examples/lunatic-submillisecond:v0.11.0": no match for platform in manifest: not found
  Warning  Failed            5m51s (x4 over 7m22s)   kubelet            Error: ErrImagePull
  Warning  Failed            5m38s (x6 over 7m21s)   kubelet            Error: ImagePullBackOff
  Normal   BackOff           2m15s (x20 over 7m21s)  kubelet            Back-off pulling image "ghcr.io/deislabs/containerd-wasm-shims/examples/lunatic-submillisecond:v0.11.0"
@Mossaka
Member Author

Mossaka commented Feb 13, 2024

I think I need to change the platform back to amd64 and arm64 for the images that we publish.

https://github.com/deislabs/containerd-wasm-shims/blob/main/.github/workflows/docker-build-push.yaml#L79

What do you think? @devigned, @jsturtevant

@devigned
Member

I believe you are correct about the origin of the error, but I'm not sure about the platform / arch. @jsturtevant wdyt?

@jsturtevant
Contributor

It looks like this is because it is built as an image index by buildx:

 regctl manifest get ghcr.io/deislabs/containerd-wasm-shims/examples/lunatic-submillisecond:v0.11.0
Name:                            ghcr.io/deislabs/containerd-wasm-shims/examples/lunatic-submillisecond:v0.11.0
MediaType:                       application/vnd.oci.image.index.v1+json
Digest:                          sha256:40c27dda0433770fb84cf8404a0904288e557db74dda2e42d6f70efd09aba82f

Manifests:

  Name:                          ghcr.io/deislabs/containerd-wasm-shims/examples/lunatic-submillisecond:v0.11.0@sha256:cd20f3be0de911ad8eb7551ef4ccdfc16040ad35a47f7633786b80d3a6b76598
  Digest:                        sha256:cd20f3be0de911ad8eb7551ef4ccdfc16040ad35a47f7633786b80d3a6b76598
  MediaType:                     application/vnd.oci.image.manifest.v1+json
  Platform:                      wasi/wasm

  Name:                          ghcr.io/deislabs/containerd-wasm-shims/examples/lunatic-submillisecond:v0.11.0@sha256:f587a7a300a7ec1b7f0c404bca38c337dfb79e6f2b93238daac25f963e5008c6
  Digest:                        sha256:f587a7a300a7ec1b7f0c404bca38c337dfb79e6f2b93238daac25f963e5008c6
  MediaType:                     application/vnd.oci.image.manifest.v1+json
  Platform:                      unknown/unknown
  Annotations:
    vnd.docker.reference.digest: sha256:cd20f3be0de911ad8eb7551ef4ccdfc16040ad35a47f7633786b80d3a6b76598
    vnd.docker.reference.type:   attestation-manifest

Switching the platform arch back to amd64 would allow this to be pulled as an index. Otherwise, I believe disabling attestation in buildx would produce a single image manifest instead of an index, and it would pull properly.

The other option would be to use the digest directly like: ghcr.io/deislabs/containerd-wasm-shims/examples/lunatic-submillisecond:v0.11.0@sha256:cd20f3be0de911ad8eb7551ef4ccdfc16040ad35a47f7633786b80d3a6b76598
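
To illustrate why the pull fails, here is a minimal Python sketch of the platform selection step. It is a simplification and not containerd's actual matcher: it only compares os/architecture pairs against the two index entries from the regctl output above, and it assumes the node reports linux/amd64. The function name `select_manifest` and the dictionary layout are hypothetical.

```python
from typing import Optional

# Entries from the regctl output above, reduced to the fields that matter here.
INDEX_MANIFESTS = [
    {
        "digest": "sha256:cd20f3be0de911ad8eb7551ef4ccdfc16040ad35a47f7633786b80d3a6b76598",
        "platform": {"os": "wasi", "architecture": "wasm"},
    },
    {
        # Attestation manifest added by buildx.
        "digest": "sha256:f587a7a300a7ec1b7f0c404bca38c337dfb79e6f2b93238daac25f963e5008c6",
        "platform": {"os": "unknown", "architecture": "unknown"},
    },
]


def select_manifest(manifests, os_name: str, arch: str) -> Optional[str]:
    """Return the digest of the first entry matching os/arch, or None."""
    for entry in manifests:
        platform = entry["platform"]
        if platform["os"] == os_name and platform["architecture"] == arch:
            return entry["digest"]
    return None


# Assumed node platform: linux/amd64.
print(select_manifest(INDEX_MANIFESTS, "linux", "amd64"))
# A client asking for wasi/wasm would match the first entry.
print(select_manifest(INDEX_MANIFESTS, "wasi", "wasm"))
```

Under these assumptions the first lookup finds nothing, which corresponds to the "no match for platform in manifest" error, while pulling by digest skips the selection step entirely.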

@mboersma

mboersma commented Feb 14, 2024

Can I ask that in the future, if there are problems with a release, we follow convention and issue a new patch release?

Tags and releases are generally considered write-once. There are excellent reasons for not replacing an existing release in-line, even if it's completely broken. For example: this PR which passed initially but began failing once v0.11.0 suddenly meant something different:
kubernetes-sigs/image-builder#1405

It was minutes from merging, and would have been DOA if we had merged it.

I can update it to match new SHAs, but TBH I have low confidence that will work, and my incentive to keep up with releases is reduced. (This isn't the first time a wasm-shims release has been updated inline with new binaries.)

Is there an additional test or release gate we could add to make sure this doesn't happen in the future? I'd love to help!

@mboersma

mboersma commented Feb 14, 2024

Also it appears the v0.11.0 release is incomplete now: it only has three of the expected eight binary packages.


@Mossaka
Member Author

Mossaka commented Feb 14, 2024

> Can I ask that in the future, if there are problems with a release, we follow convention and issue a new patch release?

My bad. Yes, of course.

> Is there an additional test or release gate we could add to make sure this doesn't happen in the future? I'd love to help!

Given that there is a high probability that a release will fail, in the future I am going to push release candidates first, verify that they work, and then push the main tag. Does this sound like a plausible approach?
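
A minimal Python sketch of how such a gate could look, combining the release-candidate idea with the write-once rule for tags. Everything here is an assumption for illustration: the thread does not name a tag scheme, so the `-rc.N` suffix, the function names `final_tag_for` and `can_promote`, and the `checks_passed` flag (standing in for whatever verification is run against the candidate) are all hypothetical.

```python
import re

# Hypothetical tag scheme: vMAJOR.MINOR.PATCH with an optional -rc.N suffix.
TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:-rc\.(\d+))?$")


def final_tag_for(rc_tag: str) -> str:
    """Map a release-candidate tag such as v0.11.1-rc.1 to its final tag v0.11.1."""
    match = TAG_PATTERN.match(rc_tag)
    if match is None or match.group(4) is None:
        raise ValueError(f"not a release-candidate tag: {rc_tag}")
    major, minor, patch = match.group(1, 2, 3)
    return f"v{major}.{minor}.{patch}"


def can_promote(rc_tag: str, existing_tags: set, checks_passed: bool) -> bool:
    """Allow promotion only if checks passed and the final tag was never published."""
    final_tag = final_tag_for(rc_tag)
    return checks_passed and final_tag not in existing_tags


# v0.11.0 is already published, so a fix has to go out as a new patch release.
published = {"v0.11.0"}
print(can_promote("v0.11.1-rc.1", published, checks_passed=True))
print(can_promote("v0.11.0-rc.2", published, checks_passed=True))
```

The second check refuses promotion because the final tag already exists, which keeps a published tag from being replaced in place.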

@mboersma

> Does this sound like a plausible approach?

It does indeed. Sorry there's so much manual work involved.

(I'm also planning to write an end-to-end test to verify that the wasm-shims are working in the real world--Cluster API for Azure--but that wouldn't catch problems until much later.)

@Mossaka
Member Author

Mossaka commented Feb 15, 2024

Just released v0.11.1
