
question: does crane copy support multiarch image built with docker buildx? #1320

Closed
ericbl opened this issue Mar 16, 2022 · 19 comments
Labels: lifecycle/stale, question (Further information is requested)

Comments

@ericbl

ericbl commented Mar 16, 2022

I would like to use crane to make deployment to different registries easier.
I have one GitLab repo, with one pipeline building a multi-arch image and pushing it to the GitLab registry.

docker buildx build --platform "linux/amd64,linux/arm64" --build-arg REGISTRY=$TARGET_REGISTRY -t $TARGET_REGISTRY_WITH_NAMESPACE/$IMAGE_NAME:$TARGET_IMAGE_TAG --push .

and then I would like to push this multi-arch image to the AWS ECR registry (actually a job used twice with 2 different target registries).

- crane auth login $CI_REGISTRY -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD
- aws ecr get-login-password --region $AWS_REGION | crane auth login --username AWS --password-stdin $TARGET_REGISTRY
- crane cp $CI_REGISTRY_IMAGE/$IMAGE_NAME:$IMAGE_TAG $TARGET_REGISTRY_WITH_NAMESPACE/$IMAGE_NAME:$TARGET_IMAGE_TAG

The auth to the 2 registries (GitLab and AWS) seems to be OK, but the crane cp always fails with a 400:

2022/03/15 15:28:40 failed to copy index: HEAD <aws-registry>/v2/<aws-namespace>/manifests/sha256:fwed95...: unsupported status code 400

Any idea why this 400? It could be more of a question for AWS support.

But in the first place, is it supposed to work? Does crane cp support copying a multi-arch image?

@ericbl added the question (Further information is requested) label Mar 16, 2022
@imjasonh
Collaborator

crane definitely supports copying multiplatform images.

Can you share more about your manifest? (crane manifest $CI_REGISTRY_IMAGE/$IMAGE_NAME:$IMAGE_TAG) That might help identify what's causing ECR to reject it.

You could also narrow down the issue with crane cp $src $dst --platform=linux/amd64 and then again with --platform=linux/arm64. This won't copy the multiplatform image, but if copying both single-platform images works (or doesn't) that tells us something about the problem.

This might be an issue on ECR's side, but there's stuff we can try to find out.

@imjasonh
Collaborator

Hmm I just noticed it's getting the 400 on a HEAD request -- does ECR not support HEAD requests?

@ericbl
Author

ericbl commented Mar 16, 2022

Thanks Jason for your help.
The crane manifest command gives nothing unexpected, I think:

{
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "schemaVersion": 2,
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "digest": "sha256:39f60801785f3ea3ef242c628d0a268c834e7575b974d318b2abb057ab32d251",
         "size": 3679,
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "digest": "sha256:103659be961702c38658edf4caa4bb051650635a6918cb02bb22cc71e6425c54",
         "size": 3679,
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      }
   ]
}
$ crane cp --platform=linux/amd64 $CI_REGISTRY_IMAGE/$IMAGE_NAME:$IMAGE_TAG $TARGET_REGISTRY_WITH_NAMESPACE/$IMAGE_NAME:$TARGET_IMAGE_TAG
2022/03/16 11:59:00 Copying from <CI_REGISTRY_IMAGE>/<image-name> to <TARGET_REGISTRY_WITH_NAMESPACE><image-name>:crane-test
2022/03/16 11:59:01 failed to copy image: HEAD <aws-registry>/v2/<aws-namespace>/<image-name>/blobs/sha256:1c9a8b42b5780ac49c71f39...: unsupported status code 400 Bad Request (HEAD responses have no body, use GET for details)

I actually don't understand why AWS triggers a HEAD request here.

@ericbl
Author

ericbl commented Mar 16, 2022

I am using crane copy to push to AWS ECR successfully in another project, but with images built with kaniko, and using an IAM role from AWS.
Here, neither kaniko nor an IAM role is set.

@imjasonh
Collaborator

I actually don't understand why AWS triggers a HEAD request here.

crane issues a HEAD request before pushing a manifest or blob, to see whether the registry already has it. The registry is supposed to respond with a 200 or 404 to let us know whether to proceed with a POST to send it up. I'm not sure why it's giving a 400 though -- in this latest example, it's a 400 on the blob HEAD, in your original report it was for a manifest. It's probably something affecting both paths. I suspect auth... 🤔
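The check-then-upload flow described above can be sketched roughly like this (a Python sketch with a hypothetical helper name; the real logic lives in go-containerregistry's pkg/v1/remote):

```python
# Rough sketch of crane's pre-push existence check (hypothetical helper;
# not the actual go-containerregistry code).

def should_upload(head_status: int) -> bool:
    """Decide whether to POST a blob/manifest based on the HEAD status."""
    if head_status == 200:
        return False  # registry already has it; skip the upload
    if head_status == 404:
        return True   # not present yet; proceed with the POST
    # Anything else (like the 400 seen in this thread) is unexpected and
    # surfaces as an error such as "unsupported status code 400".
    raise RuntimeError(f"unsupported status code {head_status}")
```

A 401/403 from a misconfigured registry auth can also show up at this step, which is why the HEAD is the first place the failure becomes visible.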

It's odd that you're using Kaniko, because Kaniko uses the same underlying registry client code in pkg/v1/remote to push, and it's successful. What version of Kaniko are you using?

Just in case it matters, could you make sure you're using the latest crane release, and maybe even try go install github.com/google/go-containerregistry/cmd/crane@main to get the absolute latest version, and see if that works.

@ericbl
Author

ericbl commented Mar 16, 2022

Thanks for the explanation of HEAD.
Under which conditions does the POST occur? I suppose 404, or 200 with a different sha / different timestamp?

The other project using Kaniko has different auth indeed.
I would like to build with Kaniko here too but it seems not possible to build multi arch image with Kaniko ( GoogleContainerTools/kaniko#786 )

Yes, regarding the 400 on blob vs. manifest: the manifest case occurs for an image that already exists in the repo (pushed with docker buildx build --push), while the blob case here occurs for a different image name. The related AWS repo should already be created, but I might add an extra step for that.

So I'll check my auth here again. I thought I had already fixed it, since I used to get a 401 and 403 before.

OK, I'll get the latest crane.

@imjasonh
Collaborator

Under which conditions does the POST occur? I suppose 404, or 200 with a different sha / different timestamp?

If the response from the HEAD is 404, that means it doesn't currently have the manifest, so we POST it. If it's 200, the registry has the manifest and we can skip uploading it. The registry API is based on content-addressed storage, so if the manifest changes at all -- including new source inputs producing new layers, or non-deterministic build inputs like the current time being different from previous builds -- a new manifest will be produced, which the registry has never seen before, the HEAD will return 404, and the manifest will be POSTed.
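To illustrate the content-addressed part: any change to the manifest content yields a new digest, which the registry has never seen, so the next HEAD returns 404. A minimal sketch (illustration only; real registries hash the exact manifest bytes as uploaded, not a re-serialization like this one):

```python
import hashlib
import json

def manifest_digest(manifest: dict) -> str:
    # Illustration only: registries digest the exact uploaded bytes;
    # json.dumps is used here just to get deterministic bytes to hash.
    payload = json.dumps(manifest, sort_keys=True).encode()
    return "sha256:" + hashlib.sha256(payload).hexdigest()

base = {"schemaVersion": 2, "layers": [{"digest": "sha256:aaa", "size": 3679}]}
rebuilt = {"schemaVersion": 2, "layers": [{"digest": "sha256:bbb", "size": 3679}]}

# A changed layer (new source input, a timestamp baked into a layer, ...)
# produces a different manifest digest, hence a 404 on the next HEAD.
assert manifest_digest(base) != manifest_digest(rebuilt)
```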

The related aws repo should be already created, but I might add an extra step for that.

This might be the cause. Try crane cp --verbose to see if there's a better log message being swallowed along the way; that would help us debug it.

@ericbl
Author

ericbl commented Mar 16, 2022

As said earlier, I want to push to 2 different AWS ECR repos.

For the first one, I use:
- aws ecr get-login-password --region $AWS_REGION | crane auth login --username AWS --password-stdin $TARGET_REGISTRY

The get-login-password call should use the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY variables set as GitLab variables.

For the 2nd one, I have a script that generates an AWS token based on OAuth2 credentials
and eventually logs in with
crane auth login $TARGET_REGISTRY -u AWS -p $AWS_TOKEN

Both solutions should be OK, shouldn't they?

Edit: apparently not; I indeed see a 401 with crane cp --verbose before the 400 mentioned above.

@imjasonh
Collaborator

I don't personally know enough about ECR's auth to know whether one should work and not the other.

Which form of login produces the 401/400? Or do they both fail?

@ericbl
Author

ericbl commented Mar 17, 2022

Both should work, but in my case, neither of them does!
However, I found a solution via ~/.docker/config.json.
As said, for my 2nd AWS registry, I generate a token and export it into the config file:
echo "{\"auths\":{\"$REGISTRY\":{\"auth\":\"$TOKEN\"}}}" > ~/.docker/config.json

I realized that crane is actually automatically/magically READING this file and using it for auth. Thus I don't need an extra auth login.

I also see that this is probably quite new, since an older version of crane did not do so.

Please confirm whether reading ~/.docker/config.json is expected behavior.
And please enhance the documentation in
https://github.com/google/go-containerregistry/blob/main/cmd/crane/README.md
It would help many other users to have a note about crane auth login and, if this is expected, a mention of ~/.docker/config.json as an alternative.

@ericbl
Author

ericbl commented Mar 17, 2022

And for the first AWS registry, I still needed to find the easiest way to use $AWS_ACCESS_KEY_ID and $AWS_SECRET_ACCESS_KEY with
crane auth login without creating an extra IAM role.
Edit: it is actually working fine with
aws ecr get-login-password --region $AWS_REGION | crane auth login --username AWS --password-stdin $TARGET_REGISTRY
when I set the proper region!

@imjasonh
Collaborator

crane auth login updates your ~/.docker/config.json to set the username and password you provide. It then uses the auth configured in that file to make requests to registries.

(docker login does the same thing, crane does it for compatibility with docker)

I'll try to make docs describe this interaction better. I'm pretty sure this has been crane's behavior for private registries forever though.
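For reference, the auths entry that docker login and crane auth login write to ~/.docker/config.json stores base64("username:password") under the registry hostname. A minimal sketch (registry URL and credentials below are placeholders):

```python
import base64
import json

def docker_config_entry(registry: str, username: str, password: str) -> dict:
    # The "auth" field is base64("username:password"), which is what
    # `docker login` and `crane auth login` store in ~/.docker/config.json.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return {"auths": {registry: {"auth": token}}}

# Placeholder registry and credentials, mirroring the ECR case in this thread.
cfg = docker_config_entry(
    "123456789012.dkr.ecr.eu-west-1.amazonaws.com",
    "AWS",
    "token-from-get-login-password",
)
print(json.dumps(cfg, indent=2))
```

Writing this file by hand (as in the echo above) and letting crane read it is equivalent to running crane auth login with the same credentials.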

@ericbl
Author

ericbl commented Mar 17, 2022

Here, I did NOT use crane auth login to update config.json, but did the opposite: I provided the ~/.docker/config.json file myself and saw crane reading it.

@schurteb

schurteb commented Mar 18, 2022

I would like to build with Kaniko here too but it seems not possible to build multi arch image with Kaniko ( GoogleContainerTools/kaniko#786 )

In case you'd prefer to have the images built with kaniko:
you can compose multi-arch manifests from kaniko's single-arch builds using github.com/estesp/manifest-tool.

Using this in a job between my build and deploy stages enables me to copy the multi-arch image in a single deploy job from GitLab to the appropriate ECR instance, even though kaniko itself builds single-arch images only.

@github-actions

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Keep fresh with the 'lifecycle/frozen' label.

@notSoWiseOldMan

I am running into this same issue with a private docker registry:

Docker Hub Image:

$ docker manifest inspect --verbose docker.io/php:7.4-cli-alpine3.15 | grep arch
				"architecture": "amd64",
				"architecture": "arm",
				"architecture": "arm",
				"architecture": "arm64",
				"architecture": "386",
				"architecture": "ppc64le",
				"architecture": "s390x",

Mutate:

$ crane mutate --platform all docker.io/php:7.4-cli-alpine3.15 --label="[import-time=$(date)]" --tag=my-registry.com:5000/test/muatate/php:7.4-cli-alpine3.15
2022/07/25 12:08:50 existing blob: sha256:faa6466ede32c2154fffaa365000f3cad688fef630bd5b563a7c38d62f5eaea4
2022/07/25 12:08:50 existing blob: sha256:ab6db1bc80d0a6df92d04c3fad44b9443642fbc85878023bc8c011763fe44524
2022/07/25 12:08:50 existing blob: sha256:1818b5634af7764c7d3d6723f36171a7ec2258eec24a5efda9a9b0f8a6814952
2022/07/25 12:08:50 existing blob: sha256:ef232e5ed4f87fe7268fa8b88215c59508bd35458465149604add5489293c609
2022/07/25 12:08:51 existing blob: sha256:7fe139163bdd891621bd0c7945e959bcd8ae5b214ef1e6d5fae93e92d4f7e321
2022/07/25 12:08:51 existing blob: sha256:f17e53d95947e37e411c33fadfe2f2062365fadd5437d7383b4f2b8607eece6d
2022/07/25 12:08:51 existing blob: sha256:f8439b6d7b874032dfe4388dac555182e42de6424037f971c851fbb1eb7d46b0
2022/07/25 12:08:52 existing blob: sha256:8e3c9ddcd2df619327304606048ce9b9f6e333294e9ce10c5f1162914838765d
2022/07/25 12:08:52 existing blob: sha256:36583b481ffa4175e4367f33547fc1dd870688a529edb9a95773a0ea42a50ded
2022/07/25 12:08:52 pushed blob: sha256:17b7c4b016aeb1a3294f958b14e8e1893791a141273279d5af17be50efd28826
2022/07/25 12:08:55 my-registry.com:5000/test/muatate/php:7.4-cli-alpine3.15: digest: sha256:c8c6624ff22da6c6df4e56c5a828e957b55732837cdf4a0a11d538eef5630667 size: 1726

Result Image:

$ docker manifest inspect --verbose my-registry.com:5000/test/muatate/php:7.4-cli-alpine3.15 | grep arch
			"architecture": "amd64",

It seems like crane doesn't read all archs; it only reads the one you ask for, or your local host's arch by default:

$ crane config --platform all docker.io/php:7.4-cli-alpine3.15 | jq ".os,.architecture"
"linux"
"amd64"

@imjasonh
Collaborator

I think the issue there is that --platform=all isn't really defined in crane, and it's probably (incorrectly) interpreting that to mean "just use the default linux/amd64". It should fail instead, since all isn't a valid platform string.

Same with crane mutate -- it's interpreting --platform=all to mean "the default", and only labelling that single-arch
linux/amd64 image.

If you want to label every image in a multi-arch manifest, or get all the configs for a multi-arch manifest, you'll have to do some scripting with jq, xargs, etc.; there's no single crane command that will do all of that in one invocation.
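One way to script that, sketched against a manifest list like the one shown earlier in this thread (digests abbreviated; in practice you'd feed `crane manifest` output into jq or a small script like this):

```python
import json

# A manifest list like the one `crane manifest` printed earlier in this thread
# (digests abbreviated for readability).
manifest_list = json.loads("""{
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "schemaVersion": 2,
  "manifests": [
    {"digest": "sha256:39f6...", "platform": {"architecture": "amd64", "os": "linux"}},
    {"digest": "sha256:1036...", "platform": {"architecture": "arm64", "os": "linux"}}
  ]
}""")

# One (platform, digest) pair per entry; each could then drive a separate
# per-platform crane invocation against <image>@<digest>.
targets = [
    (f"{m['platform']['os']}/{m['platform']['architecture']}", m["digest"])
    for m in manifest_list["manifests"]
]
for platform, digest in targets:
    print(platform, digest)
```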

@notSoWiseOldMan

thank you for the quick reply 😄

So when I copy/mutate multiple times with separate --platform params, they just end up overwriting each other. How can I have crane copy multiple architectures of an image?

@imjasonh
Collaborator

imjasonh commented Jul 25, 2022

crane copy by default will copy all the platforms of the image. If you specify --platform, only the matching platform image will be copied.

crane mutate only operates on a single-platform image, so it has to choose one. If you don't specify --platform, it defaults to mutating linux/amd64.

Let me know if there are improvements we can make to the docs or examples to make this clearer.

edit to add: crane mutate --platform=linux/amd64 shouldn't overwrite any result of crane mutate --platform=linux/arm64 for example, unless they're both configured to push back to the same tag. Effectively, this:

crane mutate <multi-platform image>
  --platform=A
  --entrypoint=foo
  --tag=<tag>

says

  1. pull <multi-platform-image>
  2. extract the image matching platform A
  3. set its entrypoint to foo
  4. push that single-platform image back to the registry as <tag>

It doesn't try to do anything like update <multi-platform-image> with the result of having extracted and set the entrypoint for one image inside the original multi-platform image.
