Support multi architecture docker images using BuildX #3355
Conversation
Codecov Report
@@ Coverage Diff @@
## main #3355 +/- ##
==========================================
+ Coverage 54.46% 54.47% +0.01%
==========================================
Files 1527 1527
Lines 639478 639478
==========================================
+ Hits 348305 348374 +69
+ Misses 234334 234266 -68
+ Partials 56839 56838 -1
A couple of comments, but I'm not yet convinced that this enables the container change we want without cutting off entire development scenarios.
- name: Docker login
  run: |
    container_id=${{env.container_id}}
    docker exec -e AZURE_CLIENT_ID -e AZURE_CLIENT_SECRET -e DOCKER_REGISTRY "$container_id" task docker-login
Now that these secrets are being passed in to the devcontainer, is there any possibility of them being logged or left behind as a part of the published image? If so, this opens up the chance of someone harvesting those secrets.
I don't think so - we already pass in secrets for live validation and az-login in other workflows as well.
Logging we're protected from because GH actually redacts the secret contents in the logs - even if you try to log the secret, it'll be redacted.
As for the rest, I think as Harsh said we should be OK as long as we aren't actually saving the secret into the published container. I think it would be relatively obvious if we were, based on the Dockerfile for the image we're building.
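For illustration only (not part of the workflow): GitHub Actions masks registered secrets in step output before the log is stored, so even an accidental echo doesn't expose the value.

```sh
# Illustration: any registered secret appearing in step output is masked as ***.
echo "client secret is: $AZURE_CLIENT_SECRET"   # logged as: client secret is: ***
```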
See my question above, though, about why we have to do it this way... it seems like the docker login should work?
Or is the issue that the docker context on the host and the docker context in the devcontainer are different, and so we need to make the login happen in the devcontainer?
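If that is the issue, here is a minimal sketch of the difference, assuming the devcontainer runs its own Docker daemon rather than sharing the host's credential store (the variables are the ones already used in the workflow):

```sh
# Sketch: a login on the host only updates the host's credential store,
# so pushes started from inside the devcontainer would still be unauthenticated.
docker login "$DOCKER_REGISTRY"                    # host-side login only
docker exec "$container_id" task docker-login      # login inside the devcontainer, as this PR does
```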
Checking a git blame of this file, it was @Porges who wrote the comment
# note that all creds are on host and never passed into devcontainer
Working from the precept that smart people do things for good reasons, I'm worried that we're missing something here. To be more specific on my earlier concern, I'm not worried about disclosure during CI, I'm worried about the credentials being left lying around inside the container image, available for anyone to extract if they go nosing around inside.
By default, Docker looks for the native binary on each of the platforms, i.e. "osxkeychain" on macOS, "wincred" on windows, and "pass" on Linux. A special case is that on Linux, Docker will fall back to the "secretservice" binary if it cannot find the "pass" binary. If none of these binaries are present, it stores the credentials (i.e. password) in base64 encoding in the config files described above.
Here's the reference to how docker stores the credentials and auth tokens.
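As a rough way to double-check the leftover-credentials concern (the image name and path below are assumptions, not from this PR): if docker login ran without a credential helper, the base64-encoded auth entry would sit in the config file, so inspecting the published image for that file should settle it.

```sh
# Hypothetical check: look for a leftover Docker config in the published image.
# Without a credential helper, `docker login` writes something like
#   {"auths": {"example.azurecr.io": {"auth": "dXNlcjpwYXNzd29yZA=="}}}
# to ~/.docker/config.json for the user that ran it.
docker run --rm --entrypoint cat example.azurecr.io/azureserviceoperator:latest /root/.docker/config.json
```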
- name: Build, tag and push docker image
  run: |
    container_id=${{env.container_id}}
    docker exec -e DOCKER_PUSH_TARGET "$container_id" task controller:docker-push-multiarch
minor: is there a "normal" docker-push target still? If not, it seems like we should just rename this to that? Yes it's multiarch, but it's the only push we have.
We don't have any other docker push target apart from docker-push-local. I'm keeping the local push target for two reasons:
- For local builds we don't need multi-arch images, as they take time to build and occupy space.
- Buildx does not have direct support for pushing to local registries; we need a few extra steps to enable it (a sketch follows this list). Mentioned in Build cannot export to registry on localhost.
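A sketch of those extra steps, assuming a throwaway registry on localhost:5000 (the registry name, port, and tag are illustrative): the buildx builder runs in its own container, so it needs host networking to reach a registry published on the host's loopback interface.

```sh
# Start a local registry, then give the buildx builder host networking so it can push to it.
docker run -d -p 5000:5000 --name local-registry registry:2
docker buildx create --driver-opt network=host --use
docker buildx build --push -t localhost:5000/controller:latest .
```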
We should probably document this on the taskfile targets - why we have two, and why different ones are used in different places.
There are some risks with this too (we're not actually testing the image we're creating), but I see that lots of people have this problem, as documented:
docker/buildx#166
docker/buildx#1152
There does seem to be a way to push a buildx image to a local registry now though, so I am wondering if we should try swapping the kind steps to use the multiarch image?
We can certainly swap the kind steps to use buildx, but I'm not sure the multi-arch images are a good idea there. From what I've seen, buildx takes around 10-15 minutes to build the cross-platform images (on my machine), which would increase the CI run time.
CI shouldn't be building the image every time - it's supposed to be cached. We even have a workflow that runs weekly to keep it current.
So if switching CI to use buildx slows it down, we need to fix that.
I don't think CI caches this image @theunrepentantgeek - it caches the .devcontainer image, not the controller image (being built here). It must build the controller image every time because that's the image that:
- Contains the code changes for the PR in question.
- Runs in kind and is verified by the tests.
My understanding of what buildx does is that it basically runs N docker builds for different architectures and then automatically merges the manifests. So it may be OK to not use it for the local build (see the issues I linked above, where this is discussed and people suggest using buildx with a single version argument to get local working, which is going to be different from what we do for release anyway).
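For reference, a rough sketch of what that amounts to (the tag is an assumption, not the real push target): one buildx invocation builds per-platform images, emulating foreign architectures where needed, and publishes a single manifest list that points at all of them.

```sh
# Multi-arch build: per-platform images plus a combined manifest list, pushed in one step.
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag "$DOCKER_PUSH_TARGET:latest" \
  --push .
```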
I'm tentatively convinced that this doesn't prohibit development and debugging approaches that currently work, so I'm approving. Once the comments are addressed, merge away.
@@ -35,7 +35,7 @@ vars:
   LATEST_VERSION_TAG:
     sh: git describe --tags $(git rev-list --tags=v2* --max-count=1)

-  VERSION_FLAGS: -ldflags "-X {{.PACKAGE}}/internal/version.BuildVersion={{.VERSION}}"
+  VERSION_FLAGS: '"-X {{.PACKAGE}}/internal/version.BuildVersion={{.VERSION}}"'
Why remove -ldflags? It's one of the flags required to set the version in the executable. It doesn't make sense to me to remove it here and then restate it manually everywhere else.
If you need part of this value elsewhere, introduce a different variable for that purpose rather than redefining this one and making things awkward.
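A sketch of that suggestion in shell terms (the variable names here are hypothetical, not from the Taskfile): keep VERSION_FLAGS intact for plain go builds and derive a second value for the image build argument.

```sh
# Keep the full flag for go build; pass only the bare value where the Dockerfile expects it.
BUILD_VERSION_VALUE="-X ${PACKAGE}/internal/version.BuildVersion=${VERSION}"
VERSION_FLAGS="-ldflags \"${BUILD_VERSION_VALUE}\""                       # unchanged, used by go build
docker buildx build --build-arg VERSION_FLAGS="${BUILD_VERSION_VALUE}" .  # image build uses the bare value
```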
- docker buildx create --driver-opt network=host --use
- docker buildx build --push
    --build-arg VERSION_FLAGS={{.VERSION_FLAGS}}
    --build-arg CONTROLLER_APP={{.CONTROLLER_APP}}
minor: Leave a comment here about how we don't use the multi-arch platform options because we don't need them for local testing and they don't work for local registries? (Can link the GH issues I linked earlier.)
Taskfile.yml (outdated)
@@ -372,57 +373,72 @@ tasks:
   # excluding the ./apis directory here
I'm not sure this is true anymore - it's not excluded. I don't know why it should be excluded anyway, since I do think the controller depends on the api folder... so maybe just delete this comment?
Co-authored-by: Matthew Christopher <matthchr@users.noreply.github.com>
Closes #3321
What this PR does / why we need it:
This PR adds support for publishing multi-architecture docker images using buildx.
Changes include: