-
Notifications
You must be signed in to change notification settings - Fork 481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
build: rework node resolution #1966
Conversation
2798b7a
to
450ad3e
Compare
450ad3e
to
093226c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit out of scope but it would be nice to check if supported platforms reported by BuildKit are emulated or not and if so, native ones should take priority.
093226c
to
b23a510
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made some tests on my side and matching is way smarter than before 🎉
As said in #1966 (review) we could check in follow-up if supported platforms reported by BuildKit are emulated or not and if so, native one should take priority.
I was thinking we would need BuildKit changes to return the native platform as a new worker opt but it doesn't seem necessary as it's always the first item of the list of supported platforms https://github.com/moby/buildkit/blob/687091bb6c8aaa0185cdc570c4db3db533f329d0/util/archutil/detect.go#L23.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One problem with this is that (imho) the platforms.Only
implementation for arm64 is wrong in containerd. It assumes that all arm64
mean compatibility with arm
but that is not the case in practice, eg. no Apple chip is 32bit compatible.
We should at least make sure that if there is both arm
node and arm64
node then arm
builds will not match with arm64
.
build/resolver.go
Outdated
} | ||
} | ||
|
||
func (m *archMatcher) Match(p specs.Platform) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would building on same arch for a different OS be faster than any other node?
Eg. if you are building for darwin/arm64
and have nodes linux/amd64
, linux/arm64
then it should not go to linux/arm64
but to the first node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The theory of this is not about faster speed, but about output artifacts.
OS of a system can be a bit of a lie sometimes, e.g. WSL, or Docker Desktop, etc. However, in those cases, while not perfect, it would be better to match to the architecture, instead of defaulting to just the first.
In your scenario, this could happen with a remote builder connected to Docker Desktop - if I build and --load
without a --platform
flag, then we'll just select the first builder, which might be the non-native architecture linux/amd64
. We should prefer to load the linux/arm64
image to the container store instead.
It's not perfect by any measure, but I think it's a bit more reasonable than just picking the first. This way, we don't require that the user reconfigures the order of their builders (I'm not sure what the UX of that would even be like) to get the desired default behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(🤔 I thought I already left a comment for this, either I forgot to submit or it was somewhere else)
The feature you are describing here is determining the implicit platform (if none is given by user) based on DOCKER_CONTEXT/-o context=
value if --load
is used. If we want this then it should be specific to the --load
behavior and look at the context value. Without --load
it should be the native platform of the first node.
which might be the non-native architecture linux/amd64.
This is only the case when there is mismatch of configurations between builder and load target. Otherwise the first is always native. I don't understand how ignoring the OS allows determining the load target.
OS of a system can be a bit of a lie sometimes, e.g. WSL, or Docker Desktop, etc
These are server-side platforms. I don't understand where the difference could come from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this for now, I'll open a follow-up PR to do this at some point.
b23a510
to
a41875c
Compare
This patch reworks and updates the node resolution logic for selecting a node from a builder. The new implementation reworks the logic to make use of containerd's platforms.Matcher interface instead of manually associated strings, and additionally provides a few behavioural changes over the original implementation, namely platforms can be matched with non-strict semantics. e.g. i386 builds can be scheduled on an amd64 node, arm/v6 builds can be scheduled on an arm/v7 node. We also add a new collection of tests for tracking regressions and making the intended behaviour clearer. Signed-off-by: Justin Chadwell <me@jedevc.com>
a41875c
to
47eb405
Compare
@crazy-max @tonistiigi does this make sense to take for v0.12? |
I would like this one in 0.12 as well. WDYT @tonistiigi? |
😬 , I forgot to rerun CI before merging (all was green and no conflicts), and this broke everything so I had to revert. Please reopen. |
This patch reworks and updates the node resolution logic for selecting a node from a builder.
The new implementation reworks the logic to make use of containerd's
platforms.MatchComparer
interface instead of manually associated strings, and additionally provides a few behavioural changes over the original implementation:i386
builds can be scheduled on anamd64
node,arm/v7
builds can be scheduled on anarm/v8
node.We also add a new collection of tests for tracking regressions and making the intended behaviour clearer.