Clear Architecture field in platform constraint for arm architectures #34021
3cbd838
to
444fdca
Compare
I wonder if we could append all known ARM variants to |
Ping @stevvooe |
@aaronlehmann based on the discussion in the SwarmKit issue, I don't think we should be doing that. This will eventually be fixed for images. |
Wouldn't this PR cause ARM images to be scheduled on x86 and other platforms? I suppose it's better than not working at all, but I'm wondering if we can do better. |
@aaronlehmann yes, that will happen. I think until the manifest is fixed, it is better to err on the side of imposing fewer constraints. |
This fix needs to happen server-side:
Yes, but right now, we have no way of resolving which version of arm an image belongs to. |
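To make the trade-off above concrete, here is a minimal sketch of why clearing the architecture lets an ARM image land on any node. The `Platform` type and `matches` function are hypothetical stand-ins, not the SwarmKit scheduler; the sketch assumes an empty constraint field is treated as "match anything", and the `x86_64` node string is an assumed example value.

```go
package main

import "fmt"

// Platform is a hypothetical stand-in for the architecture/OS pair
// discussed in this thread; it is not the real swarm API type.
type Platform struct {
	Architecture string
	OS           string
}

// matches reports whether a node satisfies an image's platform constraint.
// Assumption of this sketch: an empty field in the constraint means "any".
func matches(constraint, node Platform) bool {
	archOK := constraint.Architecture == "" || constraint.Architecture == node.Architecture
	osOK := constraint.OS == "" || constraint.OS == node.OS
	return archOK && osOK
}

func main() {
	armNode := Platform{Architecture: "armv7l", OS: "linux"} // uname-style string reported by the node
	x86Node := Platform{Architecture: "x86_64", OS: "linux"} // assumed example value

	strict := Platform{Architecture: "arm", OS: "linux"} // what the image config says
	relaxed := Platform{Architecture: "", OS: "linux"}   // architecture cleared

	fmt.Println(matches(strict, armNode))  // false: "arm" != "armv7l"
	fmt.Println(matches(relaxed, armNode)) // true
	fmt.Println(matches(relaxed, x86Node)) // true: the over-permissive case raised above
}
```

The last line is the cost being accepted here: with the architecture cleared, only the OS still narrows placement.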
444fdca
to
5fa6df3
Compare
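It is strange that I'm just learning this, but a weird mismatch that we actually query the Linux UTS subsystem for the daemon's architecture setting, but all image handling is done with GOOS and GOARCH. It was confusing to me that an image was trying to match on something that Go doesn't even acknowledge (armv7l) which to me is more the root of the problem than the well understood issue with ARM variants. Seems we need to at some point resolve matching to all use GOOS/GOARCH (with the added variant processing that is already available in OCI and also available when building a manifest list) so that we don't have some resolver of a bunch of random distro uname choices back to something "understood". |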
I agree with all of the above; we should fix the current situation, because it's a regression from 17.03;
This is what 17.03 did, so not a regression. Mixed architecture Swarm clusters are probably rare, and if someone currently uses a mixed architecture (or operating system) cluster, they probably use constraints already to influence scheduling. In the reporter's case moby/swarmkit#2294, things worked fine in the old situation; the service used an image specific to the architecture, and could just be deployed. |
@estesp what Go chooses to do does not correspond to the real world
requirements of users necessarily either. This will likely diverge more in
future too, so I don't think it is a great idea just to base things on Go.
…On 11 Jul 2017 22:12, "Phil Estes" ***@***.***> wrote:
It is strange that I'm just learning this, but a weird mismatch that we
actually query the Linux UTS subsystem for the daemon's architecture
setting, but all image handling is done with GOOS and GOARCH. It was
confusing to me that an image was trying to match on something that Go
doesn't even acknowledge (armv7l) which to me is more the root of the
problem than the well understood issue with ARM variants.
Seems we need to at some point resolve matching to all use GOOS/GOARCH
(with the added variant processing that is already available in OCI and
also available when building a manifest list) so that we don't have some
resolver of a bunch of random distro uname choices back to something
"understood".
|
@justincormack true, the connection to Go is not to take their view of OS/architecture as gospel, but realize that as Go programs, we have a few options for determining our current runtime architecture and OS; either ask the Go runtime, or try to divine it from Linux syscalls/environment data. FWIW, IMHO, YMMV :) .. Go is a much more stable source of this information than Linux as distros have shown to be quite open to variance and finding their own way when it comes to embedded architectures and To make this more concrete, there is an issue (I think it was over in the OCI spec repo) that lists about 60 different variants (strings) of arch-os tuples on ARM platforms running Linux. Go provides us 2: "arm" and "arm64". That leaves us having to find another method (using the proposed variant field in manifest list/OCI indexes) when building images for v6, v7, v8 specific ARM families, but at least it means we aren't trying to harmonize 60 different strings down to "arm" or "arm64" + a variant list of 3 items. Of course we will have to find a way to query the runtime "variant" when arch = {arm,arm64}, which IMO is the only missing piece, but that only applies to manifest list/index-based image resolving, not single manifest images which only provide us the two pieces of info: OS and arch. If I haven't rambled on too long (don't answer that), another way to look at this bug is that you will never have a stable comparison between runtime.GOARCH and |
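A rough sketch of the "harmonize many uname strings down to arch + variant" idea described above. The `normalize` function and its mapping table are hypothetical and deliberately incomplete; they only illustrate the shape of such a resolver, next to the value the Go runtime reports via `runtime.GOARCH`.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// normalize maps a uname-style machine string to a Go-style architecture
// plus a variant. The table is illustrative only (an assumption of this
// sketch) and covers a handful of the many strings seen in the wild.
func normalize(machine string) (arch, variant string) {
	m := strings.ToLower(machine)
	switch {
	case m == "aarch64" || m == "arm64":
		return "arm64", "v8"
	case strings.HasPrefix(m, "armv7"):
		return "arm", "v7"
	case strings.HasPrefix(m, "armv6"):
		return "arm", "v6"
	case m == "x86_64" || m == "amd64":
		return "amd64", ""
	}
	// Unknown strings are passed through unchanged with no variant.
	return m, ""
}

func main() {
	// What the Go runtime reports for the binary being executed.
	fmt.Println("runtime:", runtime.GOOS, runtime.GOARCH)

	// What a resolver would have to do with distro-reported strings.
	for _, machine := range []string{"armv7l", "armv6l", "aarch64", "x86_64"} {
		arch, variant := normalize(machine)
		fmt.Printf("%s -> arch=%q variant=%q\n", machine, arch, variant)
	}
}
```

Comparing `runtime.GOARCH` against the raw uname string directly, without a step like `normalize`, is the unstable comparison this comment is pointing at.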
@justincormack UTS doesn't exactly help here. Right now, it returns Go's approach is at least pragmatic: they try to define a real machine target. If you can identify a standard approach that actually works, then, by all means, suggest one. Laying this on the footsteps of Go's toolchain is fairly dismissive. |
I also tested In addition there are no error messages on service inspect/ls/ps. |
@alexellis fyi the code changes in this PR was not included in the 17.07.0-ce-rc1 release because it has not been merged yet. Curious to know if you ran your test with a custom compile of the 17.07.0-ce-rc1 codebase with this PR patch applied? |
I think Stefan and myself tried the same install script from Eli @ Docker Inc via the #arm Slack channel-
All services from docker-compose.armhf.yml are pinned at 0/1 replicas.
DEBU[0116] no suitable node available for task module=node node.id=a82troq1z5gmjwnpdlw76k5v0 task.id=dcxqoaohw872432weilc9voc8
However with 17.05 all come out at 1/1 replicas.
|
@alexellis what does docker service inspect -f '{{json .Spec.TaskTemplate.Placement}}' <service-id> show for those services? |
|
@thaJeztah ... it's still broken in the following version
|
@alexellis nothing has changed since #34021 (comment) "fyi the code changes in this PR was not included in the 17.07.0-ce-rc1 release because it has not been merged yet. Curious to know if you ran your test with a custom compile of the 17.07.0-ce-rc1 codebase with this PR patch applied?" Have you tested it? Does it fix the issue? |
I interpreted the comment to mean "tell me what build you're using?" and I replied. Do you have a "custom compile" of Docker CE available for testing? |
I can confirm the fix works. I deployed a full application (FaaS). I tested it along with rallying some of the community to help with the ARM testing and opened additional issues in the docker-arm repo with the tag community-supports-docker. |
Concur with @alexellis above - have similarly tested the fix today - commented with details on alexellis/docker-arm#17 |
Is this PR a suitable stepping stone in the interim? |
@@ -123,12 +123,19 @@ func (s *distributionRouter) getDistributionInfo(ctx context.Context, w http.Res
if err == nil {
err := json.Unmarshal(configJSON, &platform)
if err == nil && (platform.OS != "" || platform.Architecture != "") {
if platform.Architecture == "arm" || platform.Architecture == "Arm" {
I was looking at this approach and doing it on the server side will create future problems for this endpoint. Clearing the architecture should be done on the client side, when setting up the constraint. That way, we can fix this in the future and still be able to support this version of docker.
LGTM @nishanttotla Thanks! |
client/service_create.go
Outdated
@@ -98,6 +98,11 @@ func imageDigestAndPlatforms(ctx context.Context, cli DistributionAPIClient, ima
if len(distributionInspect.Platforms) > 0 {
platforms = make([]swarm.Platform, 0, len(distributionInspect.Platforms))
for _, p := range distributionInspect.Platforms {
// clear architecture field for arm. This is a temporary fix.
arch := p.Architecture
if arch == "arm" || arch == "Arm" { |
if strings.ToLower(arch) == "arm"
Also should this be HasPrefix?
In the images I've seen, it's only been "arm". We could use HasPrefix to be extra careful but I don't think it's necessary.
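For reference, a self-contained sketch of the client-side clearing being reviewed here, using the case-insensitive comparison suggested above. `Platform` and `clearARM` are hypothetical names for illustration, not the actual `client/service_create.go` code, and the exact-match choice (rather than `HasPrefix`) follows the reply that only "arm" has been observed in images.

```go
package main

import (
	"fmt"
	"strings"
)

// Platform is a hypothetical stand-in for swarm.Platform.
type Platform struct {
	Architecture string
	OS           string
}

// clearARM returns a copy of platforms in which the Architecture field is
// emptied for entries whose architecture is "arm" in any letter case.
// Other entries are copied unchanged. This mirrors the temporary fix
// discussed in the review; it is a sketch, not the merged code.
func clearARM(platforms []Platform) []Platform {
	out := make([]Platform, 0, len(platforms))
	for _, p := range platforms {
		if strings.ToLower(p.Architecture) == "arm" {
			p.Architecture = "" // p is a copy, so the input slice is untouched
		}
		out = append(out, p)
	}
	return out
}

func main() {
	in := []Platform{
		{Architecture: "Arm", OS: "linux"},
		{Architecture: "amd64", OS: "linux"},
	}
	fmt.Printf("%+v\n", clearARM(in))
}
```

Switching the condition to `strings.HasPrefix(strings.ToLower(p.Architecture), "arm")` would also clear "arm64", which is broader than what the thread describes as necessary.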
client/service_create.go
Outdated
@@ -98,6 +98,11 @@ func imageDigestAndPlatforms(ctx context.Context, cli DistributionAPIClient, ima
if len(distributionInspect.Platforms) > 0 {
platforms = make([]swarm.Platform, 0, len(distributionInspect.Platforms))
for _, p := range distributionInspect.Platforms {
// clear architecture field for arm. This is a temporary fix.
Can you add more detail here as to why it's being cleared?
Signed-off-by: Nishant Totla <nishanttotla@gmail.com>
f172404
to
772af60
Compare
@cpuguy83 addressed your comments. |
Lint errors are being dealt with here: #34706 |
This commit will also need vendoring into the |
LGTM
Is this one ready to go or are there any other types of verification that are needed before merging? |
LGTM 🐮
This is a potential fix for moby/swarmkit#2294
This is a temporary requirement.
Also related: opencontainers/image-spec#661