Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add docker.io to unqualified image name #12321

Closed
wants to merge 1 commit into from

Conversation

mscherer
Copy link
Contributor

Since that's the default behavior of moby engine, the compat
API need to reflect that.

[ NO NEW TESTS NEEDED ]

Signed-off-by: Michael Scherer misc@redhat.com

What this PR does / why we need it:

This PR should fixe #12320

How to verify it

Same as my other PR, trying to get woodpecker running.

Which issue(s) this PR fixes:

Fixes #12320

Special notes for your reviewer:

There is no test added, because adding them would requires to download image from docker.io, but this doesn't seems wise due to the rate limit in place on the registry.

@mscherer
Copy link
Contributor Author

Also, I would prefer having a constant somewhere rather than hardcoding docker.io, but I haven't found one in the existing codebase. I doubt the moby project is going to change that soon, so maybe that's fine.

Since that's the default behavior of moby engine, the compat
API need to reflect that.

Fixes containers#12320

Signed-off-by: Michael Scherer <misc@redhat.com>
if err != nil && shortnames.IsShortName(fromImage) {
fromImage = fmt.Sprintf("%s/%s", "docker.io", fromImage)
_, err = shortnames.Resolve(runtime.SystemContext(), fromImage)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was suggesting something like:

if err != nil && shortnames.IsShortName(fromImage) {
		names, err = shortnames.Resolve(runtime.SystemContext(), fromImage)
                if len(names) == 1 {
                      fromImage=names[0]
               } else {
		       fromImage = fmt.Sprintf("%s/%s", "docker.io", fromImage)
              }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That way if I have a shortname mapping, Podman service will use it, if there is not one then fail over to docker.io.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, my bad; I will fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but if a software ask for ubi8, shortname.Resolve shouldn't fail in the first place, so it wouldn't go in that condition, no ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what you are asking. shortnames.IsShortName("ubi8") should return true.
Then I would want to pull registry.access.redhat.com/ubi8 not docker.io/ubi8.

Copy link
Contributor Author

@mscherer mscherer Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but the code just before (line 257):

_, err := shortnames.Resolve(runtime.SystemContext(), fromImage)

will return nil in err if fromImage is a short name alias, since it can be resolved.
If there is no error, fromImage will not be modified.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your right I missed that
LGTM

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ignoring errors in short-name resolution which I think is something we cannot do. Let's go one step back and first define the problem we want to solve. This looks like a way to big shotgun to me.

@mscherer
Copy link
Contributor Author

Also, it seems github do not refresh comments (or I have a issue with my browser), as I just seen your 2nd comment on the 1st version of the PR, sorry.

@rhatdan
Copy link
Member

rhatdan commented Nov 17, 2021

/approve
@vrothberg @jwhonce @mheon @baude @flouthoc PTAL

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 17, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mscherer, rhatdan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 17, 2021
Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not agree with the change. It is a breaking one that can be avoided.

If users wish to only resolve to docker.io, they are free to change that in /etc/containers/registries.conf. We also need to care about users migrating over from RHEL 7 (atomic) Docker which supports search registries.

@vrothberg
Copy link
Member

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 17, 2021
@vrothberg
Copy link
Member

Sorry for the ping-pong, @mscherer.

@rhatdan
Copy link
Member

rhatdan commented Nov 17, 2021

This is only for the compatibility API, which does not effect RHEL7 Docker.

@vrothberg
Copy link
Member

This is only for the compatibility API, which does not effect RHEL7 Docker.

I am convinced it does affect RHEL7 Docker. Scripts on RHEL 7 are only speaking compat API, so it will impact upgrades.

@rhatdan
Copy link
Member

rhatdan commented Nov 17, 2021

There was no support for docker-compose or docker-py on RHEL7. We don't support them now.

@rhatdan
Copy link
Member

rhatdan commented Nov 17, 2021

Also if we say this is for podman 4.0 then we can have breaking change.

@vrothberg
Copy link
Member

It doesn't change my take on the proposed change. If users want to resolve to docker.io only they should be using registries.conf. This very likely breaks existing deployments. We enforce that in podman machine init already and users can opt-out by ... changing registries.conf :^)

@rhatdan
Copy link
Member

rhatdan commented Nov 17, 2021

This does not fix the issue for docker-py and docker-compose on non MAC.

@vrothberg
Copy link
Member

This does not fix the issue for docker-py and docker-compose on non MAC.

Can we define exactly what the problem is?

@rhatdan
Copy link
Member

rhatdan commented Nov 17, 2021

Docker compatibility means shortnames map to docker.io.

@mscherer
Copy link
Contributor Author

Ok so the current problem is that the go code from moby, used by a few projects (for example, woodpecker, which I am trying to get running on Fedora without using Docker) will convert a call of:

client.ImagePull(_, "docker.io/a6543/test_git_plugin", _ )

to a POST request to

POST /v1.33/images/create?fromImage=a6543%2Ftest_git_plugin&tag=latest

So even if a client use a fully qualified name with the registry, Moby client go code switch to a registry-less name (eg, "familiar name" in the moby API), and is assuming to have a docker daemon that will default to docker.io registry on the other side.

While the proper engineering fix is to convince moby upstream to change that practice, I doubt this will happen (eg, podman origin story).

@vrothberg
Copy link
Member

Docker compatibility means shortnames map to docker.io.

Okay, so we agree on the problem. I stick with configuring registries.conf though. Solving this in the code at this location only is not sufficient to catch all possible code paths to enforce docker.io. The PR also only does it if there's an error in short-name resolution which is on top ignoring the short-name policy (and hence a security risk).

@mscherer
Copy link
Contributor Author

I pondered about the triggering a error on the shortname path, but from a quick look, it seems this requires the attacker to be privileged enough to mess with it on the system. There is 8 errors condition on the shortname code:

  • not having a TTY (which is going to be triggered anyway on a daemon)
  • not selecting the right item in the menu (unapplicable using the API)
  • incorrect shortname mode (so requires root to fiddle with config)
  • having no registry configured, requires to be root and fiddle with the config
  • having no unqualified-search registry, likely requires to be root
  • error on reference.ParseNormalizedNamed
  • error on parseUnnormalizedShortName
  • incorrect short mode configuration

Out of 8, 6 seems non applicable.

There is the issue with ParseNormalizedNamed and parseUnnormalizedShortName. But the only way a attacker would be able to trigger a error there is by asking to parse attacker controlled string (eg, the image name). But then, if the problem is that a attacker can provide a string that create a parse error to later have that string be prefixed by "docker.io", the attacker can simply just prefix docker.io themself and be done with it.

So I understand your point about being security sensitive, but I can't see how it would be used. Currently, the only type of attack would result in changing a image name, and either requires to be root (or a equivalent, eg, enough access to change config), or to control the image name in the 1st place.

As for the uncompleteness of the patch, I agree. However, I do not see on top of my head any others operations that would deal with remote registries in the docker API.

And asking to change registries.conf would be unfortunate, as it would impact the rest of the system. I do not think it would be wise to default to docker.io for everything if we want to activate the compat API.

@vrothberg
Copy link
Member

Out of 8, 6 seems non applicable.

Ignoring an error from as security-sensitive API (and ignoring a global configuration the rest of the entire code base is using) is a recipe for running into problems.

As for the uncompleteness of the patch, I agree. However, I do not see on top of my head any others operations that would deal with remote registries in the docker API.

I guarantee there are dozens of code paths where this may happen. Also consider long-term maintainability. Sticking to a global configuration that all code paths use makes sure we're not regressing or diverging in the future. Hard-coding one location and leaving the rest untouched does not give that guarantee.

And asking to change registries.conf would be unfortunate, as it would impact the rest of the system. I do not think it would be wise to default to docker.io for everything if we want to activate the compat API.

I sympathize with your desire and I find search registries painful for such cases but there is no one-size-fits-all solution. Also as mentioned before, the atomic Docker in RHEL7 and older Fedoras is using these search registries. Existing users may very well depend on the current behavior. Using Podman 4.0 as an "excuse" for a breaking change is not attractive to me, since there is a way to support both scenarios by not changing anything (again registries.conf).

Ignoring registries.conf is not something I am willing to ship (or maintain) for the aforementioned reasons. Note that we discussed this issue many times internally, and we had consensus on the current solution.

@mscherer
Copy link
Contributor Author

The root of the problem is that we can either be compatible with Docker API (which is not secure), or maintain the current behavior (not compatible with Docker API).

So if we drop the compatibility (and I would say that not being compatible with the reference client implementation on a CreateContainer is not a niche operation), where the limitation should be documented ?

@vrothberg
Copy link
Member

The root of the problem is that we can either be compatible with Docker API (which is not secure),

Why is that not secure? Aliases would still work and configuring "docker.io" only is unambiguous.

or maintain the current behavior (not compatible with Docker API).

Yes.

So if we drop the compatibility (and I would say that not being compatible with the reference client implementation on a CreateContainer is not a niche operation),

That really depends. Some distributions set the search registries to "docker.io" only and there the issue doesn't apply.

where the limitation should be documented ?

I am very biased since I spent so much time on this topic. man containers-registries.conf explains the behavior in detail and some podman manpages reference it. Do you have an idea with a "fresh pair of eyes" where to document it best? We blog a lot, would a blog post on that help?

@rhatdan
Copy link
Member

rhatdan commented Nov 17, 2021

You making every user that could hit this, have the issue, and telling them to modify there registries.conf.
I think it is more important that we work with Docker libraries then theoretical former support for the docker api. (Which we never had).
I could see checking if strict controls is on, then don't accept short names. I think checking for docker.io in the registries.conf should also happen. But I want the greatest number of people who use the docker.sock API to work out of the box.

@mscherer
Copy link
Contributor Author

mscherer commented Nov 17, 2021

Why is that not secure? Aliases would still work and configuring "docker.io" only is unambiguous.

I am a bit lost. You said that ignoring the shortname policy (eg, defaulting to choosing docker.io when the policy is to ask users to choose) is a security risk. So the docker policy of defaulting to docker.io is a security risk.

That really depends. Some distributions set the search registries to "docker.io" only and there the issue doesn't apply.

Yeah, but they are not the distros I use. RHEL/Centos do have multiple search registries, Fedora does too. And podman as a project want to promote having multiple registries, so I think having to choose between compatibility with existing tooling, and a long term goal of decentralization is not a great tradeoff to decide

Do you have an idea with a "fresh pair of eyes" where to document it best? We blog a lot, would a blog post on that help?
Well, deciding whether podman should be compatible or not with Docker bugs for bugs is above my pay grade.

As a user of woodpecker, the required config change should be in the error message, because that's what is displayed when i try to use podman (in red).

As a sysadmin, a line in the log that say "insecure compat API use detected, do X if you want to make it work" would surely help, as that's where I would look.

As for others personas, I guess people from support might want to have a KB article on their respective knowledge base (eg, access.rh.c for RH, whatever Suse use for Suse, etc), and I guess the website could be updated, but there is no section on the api compat.

A blog post could help for sure, that's how I found about the docker API compat (since the initial plan for woodpecker was to add a podman backend, but this doubled the size of the vendor directory and added several non go dependecnies).

@vrothberg
Copy link
Member

You making every user that could hit this, have the issue, and telling them to modify there registries.conf.

So far this is the first time a user reports this issue. And we must not ignore backwards compatibility: turning it around, this PR would break all existing deployments depending on the current behavior but without a workaround or solution.

And again, we had consensus on this very behavior, and I am afraid changing that too quickly is risky.

[...] But I want the greatest number of people who use the docker.sock API to work out of the box.

That is also why we enforce docker.io for podman machine init. But we enforce it globally in the VM via registries.conf.

I also want users migrating on the CLI to work out of the box, but they may face the very same issue as well.

The ask here is to have a separate short-name resolution for the compat API. I don't know how to do that since registries.conf works globally and that behavior is scattered across the code base of Podman, Buildah and c/common - all of which are exercised in the compat API as well. Just changing a single location is not enough and changing all is a nightmare to maintain.

@vrothberg
Copy link
Member

I am a bit lost. You said that ignoring the shortname policy (eg, defaulting to choosing docker.io when the policy is to ask users to choose) is a security risk. So the docker policy of defaulting to docker.io is a security risk.

Only ignoring the errors is risky. If there is only one search registry, the resolution will work without a problem and there is no risk at all. So defaulting to docker.io is not an issue but we must not ignore errors.

@mscherer
Copy link
Contributor Author

So far this is the first time a user reports this issue. And we must not ignore backwards compatibility: turning it around, this PR would break all existing deployments depending on the current behavior but without a workaround or solution.

Given that's already my 3rd PR to make the compat API inline with Moby behavior ( #12279 , #12318 ), I suspect the compat API is maybe not used that much. While #12279 is a corner case, #12318 is noticeable, since it would break anytime you try to create a container with a fully qualified name. However, as the issue can be worked around easily by downloading the image manually a first time, I suspect people just used the workaround.

But to go back on fixing the issue, would a patch that replace runtime.SystemContext() with a specific context just for the compat API be ok ?

Eg, have a specific context that force docker.io as a single search registry, and inherit the whole system context for all others settings.

This way, we use that context everywhere in the compat API (and so it get used everywhere below), we do not have to ignore the errors, and it will not requires users to change the behavior of the non compat API, nor to change the config for that.

@vrothberg
Copy link
Member

But to go back on fixing the issue, would a patch that replace runtime.SystemContext() with a specific context just for the compat API be ok ?

I fear not since it is only changing this specific code path while all others would behave differently in the compat API. Even changing everything in c/podman would not be sufficient since we're calling c/buildah and c/common as well.

@mscherer
Copy link
Contributor Author

I fear not since it is only changing this specific code path while all others would behave differently in the compat API.

If we change everywhere in pkg/api/handlers/compat, it will be applied to the whole compat API and that's what I proposed.

Even changing everything in c/podman would not be sufficient since we're calling
c/buildah and c/common as well.

But so, shouldn't buildah and common inherit the context of the request from the podman code ?

@vrothberg
Copy link
Member

Even changing everything in c/podman would not be sufficient since we're calling
c/buildah and c/common as well.

But so, shouldn't buildah and common inherit the context of the request from the podman code ?

That varies depending on the call sites and the APIs. When it comes to search registries, the source of truth is c/image/pkg/sysregistriesv2. But, again, I am not willing to enforce using docker.io in the code. Assuming we did alter the system context in handlers/compat it would require a larger code audit to make sure that it's passed along consistently.

@mscherer
Copy link
Contributor Author

Assuming we did alter the system context in handlers/compat it would require a larger code audit to make sure that it's passed
along consistently.

If it wasn't the case, this would be a bug with or without the change, but would it be detectable ? EG, is there a way to run the test suite that would trigger a bug ?

Cause even if I say "I audited the code", I suspect this wouldn't be sufficient. I guess one way would be to run the test suite and check that registry files are read only once, or something like that (or instrument the code to log a string each time the object is created or use a global var, something like that).

But so, altering the systemContext would be a acceptable fix ?

@vrothberg
Copy link
Member

If it wasn't the case, this would be a bug with or without the change, but would it be detectable ?

Not necessarily a bug but at least an inconsistency.

But so, altering the systemContext would be a acceptable fix ?

Not for me, no. Again, enforcing docker.io in the code is not an acceptable change to me. I am convinced that if users wish to only resolve to docker.io, then they should change registries.conf.

@mheon
Copy link
Member

mheon commented Nov 17, 2021

Do we want to discuss this at the cabal? I think there's a compatibility case to be made for the default prefixing.

@vrothberg
Copy link
Member

Do we want to discuss this at the cabal? I think there's a compatibility case to be made for the default prefixing.

Sure we can

@mscherer
Copy link
Contributor Author

I am convinced that if users wish to only resolve to docker.io, then they should change registries.conf.

Ok, so another proposal, what about disabling the compat API if there is multiple registries ?
Since otherwise, that's not compatible with Moby behavior, and result in weird error with Moby client go library using common operations

People wishing to use podman as a docker replacement would need to be explicit by changing the configuration.

@vrothberg
Copy link
Member

Ok, so another proposal, what about disabling the compat API if there is multiple registries ?

Please consider what I wrote above already. There are users of the compat API relying on that behavior. Breaking backwards compatibility is something we need to consider. I understand that your specific use case doesn't work out-of-box on Fedora but there is an easy workaround by editing registries.conf.

@mscherer
Copy link
Contributor Author

Following @mheon suggestion, and @vrothberg agreement, I added the topic on the agenda on hackmd, and on my calendar as well. Should I also send a email to the list ?

@TomSweeneyRedHat
Copy link
Member

TomSweeneyRedHat commented Nov 17, 2021

@mscherer I've tweaked the description in the md just a bit for a little more context. I sent a note to the list earlier pointing at the md for the agenda. If you wanted to send another, I don't think it would hurt.

@vrothberg
Copy link
Member

It would be good to have @rhatdan and @mtrmac in today's cabal as well. Note that we need to make sure to not turn in circles since time is limited.

Since we had this very conversation multiple times, I would love this to be the last one. Once there is a new consensus, I will write it down somewhere here in the docs so we have something physical to refer to.

@rhatdan
Copy link
Member

rhatdan commented Nov 18, 2021

I will be there.

@mscherer
Copy link
Contributor Author

So, closing this PR since that would not be the right way to solve the bug (and we can discuss on the initial bug ). However, I wonder if the error message could be improved, since right now, the message is geared toward interactive use, and that's not suitable for the API.

@mscherer mscherer closed this Nov 18, 2021
@vrothberg
Copy link
Member

Thanks for pushing this forwad, @mscherer.

Once docker.io is enforced for the compat API, the error message will not happen anymore since there is only registry to resolve to.

@mtrmac
Copy link
Collaborator

mtrmac commented Nov 19, 2021

For the record …

Docker compatibility means shortnames map to docker.io.

docker/docker compatibility means that. projectatomic/docker compatibility means shortnames resolve to the search path. We never broke that, to my knowledge (and we have had ample opportunity to consider breaking it, and did discuss it IIRC, at the time of the quay.io/docker.io/… ambiguities came up, before the alias mechanism and existing short-name-mode option were added). So I don’t think there’s a new reason to re-consider breaking it.


Ok so the current problem is that the go code from moby … will convert a call of:

client.ImagePull(_, "docker.io/a6543/test_git_plugin", _ )

to a POST request to

POST /v1.33/images/create?fromImage=a6543%2Ftest_git_plugin&tag=latest

So even if a client use a fully qualified name with the registry, Moby client go code switch to a registry-less name (eg, "familiar name" in the moby API), and is assuming to have a docker daemon that will default to docker.io registry on the other side.

Yes, this inability to tell whether a client using the upstream docker/docker client code (IIRC as opposed to the projectatomic/docker variant) intended to pull exactly from docker.io or intended to use a short name to trigger search is one of the (numerous) problems with the way projectatomic/docker set up the short name mechanism. It’s unfortunate, but it’s also not something that can be retroactively fixed — and only matters for systems that do have multi-element search path. Docker API clients that need to pull specifically docker.io need to be patched to use the fully-qualified name, to work correctly with projectatomic/docker clients as well as with Podman.

Note that this really matters only for clients that have a multi-element search path.


And the ultimate quoted failure:

Error response from daemon: failed to resolve image name: short-name resolution enforced but cannot prompt without a TTY

Requires both a legacy system with the multi-element search path configured, and a system with configuration fresh enough to have the non-permissive short name mode chosen. Such systems definitely can exist, but the enforcing configuration is there exactly to motivate clients’ transition to being explicit about the intended name.

Users of such systems can, I think, either do just that, and modify clients to submit fully-qualified names, or change the configuration (probably to set up the search path to contain only docker.io, making the enforcing state basically moot). Silently using docker.io in that case is arguably not desirable, and anyway not necessary because a configuration with exactly that effect can be set up by the user without code changes.

All new distributions, and new installations of future (current?) versions of old distributions, should only ship with docker.io in the search path (or, as in https://github.com/containers/image/blob/main/registries.conf , making non-alias short names completely rejected), making this whole issue moot. The systems that need the compatibility with the search API, well, need it. And we can support those systems, using the current code, without affecting the any of the newer ones, so we do.

Does all of this break in some other way I’m missing?

@rhatdan
Copy link
Member

rhatdan commented Nov 19, 2021

@mtrmac We had a discussion on this yesterday. and we came to the conclusion that for the Compatibilty API that we should enable to do what is best for the users. We talked about potentially having a flag in SystemContext to either allow callers to specify Docker.IO Mode, or to specify the unqualifiedRegistrieslist. Then it would be up to engines to decide how they want to handle overriding the systems registries.list.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 22, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Moby client behavior rely on Moby API, and is not compatible with podman short name handling
6 participants