Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable subdomain matching in registries.conf #1191

Merged
merged 3 commits into from
Apr 9, 2021

Conversation

lsm5
Copy link
Member

@lsm5 lsm5 commented Mar 28, 2021

This commit allows the prefix field in registries.conf to be of the
form: `prefix = "*.example.com" for wildcard subdomain matching.

refMatchesPrefix now returns the length of the prefix if there's a match
and the prefix doesn't contain *.. If prefix contains *. and there's
a match, then refMatchesPrefix returns the length of the refString
without the image. This change removes the need for
any additional string comparison in rewriteReference.

Co-authored-by: Valentin Rothberg rothberg@redhat.com
Signed-off-by: Lokesh Mandvekar lsm5@fedoraproject.org

/cc @ashcrow

@lsm5 lsm5 force-pushed the subdomain-matching-registries-conf branch from f1678dd to c22483e Compare March 28, 2021 21:40
@lsm5 lsm5 requested a review from vrothberg March 28, 2021 21:50
@lsm5 lsm5 force-pushed the subdomain-matching-registries-conf branch 2 times, most recently from ac12ae2 to 301262f Compare March 29, 2021 01:22
Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

I think we need to increase the test coverage. Most importantly, we need to exercise the loading of registries and add a couple of registries in pkg/sysregistriesv2/testdata and exercise a) the loading (positive and errors) and b) that things like FindRegistry(...) will find the expected Registry.

Please add more verbose docs (code, man pages, commit message). There is a number of important things to document for future generations and readers:

  • Why do we need a new field and don't extend prefix?
  • How does it impact the behavior exactly? (i.e., it extends the prefix with subdomains but is a new field to remain backwards compatible)
  • Dedicate an example for subdomain matching in the man page to illustrate how it can be used and when it may be useful?

Also, I think that prefix and subdomainprefix must conflict. If both are configured, we need to reject the config. Overwriting or silently ignoring can cause confusion and unexpected behavior.

pkg/sysregistriesv2/system_registries_v2.go Outdated Show resolved Hide resolved
pkg/sysregistriesv2/system_registries_v2.go Outdated Show resolved Hide resolved
pkg/sysregistriesv2/system_registries_v2.go Outdated Show resolved Hide resolved
pkg/sysregistriesv2/system_registries_v2.go Outdated Show resolved Hide resolved
pkg/sysregistriesv2/system_registries_v2.go Outdated Show resolved Hide resolved
pkg/sysregistriesv2/system_registries_v2.go Outdated Show resolved Hide resolved
pkg/sysregistriesv2/system_registries_v2_test.go Outdated Show resolved Hide resolved
@vrothberg
Copy link
Member

We need to extend func (c *parsedConfig) updateWithConfigurationFrom(updates *parsedConfig). Currently, when loading registries from registries.conf.d we will overwrite/merge the configs via the prefix. Now, with subdomainprefix and prefix being mutual exclusive we need to account for that in updateWithConfigurationFrom(..); otherwise the merging would fail.

The key in the registryMap must now also look at subdomain prefix. We also need to test the merging (see earlier /testdata comment).

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Mostly drive-by)

Why do we need a new field and don't extend prefix?

Yes. *.foo.bar didn’t work as a prefix value before, so defining new semantics for those strings alone could work fine.

OTOH, two other major concern that might impact the above:

  • What about the rewriteReference functionality? (It might well make sense not to support that for wildcards, I don’t know, but we should be explicit about that. And maybe, even if that feature is not provided by this PR, consider that it could be provided in the future.)
  • A lot of code in this package assumes r.Prefix is always set (notably things like updateWithConfigurationFrom are completely broken now, AFAICS). Much worse, we have GetRegistries just return the full data and any current callers that presume to understand the full format are going to be broken. What to do about that? Do we just return the wildcard matches, or filter them out? Either way, every single caller of that function (and every user of the configuration returned by TryUpdatingCache, if any) in the ecosystem needs to be reviewed, and ideally GetRegistries should be deprecated entirely and replaced by something with more limited/precise semantics (IIRC it’s only used for podman info, where we could instead provide a RegistryConfigurationHumanReadableSummary or something like that instead of raw data, but I haven’t looked into that).

pkg/sysregistriesv2/system_registries_v2.go Outdated Show resolved Hide resolved
@vrothberg
Copy link
Member

OTOH, two other major concern that might impact the above:

* What about the `rewriteReference` functionality? (It might well make sense not to support that for wildcards, I don’t know, but we should be explicit about that. And maybe, even if that feature is not provided by this PR, consider that it could be provided in the future.)

Very good point. We need to go step-by-step in this PR but need to eventually address it before we can merge. Pretty much, we need to look at all uses of prefix and check how we can tackle it. Let's take rewriteReference(ref reference.Named, prefix string) as an example. I think we should do as follows:

  • The prefix argument can be in both formats.
  • The rewriting needs to account for the new wildcard syntax and be able to correctly replace the matching subdomain prefix with the specified location.
* A lot of code in this package assumes `r.Prefix` is always set (notably things like `updateWithConfigurationFrom` are completely broken now, AFAICS). 

Agreed (see #1191 (comment)).

Much worse, we have GetRegistries just return the full data and any current callers that presume to understand the full format are going to be broken. What to do about that?

Crap. So far, I was focusing on the part that older clients can continue loading the registries.conf without barking. But now, the configs would be misinterpreted.

[[registry]]
location="foo.com"
subdomainprefix="*.foo.com"

Older clients would set:

Registry {
Prefix: "foo.com",
Location: "foo.com",
SubdomainPrefix: "",
}

That changes my mind. I think that we should extend the syntax of prefix instead. I prefer old tools to break than to misinterpret the data. It's then up to users to update Podman, CRI-O etc.

@umohnani8 @rhatdan PTAL

@mtrmac
Copy link
Collaborator

mtrmac commented Mar 30, 2021

* What about the `rewriteReference` functionality? …

… Pretty much, we need to look at all uses of prefix and check how we can tackle it. Let's take rewriteReference(ref reference.Named, prefix string) as an example. I think we should do as follows:

  • The prefix argument can be in both formats.
  • The rewriting needs to account for the new wildcard syntax and be able to correctly replace the matching subdomain prefix with the specified location.

I don’t think we need to implement that now at all, and not even really specify; just demonstrate that there is a potential path forward (there probably is one, at least allow rewrites from *.foo to $.bar, or maybe from *.foo to mirror.example/$ , with $ standing for some future substitution).

That changes my mind. I think that we should extend the syntax of prefix instead.

My quick first impression is the same.

@vrothberg
Copy link
Member

* What about the `rewriteReference` functionality? …

… Pretty much, we need to look at all uses of prefix and check how we can tackle it. Let's take rewriteReference(ref reference.Named, prefix string) as an example. I think we should do as follows:

  • The prefix argument can be in both formats.
  • The rewriting needs to account for the new wildcard syntax and be able to correctly replace the matching subdomain prefix with the specified location.

I don’t think we need to implement that now at all, and not even really specify; just demonstrate that there is a potential path forward (there probably is one, at least allow rewrites from *.foo to $.bar, or maybe from *.foo to mirror.example/$ , with $ standing for some future substitution).

I am surprised, maybe I am missing something. I expected the rewriting to be especially important for pulling. A podman pull sub.domain.org/image:latest with subdomainprefix="*.domain.org" and location="registry.com" should be pulling registry.com/image:latest.

Would you limit it to blocked=true?

That changes my mind. I think that we should extend the syntax of prefix instead.

My quick first impression is the same.

@mtrmac
Copy link
Collaborator

mtrmac commented Mar 30, 2021

  • What about the rewriteReference functionality? …

I don’t think we need to implement that now at all, and not even really specify

I am surprised, maybe I am missing something. I expected the rewriting to be especially important for pulling.

Oh, that’s quite possible. I didn’t take the time to read the original request(s) I’m afraid.

A podman pull sub.domain.org/image:latest with subdomainprefix="*.domain.org" and location="registry.com" should be pulling registry.com/image:latest.

Ignoring the wildcard match? That’s plausible, and allows future extensions.


One thing that does need to happen now, if no rewriting were implemented: we must refuse configurations that request it (e.g. by specifying different (subdomain)prefix and location).

@lsm5 lsm5 force-pushed the subdomain-matching-registries-conf branch from 301262f to f45d2fb Compare March 30, 2021 12:12
@lsm5
Copy link
Member Author

lsm5 commented Mar 30, 2021

Just to confirm, I'll rework this to instead have the prefix field understand *.example.com, and get rid of the subdomainprefix field altogether. @mtrmac @vrothberg please correct me if I'm wrong.

@lsm5 lsm5 force-pushed the subdomain-matching-registries-conf branch 4 times, most recently from 6490bbf to d4b3ee0 Compare March 30, 2021 20:05
@lsm5
Copy link
Member Author

lsm5 commented Mar 30, 2021

Just to confirm, I'll rework this to instead have the prefix field understand *.example.com, and get rid of the subdomainprefix field altogether. @mtrmac @vrothberg please correct me if I'm wrong.

I think I have addressed this now. I need to add some sample registry configs in testdata and check that the other comments have been addressed.

@lsm5 lsm5 force-pushed the subdomain-matching-registries-conf branch 2 times, most recently from f531261 to ab9c75c Compare March 31, 2021 01:48
Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Just a very quick skim, I didn’t re-read the whole package to see the full picture.) Looks reasonable at a first glance.

Please make sure rewriteReference has appropriate test cases, at least (as well as maybe other non-trivial cases suggested by earlier discussions).


A reminder:

every single caller of [GetRegistries] (and every user of the configuration returned by TryUpdatingCache, if any) in the ecosystem needs to be reviewed, and ideally GetRegistries should be deprecated entirely and replaced by something with more limited/precise semantics

newNamedRef := strings.Replace(refString, prefix, e.Location, 1)
if prefix[0:2] == "*." {
if strings.Contains(refString, "/") {
firstSlash := strings.Index(refString, "/")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, assuming refMatchesPrefix is true, wouldn’t it work to just split on first slash, and replace the domain with e.Location without any further processing?


Conceptually it would probably be cleanest for refMatchesPrefix to return the length of the matched prefix; then we wouldn’t even need to do a redundant comparison implied in strings.Replace and we could just splice strings — and more importantly it would naturally extend to a future version where refMatchesPrefix also returns the content matched by * so that it could be substituted.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually it would probably be cleanest for refMatchesPrefix to return the length of the matched prefix;

Especially because we would have exactly one implementation of the matching semantics, with no copies, alternatives, or shortcuts.

pkg/sysregistriesv2/system_registries_v2.go Outdated Show resolved Hide resolved
pkg/sysregistriesv2/system_registries_v2.go Outdated Show resolved Hide resolved
@vrothberg
Copy link
Member

[...] we have GetRegistries just return the full data and any current callers that presume to understand the full format are going to be broken. What to do about that? Do we just return the wildcard matches, or filter them out? Either way, every single caller of that function (and every user of the configuration returned by TryUpdatingCache, if any) in the ecosystem needs to be reviewed, and ideally GetRegistries should be deprecated entirely and replaced by something with more limited/precise semantics (IIRC it’s only used for podman info, where we could instead provide a RegistryConfigurationHumanReadableSummary or something like that instead of raw data, but I haven’t looked into that).

Can you elaborate on what would be broken? Users who relied on the previous syntax of Registry.Prefix?

@mtrmac
Copy link
Collaborator

mtrmac commented Mar 31, 2021

Can you elaborate on what would be broken? Users who relied on the previous syntax of Registry.Prefix?

Yes (possibly, that’s why it needs review).

And to avoid this full ecosystem review and the ~unavoidable API breakage from happening again, we should stop giving out raw data; rather, tend to encapsulate and provide operations aimed at specific use cases (we can always add more operations). Compare the evolution of ConfigPathConfigDirPathConfigurationSourceDescription.

(Well, strictly speaking, even FindRegistry could be affected because a caller could do FindRegistry().Prefix; that one is just less likely. And of course encapsulation only gets us so far, any API we choose bakes in some assumptions.)

@mtrmac
Copy link
Collaborator

mtrmac commented Mar 31, 2021

Yes (possibly, that’s why it needs review).

A quick skim seems to suggest that it’s mostly fine, or even just unused code that could be deleted. And then Podman’s “info” endpoint seems to provide, untyped, the full sysregistriesv2.Registry object, if I’m not reading that wrong… I do hope I’m reading that wrong.

@vrothberg
Copy link
Member

Can you elaborate on what would be broken? Users who relied on the previous syntax of Registry.Prefix?

Yes (possibly, that’s why it needs review).

Thanks! We should double check but I think "our" immediate consumers (Podman, CRI-O, Buildah, Skopeo, etc.) don't use the Prefix.

And to avoid this full ecosystem review and the ~unavoidable API breakage from happening again, we should stop giving out raw data; rather, tend to encapsulate and provide operations aimed at specific use cases (we can always add more operations). Compare the evolution of ConfigPathConfigDirPathConfigurationSourceDescription.

I concur. Lessons learned from extending registries.conf is that configuration data and API types should be clearly separated (separation of concerns to a degree). Parsing should go into an internal package to prevent leaking configuration data to external consumers.

Maybe something for sysregistriesv3 😈

(Well, strictly speaking, even FindRegistry could be affected because a caller could do FindRegistry().Prefix; that one is just less likely. And of course encapsulation only gets us so far, any API we choose bakes in some assumptions.)

At least in the API we did not give any explicit guarantees on the syntax/format of Registry.Prefix. It could still be a problem if users relied on that.

A quick skim seems to suggest that it’s mostly fine, or even just unused code that could be deleted. And then Podman’s “info” endpoint seems to provide, untyped, the full sysregistriesv2.Registry object, if I’m not reading that wrong… I do hope I’m reading that wrong.

Yes, that's the case. The reasoning for that was to improve issue and bug analyses. It can be hugely beneficial to see the fully parsed set of registries rather than asking for all possible file contents (especially with the drop in files).

@mtrmac
Copy link
Collaborator

mtrmac commented Mar 31, 2021

The need makes sense.

That approach is already incomplete with aliases, as well as perhaps the coming credential helper list.

Designing a “well-encapsulated” API for that kind of use is going to be… interesting. Just a EffectiveConfiguration() []byte? EffectiveConfiguration() V2RegistriesConf, both with some kind of warning that the actual contents can be changed or get new semantics at any time? Either way the current Podman’s info can’t handle aliases and the like.

@lsm5
Copy link
Member Author

lsm5 commented Mar 31, 2021

Thanks, @mtrmac @vrothberg . I'm working on these for now:

  • green CI
  • tests / testdata
  • remove code duplication
  • update refMatchesPrefix return type
  • documentation

I'll keep pushing commits and let you know once it's ready for further review.

This commit allows the prefix field in registries.conf to be in the
format: `prefix = "*.example.com" for wildcard subdomain matching.

refMatchesPrefix has been renamed to refMatchingPrefix. refMatchingPrefix
now returns the length of the prefix if there's a match
and the prefix doesn't contain `*.`. If prefix contains `*.` and there's
a match, then refMatchingPrefix returns the length of the refString
without the image. This change removes the need for
any additional string comparison in `rewriteReference`.

Co-authored-by: Valentin Rothberg <rothberg@redhat.com>
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
@lsm5 lsm5 force-pushed the subdomain-matching-registries-conf branch from 0ebd968 to da93a53 Compare April 9, 2021 14:29
Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for today, with a follow-up for FIXMEs.

lsm5 added 2 commits April 9, 2021 10:43
* Enable subdomain matching in registries.conf
* registries.conf: configure credential helpers
* Bump github.com/containers/storage from 1.28.0 to 1.28.1
* Bump github.com/klauspost/compress from 1.11.12 to 1.11.13
* Bump github.com/vbauerster/mpb/v6 from 6.0.2 to 6.0.3
* storage destination: improve doc comments re: concurrency
* Spelling
* Bump github.com/containers/storage from 1.27.0 to 1.28.0
* manifest: GuessMIMEType: support OCI artifacts
* copy: early commit of storage blobs
* fix docker.GetDigest docker.makeRequestToResolvedURL docker.getExternalBlob socket leak
* fix typo in docs/containers-registries.conf.d.5.md
* fix GetBlob socket leak Closes containers#1177
* platform matcher: use compat matrix
* Cirrus: Migrate from Travis CI
* add option to download foreign layers contents
* Bump github.com/klauspost/compress from 1.11.9 to 1.11.12
* storageImageSource.LayerInfosForCopy(): return uncompressed MediaTypes
* storage: GetBlob: write to a local tempfile
* Bump github.com/klauspost/compress from 1.11.7 to 1.11.9
* Bump github.com/ulikunitz/xz from 0.5.9 to 0.5.10
* Bump github.com/containers/storage from 1.26.0 to 1.27.0
* shortnames: correctly determine cache directory for rootless
* README: godoc: point to v5
* Check destination image exists copy optimization
* copy: compute blob compression on reused blobs based on source MediaType
* copy: provide compression info about copied blobs
* shortnames: resolution error: point to manpage if config is absent
* bump to v5.11.0-dev
* short-name-aliases.conf: use cache folders instead of $HOME
* update progress-bar lib mpb to v6
* Bump github.com/containers/storage from 1.24.5 to 1.25.0
* Bump github.com/containers/ocicrypt from 1.0.3 to 1.1.0
* Update containers-registries.conf.5.md
* Fix writing signatures for updated multi-arch image components
* Beautify copyBlobStream a bit
* move to v5.10.2-dev

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lsm5 lsm5 merged commit 1490d2c into containers:master Apr 9, 2021
@lsm5 lsm5 deleted the subdomain-matching-registries-conf branch April 13, 2021 12:44
lsm5 added a commit to lsm5/containers-image that referenced this pull request Aug 30, 2021
lsm5 added a commit to lsm5/containers-image that referenced this pull request Aug 30, 2021
lsm5 added a commit to lsm5/containers-image that referenced this pull request Aug 30, 2021
lsm5 added a commit to lsm5/containers-image that referenced this pull request Aug 30, 2021
lsm5 added a commit to lsm5/containers-image that referenced this pull request Aug 30, 2021
lsm5 added a commit to lsm5/containers-image that referenced this pull request Aug 30, 2021
lsm5 added a commit to lsm5/containers-image that referenced this pull request Aug 30, 2021
lsm5 added a commit to lsm5/containers-image that referenced this pull request Aug 30, 2021
lsm5 added a commit to lsm5/containers-image that referenced this pull request Sep 1, 2021
Previously an unset Location was only allowed for wildcarded Prefixes.
This commit will allow any valid Prefix.

Fixes: containers#1191 (comment)

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
lsm5 added a commit to lsm5/containers-image that referenced this pull request Sep 1, 2021
Fixes: containers#1191 (comment)

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
lsm5 added a commit to lsm5/containers-image that referenced this pull request Sep 1, 2021
Previously an unset Location was only allowed for wildcarded Prefixes.
This commit will allow any valid Prefix.

Fixes: containers#1191 (comment)

Co-authored-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
lsm5 added a commit to lsm5/containers-image that referenced this pull request Sep 1, 2021
Previously an unset Location was only allowed for wildcarded Prefixes.
This commit will allow any valid Prefix.

Fixes: containers#1191 (comment)

Co-authored-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
lsm5 added a commit to lsm5/containers-image that referenced this pull request Sep 1, 2021
Previously an unset Location was only allowed for wildcarded Prefixes.
This commit will allow any valid Prefix.

Remove wildcard prefix check for empty location in rewriteReference.

Fixes:
1. containers#1191 (comment)
2. containers#1191 (comment)

Co-authored-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
lsm5 added a commit to lsm5/containers-image that referenced this pull request Sep 16, 2021
Previously an unset Location was only allowed for wildcarded Prefixes.
This commit will allow configs with emtpy location or empty prefix, so
long as the other is valid.

Remove wildcard prefix check for empty location in rewriteReference.

Fixes:
1. containers#1191 (comment)
2. containers#1191 (comment)

Co-authored-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
lsm5 added a commit to lsm5/containers-image that referenced this pull request Sep 20, 2021
Previously an unset Location was only allowed for wildcarded Prefixes.
This commit will allow configs with emtpy location or empty prefix, so
long as the other is valid.

Remove wildcard prefix check for empty location in rewriteReference.

Fixes:
1. containers#1191 (comment)
2. containers#1191 (comment)

Co-authored-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
lsm5 added a commit to lsm5/containers-image that referenced this pull request Sep 23, 2021
Fixes: containers#1191 (comment)

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
lsm5 added a commit to lsm5/containers-image that referenced this pull request Sep 23, 2021
Empty location is now allowed with a valid prefix, and we check for
these in postProcessRegistries already.

This check doesn't need to exist.

Fixes: containers#1191 (comment)

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
lsm5 added a commit to lsm5/containers-image that referenced this pull request Sep 23, 2021
Empty location is now allowed with a valid prefix, and we check for
these in postProcessRegistries already.

This check doesn't need to exist.

Fixes: containers#1191 (comment)

Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
lsm5 added a commit to lsm5/containers-image that referenced this pull request Oct 4, 2021
Previously an unset Location was only allowed for wildcarded Prefixes.
This commit will allow configs with emtpy location or empty prefix, so
long as the other is valid.

Remove wildcard prefix check for empty location in rewriteReference.

Fixes:
1. containers#1191 (comment)
2. containers#1191 (comment)

Co-authored-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
lsm5 added a commit to lsm5/containers-image that referenced this pull request Oct 5, 2021
Previously an unset Location was only allowed for wildcarded Prefixes.
This commit will allow configs with emtpy location or empty prefix, so
long as the other is valid.

Remove wildcard prefix check for empty location in rewriteReference.

Fixes:
1. containers#1191 (comment)
2. containers#1191 (comment)

Co-authored-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants