autohttps: Implement auto_https prefer_wildcard option #6146

Merged
merged 4 commits into from
Oct 2, 2024

francislavoie
Member

@francislavoie francislavoie commented Mar 4, 2024

Closes #5447

This implements a new auto_https prefer_wildcard option, which drops automation policies for non-wildcard domains when there's already a wildcard in another policy.

This allows users to flatten their config, and instead of using a pattern like https://caddyserver.com/docs/caddyfile/patterns#wildcard-certificates they can instead do something like:

{
	auto_https prefer_wildcard
}

*.example.com {
	tls {
		dns <provider>
	}
	respond "fallback"
}

foo.example.com {
	respond "foo"
}

This would only produce a single wildcard certificate, and no individual certificate for foo.example.com since it's already covered by the wildcard.
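For comparison, the wildcard-certificates pattern linked above puts every subdomain inside one wildcard site block and routes with host matchers. A rough sketch of the same two sites in that style follows; `<provider>` is a placeholder as above, and the responses are illustrative only:

```
*.example.com {
	tls {
		dns <provider>
	}

	# Route foo.example.com inside the wildcard site
	@foo host foo.example.com
	handle @foo {
		respond "foo"
	}

	# Fallback for subdomains without their own handler
	handle {
		respond "fallback"
	}
}
```

With prefer_wildcard, the intent is that the flat, two-site-block version behaves the same from a certificate standpoint.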

This also allows specifying multiple arguments to auto_https, so you can do the following to set multiple Automatic HTTPS options. Previously only one could be set at a time, which was generally fine because there wasn't actually any use case where setting more than one would be useful:

{
	auto_https prefer_wildcard disable_redirects
}

I've only manually (visually) tested with a few simple use cases. Unfortunately we have a big lack of tests for the Automatic HTTPS logic because it manipulates config at runtime. I probably need help with testing this to make sure it doesn't have weird side effects. Thankfully, this should be safe/backwards compatible as long as users don't enable this option. We could call it experimental for now.

@francislavoie francislavoie added the feature ⚙️ New feature or request label Mar 4, 2024
@francislavoie francislavoie added this to the v2.8.0 milestone Mar 4, 2024
@francislavoie francislavoie requested a review from mholt March 4, 2024 02:32
@francislavoie francislavoie changed the title Allow specifying multiple auto_https options autohttps: Implement auto_https prefer_wildcard option Mar 4, 2024
@mholt
Member

mholt commented Mar 6, 2024

Thanks Francis, this looks appealing. Will wait for one or two people to field-test it.

@abjugard

abjugard commented Apr 6, 2024

This looks amazing and a super useful feature to reduce subdomain discovery through certificate transparency!

I would suggest a small addition in order to minimise this even further.

E.g. to make Caddy always use wildcards.

This could maybe be implemented as another option for this directive, called force_wildcard, which figures out the minimal number of wildcard certs that need to be generated to cover all sites, and then uses your logic from this code to apply those certs to the correct sites.

Say I have two sites set up, for serviceA.domain.tld and serviceB.domain.tld, and use this setting; Caddy would then generate *.domain.tld and use it.

If I have hostX.serviceA.domain.tld, hostY.serviceA.domain.tld, hostX.serviceB.domain.tld, hostY.serviceB.domain.tld, then all of those would be covered by the *.*.domain.tld certificate (does ACME even allow such certificates? 🤔).

Alternatively, add another directive that allows users to specify a list of wildcard certs that will always be managed regardless of if they're used by sites or not, and then your feature here will be able to use those to avoid generating specific certs. A little less "magic" than the force_wildcard idea, but probably much easier to implement.

What do you think about this @francislavoie?

I guess it might be necessary to give Caddy a list of domains we control in order for this to work when using multiple domains, so it doesn't see domain1.tld and domain2.tld and try to generate *.tld... 😂

@JeDaYoshi

JeDaYoshi commented Apr 8, 2024

If I have hostX.serviceA.domain.tld, hostY.serviceA.domain.tld, hostX.serviceB.domain.tld, hostY.serviceB.domain.tld, then all of those would be covered by the *.*.domain.tld certificate (does ACME even allow such certificates? 🤔).

@abjugard *.domain.tld won't cover sub-sub-domains. Caddy would need to generate certs for *.serviceA.domain.tld and *.serviceB.domain.tld. At least Let's Encrypt doesn't allow for *.*.domain.tld, AFAIK.
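Since a wildcard only matches a single label, a config for those hosts would need one wildcard site per second-level service name. A sketch under the assumptions of this PR (prefer_wildcard enabled, `<provider>` a placeholder, responses illustrative):

```
{
	auto_https prefer_wildcard
}

# One wildcard per service, because *.domain.tld does not cover hostX.serviceA.domain.tld
*.serviceA.domain.tld {
	tls {
		dns <provider>
	}
	respond "serviceA fallback"
}

*.serviceB.domain.tld {
	tls {
		dns <provider>
	}
	respond "serviceB fallback"
}

# Each of these is covered by exactly one of the wildcards above
hostX.serviceA.domain.tld {
	respond "hostX on serviceA"
}

hostX.serviceB.domain.tld {
	respond "hostX on serviceB"
}
```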

@abjugard

abjugard commented Apr 8, 2024

If I have hostX.serviceA.domain.tld, hostY.serviceA.domain.tld, hostX.serviceB.domain.tld, hostY.serviceB.domain.tld, then all of those would be covered by the *.*.domain.tld certificate (does ACME even allow such certificates? 🤔).

@abjugard *.domain.tld won't cover sub-sub-domains. Caddy would need to generate certs for *.serviceA.domain.tld and *.serviceB.domain.tld. At least Let's Encrypt doesn't allow for *.*.domain.tld, AFAIK.

Right, unfortunate limitation but understandable.

Then I think we can simplify the behaviour and not need to specify which domains are owned, but just let Caddy automatically figure out to make wildcards for the leftmost label.

Perhaps we should just wait for this PR to get merged then I can take a crack at a force-wildcard PR?

@mholt
Member

mholt commented Apr 8, 2024

I'm actually implementing part of this in CertMagic -- likely will become a "subject transformer" that allows changes to the subject name when managing certs, so, e.g. one can lop off the leftmost label and replace it with * to make it a wildcard. Of course, the Caddy interface will still need to be developed, which this PR could be. But I kind of want to come back to this after CertMagic has its change made, since it coincides with caddyserver/certmagic#280.

@mholt
Member

mholt commented Apr 15, 2024

@francislavoie How would you feel if we/(I?) updated this PR to use the new SubjectTransformer introduced in the linked issue/commit? It should be relatively simple I think, if the global option is set, then set the SubjectTransformer for CertMagic configs to be a 3-line function.

@francislavoie
Member Author

I'm not sure how that would look 🤔 maybe make a branch off this one if you think it's simple?

@mholt
Member

mholt commented Apr 16, 2024

I will try soon! :)

@mholt
Member

mholt commented Apr 16, 2024

Actually the SubjectTransformer might be slightly tangential to this, rather than directly related -- the transformer is for, like, "I have a.b.c and b.b.c, but I want you to manage a single wildcard instead of individual certs for each specific domain." Whereas this change is, "I have *.b.c and a.b.c, and I want a.b.c to use the wildcard cert."

There's another aspect I want to consider as well, that is some users want just specific domains to be served under a wildcard, while the others shouldn't be. For example, in the config above, if there was also a site, bar.example.com, but they wanted that served with its own cert (maybe to try to obscure the fact that other subdomains are being served?) how would they do that?

I think I want to give this more thought, even before I implement anything here.

This is a needed change though and I think it's a good start. I just don't want to commit to its API/syntax/implementation quite yet until we have a better picture of the bigger picture.

@francislavoie
Member Author

There's another aspect I want to consider as well, that is some users want just specific domains to be served under a wildcard, while the others shouldn't be. For example, in the config above, if there was also a site, bar.example.com, but they wanted that served with its own cert (maybe to try to obscure the fact that other subdomains are being served?) how would they do that?

I think that's already handled by my approach, because prefer_wildcard only applies to a subdomain if there's already a wildcard cert in the config being managed (in which case it uses that policy). For other domains if you just don't have a wildcard that covers it, it'll still make a cert for that single domain. If you want to opt-out for just one domain that's covered by a wildcard, then don't use this feature and do it the handle way 🤷‍♂️
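To make that concrete, here is a sketch of a config where the option affects one site but not another. The bar.example.org site is a hypothetical addition for illustration, and the comments describe the behaviour as explained in this thread rather than verified output:

```
{
	auto_https prefer_wildcard
}

*.example.com {
	tls {
		dns <provider>
	}
	respond "fallback"
}

# Covered by *.example.com, so it should reuse the wildcard's policy
foo.example.com {
	respond "foo"
}

# No wildcard in this config covers this name, so it should still get its own certificate
bar.example.org {
	respond "bar"
}
```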

@mholt
Member

mholt commented Apr 24, 2024

Ah, okay. Hmm.

I might still wait on this until after 2.8 so I can give this a little more thought. We now have a way, in CertMagic, of mapping/transforming one subject name (domain name) to another, for the purposes of cert management. Even if this PR ends up being good as-is, I just want a little more time on it.

@mholt mholt modified the milestones: v2.8.0, v2.9.0 Apr 24, 2024
@omltcat

omltcat commented May 3, 2024

Glad there is a PR for this and I really look forward to it. Thank you for your great work. One big advantage of having separate site blocks is when using caddy-docker-proxy labels.

Since these labels can be distributed across multiple docker-compose files, there is some possibility of misconfiguration somewhere (especially when done by multiple people). With the current handle approach, one misconfigured subdomain causes the ENTIRE wildcard site block to be removed, bringing down everything. With this PR, the failure would at least be localized (hopefully😊)

@polarathene

polarathene commented Aug 25, 2024

If you want to opt-out for just one domain that's covered by a wildcard, then don't use this feature and do it the handle way 🤷‍♂️

Could there not just be a tls or similar directive for a more explicit opt-out? (I don't need such functionality myself though)

It already seems to be an issue according to this report where an internal wildcard cert is being used instead of the LetsEncrypt one for an explicit site address.

Alternatively, you could go the other way around like with local_certs / tls internal, and instead have something like tls internal_wildcard or prefer_wildcard in the actual tls directive options? (assuming that could also be used to prefer FQDN as an override too).


Since the certmagic subject transformer feature is available and Caddy 2.8 is released, is there anything that can be done to assist moving this feature forward?

I've only manually (visually) tested with a few simple use cases. Unfortunately we have a big lack of tests for the Automatic HTTPS logic because it manipulates config at runtime.
I probably need help with testing this to make sure it doesn't have weird side effects.

If you can provide a rough outline of what to test, I could put together configs to verify?


One potential bug (without this PR) that already appears to exist is assigning a domain a wildcard cert from external files, and another site block for that domain with a different subdomain implicitly using the wildcard. (EDIT: Nope, that was due to incorrect auto_https mode set)

@mholt
Member

mholt commented Sep 26, 2024

I am now wondering if we should make preferring wildcards the default behavior as @abjugard suggested above.

In Slack, it was expressed that it would be a breaking change, but I am not sure if there are (m)any(?) use cases that would actually break. A wildcard cert is just as good as a subdomain cert.

@polarathene

EDIT: Ignore me. I mistook the "prefer" wildcard cert to prefer provisioning an explicit subdomain with a wildcard cert (even if no wildcard was explicitly configured/requested).


A wildcard cert is just as good as a subdomain cert.

Isn't it generally considered better practice to prefer explicit SAN for certs than a wildcard? Some businesses may have compliance requirements related to that expectation?

That concern is only relevant if the private key was compromised for the attacker to leverage it before expiry/detection, but that may be sufficient time for the attacker? 🤷‍♂️

I don't operate at a scale where I'm paranoid about such personally, but I could understand it being a compliance concern elsewhere, which would make it a breaking change for those users? Thus you may want to introduce as opt-in, and switch to opt-out when a breaking change is acceptable?

Unless you don't consider such implicit assumptions/trust in software to be a breaking change? You could also take the route of adding a notice (to release notes and pinned issue) for awareness before switching to opt-out, giving any potential users time to become aware of the change landing in future?

To be fair though, I think when compliance matters that upgrading to newer releases would have a more strict policy than trusting semantic versioning 😅 So perhaps I'm rambling about a non-issue.

@kanashimia
Contributor

In Slack, it was expressed that it would be a breaking change, but I am not sure if there are (m)any(?) use cases that would actually break. A wildcard cert is just as good as a subdomain cert.

If Caddy fails to obtain cert for *.example.com then foo.example.com won't have any cert too right? That seems problematic, especially because LetsEncrypt won't give you wildcard cert without DNS-01 challenge.
Anyways, default behavior can be changed later.

@abjugard

abjugard commented Sep 27, 2024

If Caddy fails to obtain cert for *.example.com then foo.example.com won't have any cert too right? That seems problematic, especially because LetsEncrypt won't give you wildcard cert without DNS-01 challenge.

The scenario you describe here isn't really a concern, as the requirements for procuring a wildcard cert are strictly different from those for specific subdomains. Caddy (I'm guessing) won't try to get a wildcard cert unless it's able to, and if it's able to then it will either succeed, or fail, in which case you've configured it wrong and it should correctly refuse to continue starting up the site in question.

@mholt
Member

mholt commented Sep 27, 2024

@kanashimia

If Caddy fails to obtain cert for *.example.com then foo.example.com won't have any cert too right?

Wildcard certs require DNS challenges which have to be explicitly configured. If someone is configuring a wildcard into their server, then they probably don't intend to get subdomain certs at all. IMO.
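As a rough illustration of what "explicitly configured" means here: the sketch below assumes a Caddy build that includes the caddy-dns/cloudflare module and an API token in a CLOUDFLARE_API_TOKEN environment variable. Both the provider choice and the variable name are assumptions for illustration; other DNS provider modules are configured in a similar way.

```
*.example.com {
	tls {
		# Enables the DNS challenge, which is needed to obtain the wildcard cert
		dns cloudflare {env.CLOUDFLARE_API_TOKEN}
	}
	respond "fallback"
}
```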

Anyways, default behavior can be changed later.

Well, if we default to always preferring the wildcard cert, that changes this PR. I just want to do what is most expected and intuitive, and I feel like setting up wildcard hostnames is juuust explicit and involved enough that a user probably wants to use that cert for the rest of the subdomains.

Maybe an inverse of this PR would be useful then, that is, to ignore wildcard certs.

@polarathene

If someone is configuring a wildcard into their server, then they probably don't intend to get subdomain certs at all. IMO.

Well there is this report where someone does want some subdomains to not use wildcard.

While the feature is named as prefer wildcard, I don't suppose there would be a way to ensure that only wildcards are provisioned when valid? As in if it fails, to opt-out of a fallback to explicit FQDN?

This is due to a CT public logs concern, which is presently avoided by using a wildcard site address. With this feature, explicit site addresses that could use a wildcard become the convenient option, but that convenience is less trustworthy if wildcard provisioning can fall back to explicit FQDNs.


I've seen many users (people asking for help in the forums) that try to load a wildcard cert into Caddy e.g. a Cloudflare Origin cert, which should not apply to all their subdomains because they might have some subdomains which are not proxied by Cloudflare (e.g. resolve to their private IP address for their LAN, non-public subdomain).

Could we have an example of that? I have seen users mixing public + internal certs for the same domain, and earlier in this PR discussion suggested tls directive option for preference instead of auto_https global might be worth considering.

I've got a public wildcard and server, as well as local services that I can connect to over my LAN with split DNS using that same domain and wildcard cert but resolving to a different IP. That is quite a common practice for self-host enthusiasts.

I'm not familiar with Cloudflare Origin, but assume that's different to a public CA cert like from LetsEncrypt with the bulk of devices will have trust store coverage. I could understand with tls internal wildcard where that would not be the case, if I did want to expose some services with a LetsEncrypt cert for devices to use instead of requiring adding my private CA to their trust stores that'd perhaps be similar to whatever the Cloudflare Origin concern is?


I much rather take the current implementation (which is opt-in, not opt-out), have users try it and find the edgecases over time, and then consider making it the default later on.

I would also suggest this approach 👍

You can always bump to a Caddy 3.0 release at a later point when it makes sense. A new major release doesn't need to be due to a massive rewrite / refactor. Or, if acceptable, switch the default in a future minor release after sufficient notice, as suggested.

I want to get this out there so people can use it ASAP, as an experimental feature.

There's already some features marked as experimental right? Or has been in the past, that's a great approach 🚀

@francislavoie
Member Author

I'm not familiar with Cloudflare Origin, but assume that's different to a public CA cert like from LetsEncrypt with the bulk of devices will have trust store coverage.

It's a cert issued by Cloudflare's non-publicly-trusted CA which only their servers trust, allowing HTTPS between Cloudflare and the upstream server without the upstream needing to have a publicly trusted cert. It was the go-to option before ACME & DNS challenge became more common-place.

Could we have an example of that?

I didn't make note of them, but there was one this week on Discord that I can't really link to.

You can always bump to a Caddy 3.0 release at a later point when it makes sense.

No plan to do a v3 anytime soon. Problem is it would be a breaking change to the entire plugin ecosystem because we'd need to change the Go package to have /v3 in it. Go's module ecosystem kinda tied our hands. Caddy is not semver.

There's already some features marked as experimental right?

Many, yes. Via godoc comments or elsewhere in the docs with ⚠️ Experimental.

@mholt
Member

mholt commented Sep 30, 2024

Ok, I hear you. If we merge this, I would like for it to be an experimental / transitional feature, as we prepare to make using existing wildcards the norm/default behavior. This might be the kind of thing best learned from field experience, so I don't want to lock us into one choice either way.

So, how about this for a plan:

  • We can merge this in, and document it as experimental/temporary, and that it may soon be the default behavior.
  • If that goes well overall, it becomes the default behavior and we remove this option.
  • When it becomes the default behavior, we add a new option to get unique certificates for every explicitly-configured subdomain, even if they are covered by wildcards. Or maybe the list of domains is explicitly specified in config. Either way, there will be an "escape hatch" for the new default behavior for the case(s) that is needed.

Sound good?

@francislavoie
Member Author

Yep! That's what I was trying to get across.

Member

@mholt mholt left a comment

Alright. Sounds like a plan. Thanks for working on this. Let's try it out!

@kanashimia
Contributor

Actually I imagine it should work with acme automation disabled, does this work?:

{
	auto_https prefer_wildcard disable_redirects disable_certs
}

*.example.com {
	tls cert.pem key.pem
	respond "fallback"
}

foo.example.com {
	respond "foo"
}

@francislavoie
Member Author

francislavoie commented Oct 1, 2024

I'm not certain @kanashimia (would appreciate if you'd test it), but I think that's already handled without this feature. The ignore_loaded_certs flag is what turns off that behaviour (causing your non-wildcard one to automate TLS management).
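For reference, a sketch of the same hypothetical config with that flag set. The cert.pem and key.pem paths are the placeholders from the question, and the comments reflect the behaviour described here, not a tested result:

```
{
	auto_https ignore_loaded_certs
}

*.example.com {
	# Manually loaded wildcard certificate
	tls cert.pem key.pem
	respond "fallback"
}

# With ignore_loaded_certs, this site should get its own automated certificate
# instead of being served with the loaded wildcard
foo.example.com {
	respond "foo"
}
```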

@polarathene

I'm not certain @kanashimia (would appreciate if you'd test it), but I think that's already handled without this feature

For reference, that was effectively covered here: #5216 (comment)

With @francislavoie providing the ignore_loaded_certs solution if wanting to prevent the other site-block from using the loaded wildcard.

From the perspective of a user with Caddyfile, the difference between external vs Caddy provisioned (internal / acme) certs with such behaviour is probably a bit unexpected. prefer_wildcard I think makes that behaviour consistent, so that's another benefit to eventually making it default?


I've not tried the suggested Caddyfile, but assume prefer_wildcard / auto_https has no relevance to the wildcard site block due to tls cert.pem key.pem, while the other site block with explicit site-address should still behave like the linked issue example demonstrates (using the external wildcard).

Perhaps relevant for an eventual opt-out option once prefer_wildcard becomes the default. While ignore_loaded_certs is valid, I am not sure why, prior to this PR, an explicit domain would use a wildcard from an external cert when present but not a provisioned internal/acme one 🤔

@mholt mholt modified the milestones: v2.9.0-beta.1, v2.9.0-beta.2 Oct 2, 2024
@mholt mholt merged commit 1672484 into master Oct 2, 2024
33 checks passed
@mholt mholt deleted the prefer-wildcard branch October 2, 2024 13:32
@mholt mholt modified the milestones: v2.9.0-beta.2, v2.9.0-beta.1 Oct 2, 2024
@mholt
Member

mholt commented Oct 2, 2024

This will go out with 2.9 beta 1. Would appreciate some field testing!

@coandco

coandco commented Oct 16, 2024

I think I'm missing something about this. When I use a single wildcard domain, everything works as expected and it gets a cert. If I try to define multiple wildcard domains, it fails with no solvers available for remaining challenges (configured=[http-01 tls-alpn-01] offered=[dns-01] remaining=[dns-01]) and when I look at the resulting caddy JSON from http://localhost:2019/config/ the DNS solver info didn't make it in. I would expect defining multiple wildcard domains to result in multiple wildcard certs which the other domains can use, not for all of them to fail. I've attached the Caddyfiles and resulting JSON configs to this comment.

multiple.Caddyfile.txt
multiple.json
single.Caddyfile.txt
single.json

@francislavoie
Member Author

Interesting, thanks @coandco, looks like the Caddyfile adapter part is consolidating automation policies too aggressively. I'll try to make a fix.

@francislavoie
Member Author

Fixed in #6636 @coandco, thanks for the test cases!

richid added a commit to richid/homelab-v2 that referenced this pull request Nov 7, 2024
The previous config would end up requesting a TLS certificate for each
individual subdomain and not use the wildcard certificate. This change
modifies the labels used on the containers to create host matchers and
handlers to do the routing under a single wildcard Caddyfile site. This
is a little trickier and more verbose while defining the labels but ends
up with a much cleaner Caddyfile[1][2] and only requires a single certificate.

Hopefully this will all be moot once the auto_https prefer_wildcard
option is released in `2.9.x`.

1. https://caddyserver.com/docs/caddyfile/patterns#wildcard-certificates
2. https://caddy.community/t/docker-proxy-wildcard-subdomains/22170
3. caddyserver/caddy#6146
Labels
feature ⚙️ New feature or request
Development

Successfully merging this pull request may close these issues.

Auto HTTPS changes for better wildcard cert config ergonomics
8 participants