-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
caddytls: fix permission requirement with AutomationPolicy #6328
Conversation
Certificate automation has permission modules that are designed to prevent inappropriate issuance of unbounded or wildcard certificates. When an explicit cert manager is used, no additional permission should be necessary. For example, this should be a valid caddyfile: https:// { tls { get_certificate tailscale } respond OK } This is accomplished when provisioning an AutomationPolicy by tracking whether there were explicit managers configured directly on the policy (in the ManagersRaw field). Only when a number of potentially unsafe conditions are present AND no explicit cert managers are configured is an error returned. The problem arises from the fact that ctx.LoadModule deletes the raw bytes after loading in order to save memory. The first time an AutomationPolicy is provisioned, the ManagersRaw field is populated, and everything is fine. An AutomationPolicy with no subjects is treated as a special "catch-all" policy. App.createAutomationPolicies ensures that this catch-all policy has an ACME issuer, and then calls its Provision method again because it may have changed. This second time Provision is called, ManagesRaw is no longer populated, and the permission check fails because it appears as though the policy has no explicit managers. Address this by storing a new boolean on AutomationPolicy recording whether it had explicit cert managers configured on it. Also fix an inverted boolean check on this value when setting failClosed. Updates caddyserver#6060 Updates caddyserver#6229 Updates caddyserver#6327 Signed-off-by: Will Norris <will@tailscale.com>
Oh wow, that's super subtle and tricky. Nice find. I wonder if we should be resolving the underlying problem here though, i.e. it being Provision'd twice. That really should only happen one time. I'm not sure if we have a good way of doing that though. Maybe the catch-all one would need to be a totally fresh instance instead of re-provisioning? Maybe too complex. But this workaround is probably sufficient for now. Thanks! |
@@ -262,9 +265,9 @@ func (ap *AutomationPolicy) Provision(tlsApp *TLS) error { | |||
// prevent issuance from Issuers (when Managers don't provide a certificate) if there's no | |||
// permission module configured | |||
noProtections := ap.isWildcardOrDefault() && !ap.onlyInternalIssuer() && (tlsApp.Automation == nil || tlsApp.Automation.OnDemand == nil || tlsApp.Automation.OnDemand.permission == nil) | |||
failClosed := noProtections && hadExplicitManagers // don't allow on-demand issuance (other than implicit managers) if no managers have been explicitly configured | |||
failClosed := noProtections && !ap.hadExplicitManagers // don't allow on-demand issuance (other than implicit managers) if no managers have been explicitly configured |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this change based on the inline comment, and it seems to make sense. However, now looking at it again, this is a noop since it is the same check that results in the returned error directly below. So either the inline comment here and/or I am a little confused, or we can just remove this failClosed
var entirely. Admittedly, this is my first time digging into this section of code, so I may just be misunderstanding the intent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably need more testing here 😅
Yeah, I had that thought as well. I wasn't sure if it's generally best practice to avoid provisioning a module multiple times. Though for what it's worth, this particular logic regarding re-provisioning the catch-all automation policy is from 2020, so it's pretty well established. |
Ahh okay, so yeah... we do reprovision APs at times because we need to update the underlying CertMagic configs. Let's give this a try... not really sure a cleaner/simpler way to do it right now, but would be interested with revisiting it if we find one more elegant :) |
Certificate automation has permission modules that are designed to prevent inappropriate issuance of unbounded or wildcard certificates. When an explicit cert manager is used, no additional permission should be necessary. For example, this should be a valid caddyfile:
This is accomplished when provisioning an AutomationPolicy by tracking whether there were explicit managers configured directly on the policy (in the ManagersRaw field). Only when a number of potentially unsafe conditions are present AND no explicit cert managers are configured is an error returned.
The problem arises from the fact that ctx.LoadModule deletes the raw bytes after loading in order to save memory. The first time an AutomationPolicy is provisioned, the ManagersRaw field is populated, and everything is fine.
An AutomationPolicy with no subjects is treated as a special "catch-all" policy. App.createAutomationPolicies ensures that this catch-all policy has an ACME issuer, and then calls its Provision method again because it may have changed. This second time Provision is called, ManagesRaw is no longer populated, and the permission check fails because it appears as though the policy has no explicit managers.
Address this by storing a new boolean on AutomationPolicy recording whether it had explicit cert managers configured on it.
Also fix an inverted boolean check on this value when setting failClosed.
Updates #6060
Updates #6229
Updates #6327