FR: Allow startup with unreachable provisioner #589

Closed
LecrisUT opened this issue May 27, 2021 · 1 comment · Fixed by #1765
Labels
good first issue roadmap An item for roadmap discussion

Comments

@LecrisUT
Contributor

Description

If a provisioner cannot be accessed, e.g. OAuth server is down, allow step-ca to boot up with the remaining functioning provisioners. Probably this is already in the new management revamp but it's worth keeping an issue for this. @dopey could you confirm this?

Use case

This is part of my recent hiccups when bootstrapping a fully integrated server after a long power outage.
The relevant setup for this issue is:

  • keycloak uses certificates from step-ca ACME with caddy automatically updating the certificates.
  • step-ca uses keycloak's https endpoints for its OAuth provisioner. Probably as a workaround we could link to the internal .well-known without https, but this needs to be tested.
  • After a long outage, keycloak's certificate has expired, and step-ca will not boot because it detects that the OAuth provider's TLS certificate has expired. Conversely, caddy cannot reach the ACME endpoint to renew the certificate because step-ca is not running.
@maraino maraino added the needs triage Waiting for discussion / prioritization by team label Jun 2, 2021
@dopey
Contributor

dopey commented Jun 8, 2021

Discussed during a triage meeting and, in short, we agree.

Currently, step-ca caches OIDC well known results at start up and then refreshes them periodically. It should be changed to not request the OIDC details on startup (allowing the CA to load) and then to attempt first load on first use of the provisioner.

Going into our backlog, but if anyone is looking for a way to contribute we'd happily accept a PR. Please reach out if you're interested.

As a workaround for the original issue, you can remove the OIDC provisioner, wait for the keycloak server to get its cert from the ACME provisioner, then add the OIDC provisioner back. Not ideal, but it will get you unstuck.

This will actually need to be fixed in short order once managed provisioners are mainstream, because users will have no way to change provisioners if the CA cannot even start up. (Right now you can just update the JSON, but we're moving away from that.)

@LecrisUT thanks for bringing this to our attention.

@dopey dopey added roadmap An item for roadmap discussion and removed needs triage Waiting for discussion / prioritization by team labels Jun 8, 2021
maraino added a commit that referenced this issue Mar 13, 2024
This commit will mark a provisioner as disabled if it fails to
initialize. The provisioner will be visible, but authorizing a token
with a disabled provisioner will always fail.

Fixes: #589, #1757

3 participants