-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Root CA getting generated even if error is returned #18667
Comments
@nsimons I think this is desired behavior, within the bounds of Vault today. A warning is meant to indicate there is more to do, or something might not have been done quite as intended. An error indicates something went wrong, and IMO, may not necessarily be fatal. In our case, Vault today lacks storage transactions, making it nearly impossible to fully and cleanly revert a partial operation. Here, an error occurred during CRL building. This isn't fatal -- the issuer was still generated and is discoverable under the APIs -- but indicates that more-up-to-date CRLs will be unavailable until this is addressed and rectified. This could lead to operational issues if not addressed, as if clients attempt to fetch more up-to-date CRLs, none may be available. |
Right, yeah it's a classical problem. Would it then make sense that the client always tries to delete the issuer and start over if it encounters an error? Because the client cannot really know whether an error is something that it can ignore (e.g. CRL will be rebuilt later) or something that it should act on (CA generated, but e.g. serial number was not stored (so revoke not possible?)). We can't really inspect error messages since those can change between Vault versions. |
I don't think that's sufficient in all cases. To take your other issue (#18583 - sorry :-), if you were to In general, as much as we can, we do attempt to keep key substrings of error messages stable, but I do agree in general you can't rely on it... What about a usage-based testing approach? You can list issuers+keys before/after, and attempt to issue a new (short-lived) cert under it, attempt to revoke it, rotate the CRL (and optionally, run a tidy with a short safety buffer once the temporary leaf has expired). If you don't see the issuer, you know it really is a fatal error somewhere (perhaps bad parameters?), if you're able to use it but just not able to revoke (and so could flag the mount to a human operator), or if it all works fine. (The assumption here is that you need automated creation of Root CAs, which seems a little interesting as a use case but could be an interesting problem in general :-) |
No worries, the background info are the same across the two issues ;), basically we have hundreds of PKI backends and sometimes the storage has a hiccup. So for us re-creating the issuer and trying again could help I believe (we only have one issuer per PKI mount). Usage-based testing is a good idea and that's what we are now trying to do. We check the issuers endpoint in order to determine whether the CA was created or not. So far it works, but I feel it's not that future proof. If CRL (re)building is not fatal, would it make sense to handle the error case differently?
Issue a warning and continue ( |
Describe the bug
We came across a scenario where generating root CA failed (error was returned) but the CA was actually generated successfully.
Upon inspecting the code, the culprit is here. The CA is generated and persisted but there are other actions that follow. If any of those actions fail, an error is returned to the caller, which suggests that the CA generation failed. But actually it succeeded. For example, in our case the function errors out when trying to build the CRL: b.crlBuilder.rebuild().
This can lead to some nasty side effects
To Reproduce
Easiest way is to modify the code to always return an error from b.crlBuilder.rebuild().
Then simply try to generate a root CA by following the tutorial: Generate Root CA.
Even if an error is returned the CA is visible e.g. through the /issuers endpoint.
Expected behavior
An error should not be returned if CA generation is actually successful.
Alternatively, CA should not be persisted if an error is being returned.
Additional context
Happened on 1.11.3, but also reproducible on current main branch.
The text was updated successfully, but these errors were encountered: