Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ACME (Let's Encrypt) support! #7058

Closed
wants to merge 1 commit into from

Conversation

vancluever
Copy link
Contributor

@vancluever vancluever commented Jun 7, 2016

UPDATE: This is almost complete! I'm just putting together the last bit of tasks here so that I have an easy place to go through them:

  • Mark various fields as sensitive (generally having to do with private keys, including respective fields in tls_private_key, tls_cert_request, tls_cert_request and tls_locally_signed_cert in addition to ACME resources)
  • Write tests for functions in acme_structure.go for more coverage granularity
  • Write acceptance test covering http_challenge_port and tls_challenge_port
  • Write acceptance test for DNS provider with config
  • Manual "real world" acceptance test from built binary
  • Rebase on master
  • Squash and remove WIP

This WIP is a pretty much functional implementation of @apparentlymart's proposal from #3599.

As per the original proposal, there are two resources:

acme_registration // Manages registrations
acme_certificate  // Manages certificates

Further to that, thanks to the wonderful lego library, we are able to handle challenges as well - HTTP, TLS, and DNS. I've added a couple of functions to support this that I will be putting in a PR for over there (current fork is here).

There are some things that need work (ECDSA support for certificates, marking some fields as sensitive, cleanup, testing subject/SANs for certs, and docs), but it's far along enough to share. Feedback welcome!

One last thing - not too sure if delete for a registration is 100% implemented yet, I haven't scoured too much yet to get a full answer, but adding in some delete functionality for the reg can be done after the fact, for now it's just being dropped on the floor.

@apparentlymart
Copy link
Contributor

Thanks for working on this, @vancluever!

It looks like ACME moved on a bit since I originally wrote up my proposal. In particular, I was expecting that the acme_certificate resource would take a signing request from the tls_cert_request resource rather than taking all the same arguments itself, but it looks like that's not how the protocol ended up working... or is that just being hidden from us by the client library?

That lego library looks pretty awesome, though I am a little nervous about bundling so much functionality into a single resource... normally we've built things out of smaller parts, but admittedly I didn't come up with any reasonable way to do that in my original proposal. In particular, it feels a little bit weird for the acme_certificate resource to make Route53 records behind the AWS provider's back, and have its own separate AWS configuration.

However I certainly don't want to disuade you from implementing this; it seems like a pretty pragmatic solution that allows us to outsource a bunch of the complexity I was originally expecting.

@vancluever
Copy link
Contributor Author

vancluever commented Jun 7, 2016

@apparentlymart thanks! :) Some replies on this:

  • For the DNS challenge stuff - it's not 100% obvious because I haven't documented it yet, and the example uses environment, but the config block in the dns_challenge block should allow the ability to set the correct AWS config that the operator intends when they are setting up the provider (ie: they can use the same variables they use in whatever provider instance they wanted the resource to land in). lego supports a large amount of DNS providers as well, so we have a lot more options than just AWS too, even some that TF does not fully support yet.
  • The certificate request process is indeed hidden right now by lego's API, but I'm thinking that's something I could fix easy enough - we could take CSR's out of band, or just add the fields to the certificate resource and work on passing those in to lego via a pkix.Name structure (or even just as a CSR as they already take a key). I'll investigate that after I clean up the rest of the code.
  • I think having the entire cert lifecycle rolled into a single resource is, for the most part, OK, as:
    • Authorizations (and ultimately challenges) usually serve to authorize a single DNS name
    • CSR's mainly serve as subject data for a single certificate

So in essence, they are all part of the same lifecycle (auth -> request -> certificate/renewal -> revoke).

Having lego did indeed make this process a lot easier. :)

PS: Trying to vendor the lego deps right now, govendor is giving me grief... will update ASAP.

@vancluever
Copy link
Contributor Author

So some interesting things while I've been testing the CSR support: looks like boulder strips everything out but the common names from a certificate's subject.

The support is in, we might as well keep it, but definitely it's probably mainly going to be worth it from an external CSR standpoint, rather than if you want more info in the subject than the common name.

@apparentlymart
Copy link
Contributor

That's a shame... I was hoping that by using tls_cert_request as our one and only CSR-generation code we'd ensure that we have consistent features/interface across all different mechanism for generating certificates, but if it's gonna quietly ignore the rest of the stuff in there anyway then that seems more confusing than helpful...

@vancluever
Copy link
Contributor Author

Yeah, I think there is a good use case for supporting both methods - having the CSR support at the very least allows both channels to be available in the case you are generating certificates via TF for another provider or process, or you need to use a CSR from an external source, or you can just use common_name and subject_alternative_names in the event that you aren't.

Also, maybe it's possible at some point in time they will support it, you never know (but also since it's domain validated they don't necessarily want to give the impression that you are getting any level of trust higher than that...)

@vancluever
Copy link
Contributor Author

Everyone, this is now complete.

The commit has a pretty good summary of the changes (quoted below). Ready for review and looking forward to getting it merged!

A note on the testing: most of the acceptance tests use DNS validation with Route 53. Due to the nature of HTTP and TLS challenges, the HTTP challenge is useable, but requires that ACME_HTTP_R53_ZONE_ID and ACME_HTTP_HOST_IP are set so that the test knows what to set the domain name's DNS to. The test skips if these are not defined. Otherwise the same environment that is required to run the AWS acceptance tests should run these ones just fine.

This commit adds the ACME provider, enabling support for automated
domain-validated certificate authorities such as Let's Encrypt
(https://letsencrypt.org/).

There are 2 resources:

 * acme_registration - This resource manages registrations with an ACME
   CA, which in the resource is a private key/email contact combination.
 * acme_certificate - This resource manages certificates, taking domains
   directly via the resource or through a pre-generated CSR. HTTP, TLS,
   and DNS challenges are supported, the latter through an assortment of
   providers.

There is no explicit provider configuration for this provider, to
address the chicken-and-egg relationship between registrations and
certificates. Neither resource has a hard dependency on the other and
acme_certificate can use a registration that is not managed by
Terraform.

Consult the ACME provider documentation at
website/source/docs/providers/acme for full details, or at
https://www.terraform.io/docs/providers/acme/index.html once the merged
documentation is online.

Also, as part of this commit, we flag several private key-related fields
in the resources in the TLS provider as sensitive, so that they can be
used within the resource should the need be there, ensuring that their
field data does not get leaked in logs.

@vancluever vancluever changed the title [WIP] ACME (Let's Encrypt) support! ACME (Let's Encrypt) support! Jun 15, 2016
This commit adds the ACME provider, enabling support for automated
domain-validated certificate authorities such as Let's Encrypt
(https://letsencrypt.org/).

There are 2 resources:

 * acme_registration - This resource manages registrations with an ACME
   CA, which in the resource is a private key/email contact combination.
 * acme_certificate - This resource manages certificates, taking domains
   directly via the resource or through a pre-generated CSR. HTTP, TLS,
   and DNS challenges are supported, the latter through an assortment of
   providers.

There is no explicit provider configuration for this provider, to
address the chicken-and-egg relationship between registrations and
certificates. Neither resource has a hard dependency on the other and
acme_certificate can use a registration that is not managed by
Terraform.

Consult the ACME provider documentation at
website/source/docs/providers/acme for full details, or at
https://www.terraform.io/docs/providers/acme/index.html once the merged
documentation is online.

Also, as part of this commit, we flag several private key-related fields
in the resources in the TLS provider as sensitive, so that they can be
used within the resource should the need be there, ensuring that their
field data does not get leaked in logs.
@vancluever
Copy link
Contributor Author

Hey guys!

I haven't heard much on this one for the last little bit, but just thought I'd give this one a refresh today while I was doing a rebase of all my outstanding PRs. The main things that have been updated:

  • Updated lego to the most recent upstream now that all my PRs have been merged over there
  • Added the ovh DNS challenge provider (for https://www.ovh.com, from most recent lego release)
  • Updated the changes to the tls_cert_request to reflect its move to being a data source

Again, any feedback to help get this merged is welcome!

@apparentlymart
Copy link
Contributor

@vancluever thanks for all the work on this!

I'm still feeling a little uneasy about using all of these challenge implementations from lego... it feels awkward to have the ACME provider implicitly be a client for a number of other things that all have their own providers, and do so much in a single action.

I'm sorry that I'm being a bit difficult here, but how would you feel about a cut-down version of this to start, that just assumes a verification is already performed and takes a CSR and obtains a cert? Then we could get that in quickly while spending a little more time thinking about whether this is the best way to handle the verification portion.

Also it'd be easier to review this if you could split the vendoring of lego into a separate commit from the provider implementation, so we can more easily look just at the diff of the latter. (Right now the diff is too big for github's PR UI to render, so the page is performing super slowly for me and even then not showing the whole diff.)

@vancluever
Copy link
Contributor Author

vancluever commented Jul 11, 2016

Hey @apparentlymart,

OK, will do. Would you be open to reviewing two things actually?

  • A revised acme_certificate resource with everything but the CSR bits stripped out - this will force the workflow into using tls_cert_request
  • An acme_authorization resource where the authorization and challenge functionality will be moved to.

I really do believe that having the challenge resource as an end-to-end thing is the best route, after some thought I gave this on the weekend:

  • Having an external challenge handler would relegate an authorization resource's functionality to little more than a null resource, as it would only be able to handle a small portion of the actual process of handling an authorization - considering the handler would need to be in a local-exec provisioner, possibly not even half of the process. On top of that, the way that such a resource would need to handle credentials possibly does not make the situation any better than it currently is - at least this way we can guide people on how to manage credentials when using the DNS challenge handlers.
  • Moving the DNS challenge handlers to the providers they work with be much worse, would be a complete mis-assignment of code duties and would probably lead to duplication and more of a crossing of provider boundaries than currently exists - I don't think the thought is even worth entertaining.

Since lego does not currently handle these things separately, there will need to be some de-coupling of it as well that I'll need to work on before this is complete (EDIT: might just have to export a few un-exported functions) - let me know your thoughts in the meantime!

@vancluever vancluever force-pushed the paybyphone_acme branch 2 times, most recently from 45b5c38 to 51a6fa9 Compare July 11, 2016 21:08
@apparentlymart
Copy link
Contributor

@vancluever in my original conception of the problem, the authorization was something I expected would probably be handled entirely out-of-band of Terraform. A nice aspect of the ACME protocol is that the authorization process and the request process are completely decoupled, so Terraform doesn't need to know anything about the prior authorizations in order to request a cert... it can just assume that something else did the authorizations earlier, and get an error back from the server if not.

In this model, I would expect that acme_certificate would just be a small wrapper around ObtainCertificateForCSR. (I do like the idea of having this resource just accept a PEM-encoded CSR, which the user would then produce using the tls_cert_request data source.)

I notice there's currently no subcommand of the lego CLI to just do the authorization, and leave the cert to be issued separately. I wonder if we could convince the lego author to add a command to just authorize, and then we could recommend people to use this as a one-off manual setup step as part of getting ACME set up in Terraform, and then afterwards Terraform would be responsible for obtaining and renewing the certificates on an ongoing basis. (Even if such a command isn't added, one could potentially issue a certificate once using the lego CLI and then throw that cert away and let Terraform request a new one for it to manage.)

With all of that said, I want you to know that I would love to have ACME work end-to-end in Terraform in principle, and I'm very grateful for your work on this... I just have some reservations about the unintended consequences of having the ACME provider essentially going behind the back of various other providers and creating or modifying resources they can't see, which may cause problems for those providers. My preference is to leave this particular resource out for the moment (accepting that this makes Terraform's ACME support incomplete) and then see if further improvements to Terraform's core capabilities (such as conditional configuration as discussed in #1604) could allow ACME authorization resources to be represented within Terraform's configuration language itself, rather than as hidden side-effects of a single authorization resource.

(As a specific example of what I mean, if I have Terraform managing my DNS with aws_route53_zone and aws_route53_record resources and acme_authorization creates an additional record in there, now Terraform wouldn't be able to destroy the zone successfully because there will be an extra record in the zone that it isn't managing.)

@vancluever
Copy link
Contributor Author

vancluever commented Jul 11, 2016

@apparentlymart keeping authorization and certificate requests separate might not be the way ACME is going - check the latest draft out (I just found this out as I was working on de-coupling the two resources:

It almost looks like, just from looking here, that new-authz is going away, hence there will be no way to manage the lifecycle of an authorization in an exclusively CRUD-oriented fashion. Certificates will be requested via a new-app endpoint, and it will be up to the CA to derive the authorizations needed and send those and the challenges to the client. (Mind you, who knows when it will be adopted or changed in the future - I'd imagine that LE won't necessarily just drop their version and switch once ACME is out of draft, so this could be moot).

With all of that said, I understand the concerns - what I will do for now is break off the authorization resource from this PR and move it to a secondary branch so that the code is there and useable, should we need it (and also available for my own use).

In regards to the other items:

  • De-coupling was easy (all the steps that ObtainCertificate* does look like they were easily exportable, but I will leave final word for after I get the new work complete), so we can use RequestCertificateForCsr and RenewCertificate without issue,
  • I'm only modifying API, leaving granularity in the lego CLI to upstream if they so want to do it,
  • Not too sure if it changes anything, but lego deletes authorization records after it's finished with them. The authorization process happens every time ObtainCertificate* is called. This would mitigate the instances where conflicts happen when resources exist out of state.

@vancluever
Copy link
Contributor Author

OK, so scratch my last point about lego being easily de-coupleable. The getChallenges and solveChallenges functions in particularly also do creation, which has kind of made me question the sustainability of the library as a fully CRUD-able tool.

It's a great piece of code, but I think if we want more granularity in this (especially if we want to do things like manage authorizations and send challenges to different resources, and be able to refresh them afterwards to get status) I think we are going to have to look elsewhere (or maybe make our own). I'm not necessarily too sure when I'll have time to fully tackle this, but when I do, I also want to take into account things like #1604 and #4149. I'm going to leave this open here for now just due to the fact that it's available and functional, and will update when I have something tangible with the re-factor.

@apparentlymart
Copy link
Contributor

Thanks for the pointers to that discussion and change, @vancluever. It does seem like a pretty big shift in philosophy of the protocol, moving away from separate "authorize domain" and "get cert for domain" steps to a more general "submit an arbitrary CSR and get back list of authorization steps".

One thing that isn't clear to me yet is whether it's expected that doing the authorization steps is a one-time thing or something that needs to be repeated for each renewal. If the latter (and based on discussion in that thread, it seems like that some providers are expected to behave that way) then a Terraform resource for cert issuance alone becomes significantly less useful, since it would require manual intervention each time the cert expires.

My instinct from reading that thread is that we should probably pause for a little while and let the flow stabilize a bit before we try to map it onto Terraform resources. The concepts still seem pretty fluid.

I really appreciate your time and energy on this so far, @vancluever, and I also appreciate your patience in working through this with me.

@roboll
Copy link
Contributor

roboll commented Aug 13, 2016

@vancluever have you considered making your current work available as a standalone terraform plugin? i'm using it currently and i'm sure others would find it useful as well.

@vancluever
Copy link
Contributor Author

@roboll that's a great idea! I'll try and work on that this weekend, it would be a great way to have the best of both worlds on this one. I did consider it briefly but I've been distracted with other stuff, but it looks like there's at least a little bit of a demand for it, considering the +1's. I'll update this thread once it's up!

vancluever pushed a commit to vancluever/terraform-provider-acme-old that referenced this pull request Sep 5, 2016
This was originally hashicorp/terraform#7058, however due to issues
(documented in the README) it can't be included in Terraform mainline.
More than a few people have still found it useful though, and an
external plugin has been requested.

This has been tested as of today and is still functional (the HTTP test
was skipped but the Route53 tests are all functional, and the plugin
works outside of the tests).

Also added code to use the dnssimple and powerdns lego modules.
@vancluever
Copy link
Contributor Author

Everyone,

Sorry for the delay! I have now completed the standalone plugin and it has been tested both via the integration tests that were included and also via a manual test flight of the plugin.

You can refer to the new project page to get your hands on the plugin (I've even published a signed release).

Seeing as this is now complete, I'm going to close this one now. If you have any issues with this plugin please report an issue on the new project page versus putting in an issue here, of course.

Thanks everyone for their interest in this - it makes me happy to hear that it's useful for you all!

@ghost
Copy link

ghost commented Apr 22, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 22, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants