aws_acm_certificate appears to be waiting for DNS validation to complete when SANs are present #8530

Closed
tdmalone opened this issue May 6, 2019 · 6 comments · Fixed by #12371
Labels
bug Addresses a defect in current functionality. service/acm Issues and PRs that pertain to the acm service.

@tdmalone
Contributor

tdmalone commented May 6, 2019

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

Terraform v0.11.13
+ provider.aws v2.8.0

Affected Resource(s)

  • aws_acm_certificate

Terraform Configuration Files

resource "aws_acm_certificate" "main" {
  domain_name = "example.com"
  validation_method = "DNS"

  subject_alternative_names = [
    "one.example.com",
  ]
}

Expected Behavior

The certificate should be created in a state of pending validation, and Terraform should return success.

Actual Behavior

The certificate is created in a state of pending validation, but Terraform appears to wait for validation of the SAN to succeed (which it won't, since the DNS records haven't been retrieved and added yet), and then errors as follows:

aws_acm_certificate.main: Creating...
  arn:                         "" => "<computed>"
  domain_name:                 "" => "example.com"
  domain_validation_options.#: "" => "<computed>"
  subject_alternative_names.#: "" => "1"
  subject_alternative_names.0: "" => "one.example.com"
  validation_emails.#:         "" => "<computed>"
  validation_method:           "" => "DNS"
aws_acm_certificate.main: Still creating... (10s elapsed)
aws_acm_certificate.main: Still creating... (20s elapsed)
aws_acm_certificate.main: Still creating... (30s elapsed)
aws_acm_certificate.main: Still creating... (40s elapsed)
aws_acm_certificate.main: Still creating... (50s elapsed)
aws_acm_certificate.main: Still creating... (1m0s elapsed)

Error: Error applying plan:

1 error(s) occurred:

* aws_acm_certificate.main: 1 error(s) occurred:

* aws_acm_certificate.main: No validation options need to retry: {
  DomainName: "one.example.com",
  ValidationMethod: "DNS",
  ValidationStatus: "PENDING_VALIDATION"
}

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

Note: when there are no SANs included, the certificate is created without errors, as expected, so this issue appears to be unique to SANs.
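
For context on why the expected behavior matters, the following is a minimal sketch of the usual DNS validation flow that depends on aws_acm_certificate returning while the certificate is still pending validation. It assumes Terraform 0.12 syntax and a 2.x AWS provider in which domain_validation_options is an indexable list; var.zone_id and local.san_names are hypothetical names introduced only for this illustration.

```hcl
# Hypothetical input: ID of the Route 53 hosted zone that serves example.com.
variable "zone_id" {
  type = string
}

locals {
  # Hypothetical helper: the SANs, kept in one place so the record count
  # below is known at plan time.
  san_names = ["one.example.com"]
}

resource "aws_acm_certificate" "main" {
  domain_name               = "example.com"
  validation_method         = "DNS"
  subject_alternative_names = local.san_names
}

# One validation record per name on the certificate (primary domain + SANs).
# These can only be created after aws_acm_certificate has returned with its
# domain_validation_options populated.
resource "aws_route53_record" "validation" {
  count = length(local.san_names) + 1

  zone_id = var.zone_id
  name    = aws_acm_certificate.main.domain_validation_options[count.index].resource_record_name
  type    = aws_acm_certificate.main.domain_validation_options[count.index].resource_record_type
  records = [aws_acm_certificate.main.domain_validation_options[count.index].resource_record_value]
  ttl     = 60
}

# This resource, not aws_acm_certificate, is the one intended to block
# until ACM reports the certificate as validated.
resource "aws_acm_certificate_validation" "main" {
  certificate_arn         = aws_acm_certificate.main.arn
  validation_record_fqdns = aws_route53_record.validation[*].fqdn
}
```

In this arrangement the certificate resource has to finish first so that the record resources can read the validation options, which is why a certificate resource that waits on validation itself cannot succeed.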

Steps to Reproduce

  1. Use the code supplied above
  2. Run terraform apply and enter yes when prompted

Important Factoids

I can't reproduce this issue every time. I originally discovered it on my own domain, which is why I tried to reproduce it with the sample code above using example.com. However, I am now unable to reproduce it if I replace example.com with one of my own domains. This appears to point to example.com itself being a problem for ACM (makes sense)... except that the domain I originally experienced it with should have been just fine!

Out of interest, the domain that I did experience this on before went on to take some time to validate - over an hour, which is much more than usual - so perhaps something had triggered inside AWS and validation was just extended for that domain. Perhaps that resulted in a different API response which is similar to what happens with example.com here??

Either way, I understand that if this is difficult/flimsy to replicate, it's difficult to fix.

References

Might be related to another SAN issue I just lodged: #8531

@aeschright aeschright added needs-triage Waiting for first response or review from a maintainer. service/acm Issues and PRs that pertain to the acm service. labels Jun 19, 2019
@alkis-hexa

I am experiencing the same issue. Terraform is still on module.wildcard_certificate.aws_acm_certificate_validation.cert: Still creating... [27m1s elapsed]

Terraform v0.12.1

  • provider.aws: version = "~> 2.13"

Zone example.com. has been created and certificate *.example.com has also been created, but it does not seem to pick up the DNS record for validation...

@petewilcock

Just to add to this for those in pain - I was also experiencing this error with Terraform trying to create forever, and then tried to create my certificate manually only to get the bemusing "com.amazon.coral.service.InternalFailure" error.

If you're like me and working in an Organization with a service control policy or a permissions-restricted IAM user, the permissions of acm:* aren't enough, as it turns out you also need kms:CreateGrant.

Since the AWS console doesn't even surface this error, Terraform has no chance and just continuously retries.
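
As an illustration of the permissions described in this comment, here is a minimal sketch of an IAM policy that grants kms:CreateGrant alongside acm:*. It assumes Terraform 0.12 syntax; the resource labels, the policy name, and the wildcard resource scope are hypothetical choices for the example, and the need for kms:CreateGrant rests on the commenter's report rather than anything else in this thread. A service control policy, if one applies, would also have to allow the same actions.

```hcl
data "aws_iam_policy_document" "acm_request" {
  statement {
    effect = "Allow"

    # acm:* alone was reported as insufficient; kms:CreateGrant is the
    # additional action the commenter found to be required.
    actions = [
      "acm:*",
      "kms:CreateGrant",
    ]

    # Assumption: scoped to all resources for brevity. Narrow as appropriate.
    resources = ["*"]
  }
}

resource "aws_iam_policy" "acm_request" {
  name   = "acm-request-certificates" # hypothetical name
  policy = data.aws_iam_policy_document.acm_request.json
}
```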

kcburge pushed a commit to kcburge/terraform-provider-aws that referenced this issue Nov 8, 2019
This pull request is similar to, and was based on, hashicorp#8708. However, it resolves a few issues I discovered with that patch.

The certificate creation process is clearly asynchronous. Because the
provider reads properties of an asynchronously created object, it must
poll, retrying, until all critical information is available. hashicorp#8530,
however, expects object creation to succeed BEFORE validation is
complete, so we cannot wait until the certificate status is succeeded
or until domain validation is complete. Terraform nevertheless requires
the state to be intact before returning successfully from creation (as I
understand it), and about the only way to ensure the object was created
successfully is to retry, which is what this resource does.

My updates:

- I added a retry in case the subject alternative names are empty.

- I wait to Set the subject alternative names until after we've received
all of the domain validation options (if any), so as to prevent
side-effects from retrying.

- Like hashicorp#8708, this patch sorts the SANs and DVOs according to the
order in the original request / terraform state file, so that the
order is predictable.

This should address issue: hashicorp#8531.

If this patch is applied, users will be required to either recreate
their certificates, OR, manually edit the terraform state files to
ensure that the order in the state file reflects the order in their
terraform code.

I found three places that must be edited:

- Reorder domain_validation_options

'''
"domain_validation_options.0.resource_record_name": "domain.com",
"domain_validation_options.0.resource_record_type": "CNAME",
"domain_validation_options.0.resource_record_value": "...",
'''

Replace ".N." in the name with the zero-based index of each domain_validation_options.

- Reorder subject_alternative_names

'''
"subject_alternative_names.0": "*.domain.com"
'''

Replace ".N" in the name with the zero-based index of each subject_alternative_name.

- Reorder aws_route53_record validation resources:

'''
"aws_route53_record.validation.1": {
'''

Replace ".N" with the zero-based index of each route 53 record's domain.

Kevin Burge
Nice, Inc. (https://nice.com)
@aeschright aeschright added bug Addresses a defect in current functionality. and removed needs-triage Waiting for first response or review from a maintainer. labels Dec 17, 2019
@thepont

thepont commented Jan 16, 2020

Removing the cert from the console, then running apply fixed this for me.

kcburge pushed a commit to kcburge/terraform-provider-aws that referenced this issue Mar 20, 2020
@bflad bflad linked a pull request May 27, 2020 that will close this issue
@bflad bflad added this to the v2.64.0 milestone May 27, 2020
@bflad
Contributor

bflad commented May 27, 2020

Hi folks 👋 Apologies if this error is confusing in any way. It occurs during the asynchronous process in which a certificate with DNS validation has been requested and the ACM service has yet to return the DNS validation records for the domains. We have just merged an update to the aws_acm_certificate resource that allows it to wait up to 5 minutes (instead of just 1 minute) for the ACM service to generate the DNS validation records. This helps for certificates with larger numbers of Subject Alternative Names, or when this asynchronous ACM DNS validation value creation is otherwise slow. This will release in version 2.64.0 of the Terraform AWS Provider, later this week. Thanks to @gilbsgilbs for the implementation. 👍

@ghost

ghost commented May 29, 2020

This has been released in version 2.64.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!
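
To pick up the longer wait described in this thread, a configuration needs a provider release of at least 2.64.0. One way to express that is a version constraint on the provider block, sketched below. The "~> 2.64" constraint is an illustrative choice that stays within the 2.x series; the region is a hypothetical placeholder.

```hcl
provider "aws" {
  # Allows 2.64 and later 2.x releases, but not 3.0 or above.
  version = "~> 2.64"

  # Hypothetical placeholder; use the region where the certificate is needed.
  region = "us-east-1"
}
```

After changing the constraint, running terraform init -upgrade fetches a provider build that satisfies it.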

@ghost

ghost commented Jun 26, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Jun 26, 2020