AWS IAM certificates get deleted on rerun even though tf files were not changed #3837

Closed
avinci opened this issue Nov 10, 2015 · 19 comments

@avinci

avinci commented Nov 10, 2015

I have 2 SSL certs used by 2 load balancers on AWS. The first run of terraform apply works fine. If I rerun apply without changing any part of the tf files, it tries to delete the IAM certs, and this fails because the LBs are using them. Here is the log with TF_LOG enabled.

Cert is present in the state file.

2015/11/09 18:48:34 [DEBUG] vertex aws_iam_server_certificate.apiBetaCert, got dep: aws_iam_server_certificate.apiBetaCert (destroy)
2015/11/09 18:48:34 [DEBUG] vertex aws_elb.betaAPILb, got dep: aws_iam_server_certificate.apiBetaCert
2015/11/09 18:48:34 [DEBUG] vertex aws_elb.betaAPILb, got dep: var.amis
2015/11/09 18:48:34 [DEBUG] vertex provider.aws (close), got dep: aws_elb.betaAPILb
2015/11/09 18:48:34 [DEBUG] root: eval: *terraform.EvalWriteState
2015/11/09 18:48:34 [DEBUG] root: eval: *terraform.EvalApplyPost
2015/11/09 18:48:34 [ERROR] root: eval: *terraform.EvalApplyPost, err: 1 error(s) occurred:

* aws_iam_server_certificate.wwwBetaCert: [WARN] Error deleting server certificate: DeleteConflict: Certificate: ASCAIS4G4XMALDFK6DDAY is currently in use by arn:aws:elasticloadbalancing:us-east-1:412520076220:loadbalancer/betaWWWLb. Please remove it first before deleting it from IAM.
2015/11/09 18:48:34 [ERROR] root: eval: *terraform.EvalSequence, err: 1 error(s) occurred:
@catsby
Contributor

catsby commented Nov 12, 2015

Hey @avinci – do you have a configuration that demonstrates this (minus any secrets)?
I'd like to see if this is impacted by #3898, and otherwise reproduce it

@catsby catsby added the waiting-response label Nov 12, 2015
@avinci
Author

avinci commented Nov 14, 2015

Attached: https://github.com/avinci/terraform-repro

I realized this is happening even with simple security groups. I am not sure what I might be doing wrong for the NATSG; Terraform keeps re-applying the SG. It is also happening for ELBs with certs and, of course, for IAM certs, but I did not want to upload my certs to reproduce that.

@avinci
Author

avinci commented Nov 14, 2015

I think the main issue is that the diff between what's on AWS and what's in the tfstate is not working as expected.

@avinci
Author

avinci commented Nov 14, 2015

I did another quick test. I made a change directly to the SG on AWS and ran terraform refresh. I assumed that would update my tfstate with the AWS state. It did not.

Based on this, either I am doing something really wrong or the state comparison has some issues.

@avinci
Author

avinci commented Nov 15, 2015

I think I figured out why the SGs keep getting updated. Not sure if it's a bug or not, but it's certainly weird: plan does not throw any error, yet it keeps repeating the same operations.

Does not work:

  ingress {
    from_port = 443
    to_port = 443
    protocol = "tcp"
    cidr_blocks = [
      "${var.private0-1CIDR}"]
  }
  ingress {
    from_port = 443
    to_port = 443
    protocol = "tcp"
    cidr_blocks = [
      "${var.private0-2CIDR}"]
  }

Works:

  ingress {
    from_port = 443
    to_port = 443
    protocol = "tcp"
    cidr_blocks = [
      "${var.private0-1CIDR}",
      "${var.private0-2CIDR}"]
  }

I will experiment with the certs now to see what the issue is, and update the sample.

@avinci
Author

avinci commented Nov 15, 2015

I have now added a repro for the certificate issue; the repo has been updated, minus the keys.

I ran terraform apply 3 times:

  • On the first run, I guess the ordering or the sleep is not enough for the LB to reference the cert.
  • On the second run, I get "diffs didn't match" and it errors.
  • On the third run, it deleted the cert and recreated it, and I got the resource not found error.
aws_security_group.natSg: Modifications complete
Error applying plan:

1 error(s) occurred:

* aws_elb.betaAPILb: Error creating ELB: CertificateNotFound: Server Certificate not found for the key: arn:aws:iam::291318889788:server-certificate/apiBetaCert
    status code: 400, request id: 3876c7e2-8b49-11e5-ba93-d906d4705cc4

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.
avicMP:terraform-repro avic$ terraform apply
aws_iam_server_certificate.apiBetaCert: Refreshing state... (ID: ASCAJMDKBD7PFCBBHQJZO)
aws_vpc.betaVPC: Refreshing state... (ID: vpc-e16d4c84)
aws_security_group.betaWebSG: Refreshing state... (ID: sg-00cd8764)
aws_internet_gateway.betaIG: Refreshing state... (ID: igw-eef99c8b)
aws_security_group.natSg: Refreshing state... (ID: sg-07cd8763)
aws_subnet.betaPubSN0-0: Refreshing state... (ID: subnet-c04903b7)
aws_route_table.betaPubSN0-0RT: Refreshing state... (ID: rtb-a56257c0)
aws_route_table_association.betaPubSN0-0RTAssn: Refreshing state... (ID: rtbassoc-d0d8d0b5)
aws_iam_server_certificate.apiBetaCert: Destroying...
aws_iam_server_certificate.apiBetaCert: Destruction complete
Error applying plan:

1 error(s) occurred:

* aws_iam_server_certificate.apiBetaCert: diffs didn't match during apply. This is a bug with Terraform and should be reported.

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

avicMP:terraform-repro avic$ terraform apply
aws_vpc.betaVPC: Refreshing state... (ID: vpc-e16d4c84)
aws_internet_gateway.betaIG: Refreshing state... (ID: igw-eef99c8b)
aws_subnet.betaPubSN0-0: Refreshing state... (ID: subnet-c04903b7)
aws_security_group.betaWebSG: Refreshing state... (ID: sg-00cd8764)
aws_security_group.natSg: Refreshing state... (ID: sg-07cd8763)
aws_route_table.betaPubSN0-0RT: Refreshing state... (ID: rtb-a56257c0)
aws_route_table_association.betaPubSN0-0RTAssn: Refreshing state... (ID: rtbassoc-d0d8d0b5)
aws_iam_server_certificate.apiBetaCert: Creating...
  arn:              "" => "<computed>"
  certificate_body: "" => "a30fbbe0b158c8a21ff05e80a1c9bb95abdfd76e"
  name:             "" => "apiBetaCert"
  path:             "" => "/"
  private_key:      "" => "159c57e7d8e703d2d3939ff868603ee2297c6595"
aws_iam_server_certificate.apiBetaCert: Creation complete
aws_elb.betaAPILb: Creating...
  availability_zones.#:                        "" => "<computed>"
  connection_draining:                         "" => "0"
  connection_draining_timeout:                 "" => "300"
  dns_name:                                    "" => "<computed>"
  health_check.#:                              "" => "1"
  health_check.1152471759.healthy_threshold:   "" => "2"
  health_check.1152471759.interval:            "" => "5"
  health_check.1152471759.target:              "" => "HTTP:61111/"
  health_check.1152471759.timeout:             "" => "3"
  health_check.1152471759.unhealthy_threshold: "" => "2"
  idle_timeout:                                "" => "60"
  instances.#:                                 "" => "<computed>"
  internal:                                    "" => "<computed>"
  listener.#:                                  "" => "1"
  listener.3661128776.instance_port:           "" => "61111"
  listener.3661128776.instance_protocol:       "" => "http"
  listener.3661128776.lb_port:                 "" => "443"
  listener.3661128776.lb_protocol:             "" => "https"
  listener.3661128776.ssl_certificate_id:      "" => "arn:aws:iam::291318889788:server-certificate/apiBetaCert"
  name:                                        "" => "betaAPILb"
  security_groups.#:                           "" => "1"
  security_groups.1068360494:                  "" => "sg-00cd8764"
  source_security_group:                       "" => "<computed>"
  subnets.#:                                   "" => "1"
  subnets.1354782321:                          "" => "subnet-c04903b7"
  zone_id:                                     "" => "<computed>"
Error applying plan:

1 error(s) occurred:

* aws_elb.betaAPILb: Error creating ELB: CertificateNotFound: Server Certificate not found for the key: arn:aws:iam::291318889788:server-certificate/apiBetaCert
    status code: 400, request id: d1d26ce8-8b49-11e5-ba93-d906d4705cc4

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

@avinci
Author

avinci commented Nov 15, 2015

I think I figured out the bug. Our certs are chained and therefore include the certificate authority information too. I think the diff fails because of the length of the string.

Our cert structure is:

  • our certificate
  • class1 provider cert
  • certificate provider cert

They are chained together, and the length of the file must be causing issues.

@jandre

jandre commented Nov 18, 2015

+1 I'm having the same issue with certs being deleted/re-created on re-run despite no change. I'm loading the certs from a file.

@catsby catsby removed the waiting-response label Nov 20, 2015
@gbarboza
Contributor

gbarboza commented Dec 8, 2015

It appears that AWS does some pre-processing on certs before installing them:

  1. Performing dos2unix line-break translations.
  2. Pulling out the chain part of the cert automatically.

Pulling the chain out manually and specifying certificate_chain stopped the re-creation issues for me. So, to clarify, certificate_body should consist of only one certificate and the certificate_chain should contain the rest.
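
The split described above (one certificate in certificate_body, the rest in certificate_chain) can be prepared with a small shell helper. This is only a minimal sketch, not something taken from the Terraform docs: the function name split_pem_bundle and the file names are hypothetical, and it assumes the leaf certificate is the first PEM block in the bundle.

```shell
# Split a PEM bundle into the first certificate and the remaining chain.
# Assumption: the leaf certificate is the first block in the bundle.
# Usage: split_pem_bundle BUNDLE BODY_OUT CHAIN_OUT
split_pem_bundle() {
  bundle=$1   # input: concatenated PEM certificates
  body=$2     # output: first certificate only
  chain=$3    # output: every certificate after the first

  # Create/empty both outputs so CHAIN_OUT exists even for a single cert.
  : > "$body"
  : > "$chain"

  # Count BEGIN markers; block 1 goes to the body file, later blocks to the chain.
  awk -v body="$body" -v chain="$chain" '
    /-----BEGIN CERTIFICATE-----/ { n++ }
    n == 1 { print > body }
    n >= 2 { print > chain }
  ' "$bundle"
}
```

For example, split_pem_bundle bundle.pem body.pem chain.pem writes the two files that certificate_body and certificate_chain can then load. Whether the first block really is the leaf depends on how the CA ordered the bundle, so it is worth checking the result by hand.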

@nTraum
Contributor

nTraum commented Dec 10, 2015

Can confirm this still happens on 0.6.8. The workaround mentioned by @gbarboza did fix it for us too.

@antonbabenko
Contributor

We had the same issue with DigiCert certificates, so we had to fix them ourselves before pushing to Terraform & AWS:

cat digi_cert.crt.orig | tr -d '\r' > digi_cert_fixed.crt

@jmstone617
Contributor

I am specifying certificate_body and certificate_chain separately and am still seeing this issue. I am using TF 0.6.8.

I am loading both from a file. The certificate_body is a .pem file, and the certificate_chain is a .crt file (provided by the CA)

@gbarboza
Contributor

gbarboza commented Jan 5, 2016

@jmstone617 Did you make sure that the files are using UNIX line breaks? Most CAs I've encountered don't issue files in that format. Use the command antonbabenko provided above or the dos2unix command to fix the chain and cert files before uploading them.
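
One way to check a file for DOS line breaks before uploading it is sketched below. The helper names has_cr and strip_cr are hypothetical; strip_cr does the same thing as the tr -d '\r' command quoted earlier in the thread, and has_cr only reports whether any carriage return is present.

```shell
# Succeed (exit 0) if the file contains at least one carriage return.
# Usage: has_cr FILE
has_cr() {
  cr=$(printf '\r')      # a literal CR; $(...) only strips trailing newlines
  grep -q "$cr" "$1"
}

# Write a copy of the input with every carriage return removed.
# Usage: strip_cr INPUT OUTPUT
strip_cr() {
  tr -d '\r' < "$1" > "$2"
}
```

A typical call would be has_cr chain.crt && strip_cr chain.crt chain_fixed.crt, where the file names are placeholders for the cert and chain files handed out by the CA.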

@jmstone617
Contributor

I didn't. @antonbabenko's fix seems to work -- would be good to include this somewhere in the docs.

gbarboza added a commit to gbarboza/terraform that referenced this issue Jan 5, 2016
AWS does some funky stuff to handle all the variations in certificates that CA's like to hand out to users. This commit adds a note about this and details how to avoid issues. See hashicorp#3837 for more information.
@jmstone617
Contributor

FWIW, antonbabenko's fix does indeed remove the error, but I get an SSL validation error when trying to hit my ELB. Using the original certificate from the CA resolves properly, but presents the TF error. So, I would guess that the .crt file itself is formatted properly, since AWS is accepting it.

@nathanielks
Contributor

certificate_chain is causing the issue for me. It's most confusing because:

  1. I've separated the certificate chain and body into separate files
  2. The chain file is valid (no ^M line breaks)

I'm not sure what else could be wrong. Here's a gist of the plan. I'm targeting a route53 resource for reference.

Edit:
Terraform 0.6.12
OS X 10.10.5

@jedi4ever

Got mine working. The problem was not the line endings but the line length in one of the intermediate certs.
It contained lines longer than the standard 64 characters. Re-formatting the cert made it work.

Just do openssl x509 -in bad.pem > good.pem

Thanks to @RykHawthorn for the tip:
#2625
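
To see whether a file has the over-long lines described above before re-encoding it, something like the following may help. It is a rough sketch: long_pem_lines is a hypothetical helper name, and the limit of 64 is the usual PEM wrap width (the width openssl x509 writes), not a value taken from Terraform or AWS documentation.

```shell
# Print "LINE_NUMBER: LENGTH" for every line longer than 64 characters.
# Prints nothing when every line fits within the usual PEM wrap width.
# Usage: long_pem_lines FILE
long_pem_lines() {
  awk 'length($0) > 64 { print NR ": " length($0) }' "$1"
}
```

If long_pem_lines intermediate.pem prints anything, re-encoding that certificate with the openssl x509 command above should rewrap it. A trailing carriage return counts as a character here, so files with DOS line breaks are best converted first.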

@stack72
Contributor

stack72 commented Sep 3, 2016

Closed via #8074

@ghost

ghost commented Apr 22, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 22, 2020