Hang during terraform plan in 0.6.7 #4043

Closed
landro opened this issue Nov 24, 2015 · 33 comments

Comments


landro commented Nov 24, 2015

Running terraform plan against existing infrastructure (based on the cloudstack and aws providers) that was created with 0.6.6 crashes:

crash.txt

@phinze phinze added the bug label Nov 24, 2015

phinze commented Nov 24, 2015

Hi @landro - this is the same crash reported in #4036 and elsewhere that we believe is fixed in #4024. We're drawing a circle around a few important bugfixes and we'll be looking at a bugfix release soon.

In the meantime, if you're able to check latest master to see if it solves your issue, that'd be helpful!
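
(For reference, a rough sketch of what building from master looked like at the time, assuming the repo's standard Make targets; make updatedeps is mentioned later in this thread:)

  # Sketch only - exact targets and paths may have differed in late 2015.
  git clone https://github.com/hashicorp/terraform.git
  cd terraform
  make updatedeps   # fetch/refresh Go dependencies
  make dev          # build a development binary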


landro commented Nov 24, 2015

I built from master and gave it a go. Now it hangs (instead of crashing). I've attached the debug logs.
debug.txt


phinze commented Nov 24, 2015

Okay, so the key portion of the hanging debug log is here:

2015/11/24 16:42:56 [DEBUG] vertex aws_route53_record.api, waiting for: provider.aws
2015/11/24 16:42:56 [DEBUG] vertex provider.aws (close), waiting for: aws_route53_record.api
2015/11/24 16:42:56 [DEBUG] vertex root, waiting for: provider.aws (close)
2015/11/24 16:43:01 [DEBUG] vertex aws_route53_record.api, waiting for: provider.aws
2015/11/24 16:43:01 [DEBUG] vertex provider.aws (close), waiting for: aws_route53_record.api
2015/11/24 16:43:01 [DEBUG] vertex root, waiting for: provider.aws (close)
2015/11/24 16:43:06 [DEBUG] vertex aws_route53_record.api, waiting for: provider.aws
2015/11/24 16:43:06 [DEBUG] vertex provider.aws (close), waiting for: aws_route53_record.api
2015/11/24 16:43:06 [DEBUG] vertex root, waiting for: provider.aws (close)
2015/11/24 16:43:11 [DEBUG] vertex aws_route53_record.api, waiting for: provider.aws
2015/11/24 16:43:11 [DEBUG] vertex provider.aws (close), waiting for: aws_route53_record.api
2015/11/24 16:43:11 [DEBUG] vertex root, waiting for: provider.aws (close)

Those three lines will likely continue to repeat every 5s while it hangs.

Will dig in some more and follow up.


phinze commented Nov 24, 2015

Can you share the config for your aws_route53_record.api and describe how you are authenticating the AWS provider in your environment? That will help us narrow this down.


landro commented Nov 24, 2015

resource "aws_route53_record" "api" {
   zone_id = "${var.route53_zone_id}"
   name = "my.domain.com"
   type = "A"
   ttl = "300"
   records = [
    "${cloudstack_ipaddress.lbext_tcp.0.ipaddress}",
    "${cloudstack_ipaddress.lbext_tcp.1.ipaddress}"   
    ]
}

I'm setting TF_VAR_access_key and TF_VAR_secret_key, reading them into variables,

  variable "access_key" {}
  variable "secret_key" {}

and passing them to the aws provider like so

  provider "aws" {
    access_key = "${var.access_key}"
    secret_key = "${var.secret_key}"
    region     = "eu-central-1"
  }


phinze commented Nov 24, 2015

Great, thank you. Going to see if I can cook up a reproduction with this.

@phinze phinze changed the title Crash during terraform plan in 0.6.7 Hang during terraform plan in 0.6.7 Nov 24, 2015

phinze commented Nov 24, 2015

Our initial attempts to reproduce have not been successful. Does Terraform v0.6.6 still complete a plan successfully in your environment?


phinze commented Nov 24, 2015

Another shot in the dark to try out would be to switch from using variable and provider blocks to using environment variables to configure your providers (so AWS_* and CLOUDSTACK_*).
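
(A minimal sketch of that switch, assuming the environment variable names from the AWS and CloudStack provider documentation of the era; the endpoint URL is a placeholder:)

  # Illustration only - configure both providers purely via environment variables.
  export AWS_ACCESS_KEY_ID="..."
  export AWS_SECRET_ACCESS_KEY="..."
  export AWS_DEFAULT_REGION="eu-central-1"
  export CLOUDSTACK_API_URL="https://cloud.example.com/client/api"
  export CLOUDSTACK_API_KEY="..."
  export CLOUDSTACK_SECRET_KEY="..."
  terraform plan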

Sorry for the trouble! We'll get to the bottom of this.


landro commented Nov 25, 2015

So I tried running a plan using 0.6.6 (and the original provider config), and that also hangs for a long time, repeatedly showing

2015/11/25 08:27:32 [DEBUG] vertex aws_route53_record.api, waiting for: provider.aws
2015/11/25 08:27:37 [DEBUG] vertex provider.aws (close), waiting for: aws_route53_record.api

for about 5 minutes before it moves on, and then it repeatedly shows the same thing for another 5 minutes a few more times. In total, 20 minutes pass before the plan completes. Debug log attached.
0.6.6_terraform_plan_hang.txt

Grepping for "waiting for: aws_route53_record.api" in the log shows it appearing every 5 seconds during the entire run.

I executed terraform plan one more time - again it took 20 minutes.


landro commented Nov 25, 2015

Is there anywhere I could add debugging code to shed light on what's going on?
What does
2015/11/25 08:27:32 [DEBUG] vertex aws_route53_record.api, waiting for: provider.aws
2015/11/25 08:27:37 [DEBUG] vertex provider.aws (close), waiting for: aws_route53_record.api
actually mean? What kind of thing are we waiting for - a process, a response from AWS?


landro commented Nov 25, 2015

I tried configuring the aws provider like so:

  provider "aws" {
    access_key = "whatever"
    secret_key = "something"
    region     = "eu-central-1"
  }

That also triggers waiting for: provider.aws

However:

  provider "aws" {
    access_key = "whatever"
    secret_key = "something"
    region     = "eu-central-9"
  }

immediately fails with "Not a valid region: eu-central-9", as expected.

@dverbeek84

I have the same problem, only I am not using route53.
It just hangs forever.

2015/11/25 21:19:25 [INFO] Terraform version: 0.6.7  b7c41aed92c2d6ea841f618b54e400fcbf824845+CHANGES
2015/11/25 21:19:25 [DEBUG] Detected home directory from env var: /Users/Danny
2015/11/25 21:19:25 [DEBUG] Discovered plugin: atlas = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-atlas
2015/11/25 21:19:25 [DEBUG] Discovered plugin: aws = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-aws
2015/11/25 21:19:25 [DEBUG] Discovered plugin: azure = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-azure
2015/11/25 21:19:25 [DEBUG] Discovered plugin: cloudflare = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-cloudflare
2015/11/25 21:19:25 [DEBUG] Discovered plugin: cloudstack = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-cloudstack
2015/11/25 21:19:25 [DEBUG] Discovered plugin: consul = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-consul
2015/11/25 21:19:25 [DEBUG] Discovered plugin: digitalocean = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-digitalocean
2015/11/25 21:19:25 [DEBUG] Discovered plugin: dme = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-dme
2015/11/25 21:19:25 [DEBUG] Discovered plugin: dnsimple = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-dnsimple
2015/11/25 21:19:25 [DEBUG] Discovered plugin: docker = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-docker
2015/11/25 21:19:25 [DEBUG] Discovered plugin: dyn = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-dyn
2015/11/25 21:19:25 [DEBUG] Discovered plugin: google = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-google
2015/11/25 21:19:25 [DEBUG] Discovered plugin: heroku = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-heroku
2015/11/25 21:19:25 [DEBUG] Discovered plugin: mailgun = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-mailgun
2015/11/25 21:19:25 [DEBUG] Discovered plugin: null = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-null
2015/11/25 21:19:25 [DEBUG] Discovered plugin: openstack = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-openstack
2015/11/25 21:19:25 [DEBUG] Discovered plugin: packet = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-packet
2015/11/25 21:19:25 [DEBUG] Discovered plugin: rundeck = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-rundeck
2015/11/25 21:19:25 [DEBUG] Discovered plugin: template = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-template
2015/11/25 21:19:25 [DEBUG] Discovered plugin: terraform = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-terraform
2015/11/25 21:19:25 [DEBUG] Discovered plugin: tls = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-tls
2015/11/25 21:19:25 [DEBUG] Discovered plugin: vsphere = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provider-vsphere
2015/11/25 21:19:25 [DEBUG] Discovered plugin: chef = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provisioner-chef
2015/11/25 21:19:25 [DEBUG] Discovered plugin: file = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provisioner-file
2015/11/25 21:19:25 [DEBUG] Discovered plugin: local-exec = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provisioner-local-exec
2015/11/25 21:19:25 [DEBUG] Discovered plugin: remote-exec = /opt/homebrew-cask/Caskroom/terraform/0.6.7/terraform-provisioner-remote-exec
2015/11/25 21:19:25 [DEBUG] Detected home directory from env var: /Users/Danny
2015/11/25 21:19:25 [DEBUG] Attempting to open CLI config file: /Users/Danny/.terraformrc
2015/11/25 21:19:25 [DEBUG] Detected home directory from env var: /Users/Danny


landro commented Nov 25, 2015

@dverbeek84 - have you tried building from master?

@dverbeek84

@landro yes, I just did. It was not working - it also hangs.
Switching back to 0.6.6 still works.


mlehner commented Dec 1, 2015

Switching to AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION resolved the issue for me. Originally I set my access credentials using variables and TF_VAR_ environment variables, similar to @landro.

I was experiencing this issue while running 'terraform plan' or 'terraform refresh'. All the aws resources would just repeatedly say they were waiting on another resource every 5 seconds. I tried three different versions, but none of them worked: v0.6.6 (hang), v0.6.7 (crash) and master (hang).


phinze commented Dec 1, 2015

Hey folks, some of the occurrences here might be explained by the HCL issue we found in #4082, which caused config parsing to hang on any file that ended in a comment line. That should be fixed on master with a make updatedeps to fetch the latest hashicorp/hcl dependency (and will be included in the forthcoming v0.6.8 release).
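
(For illustration - an assumed minimal example of the kind of file #4082 describes, i.e. a .tf file whose very last line is a comment:)

  provider "aws" {
    region = "eu-central-1"
  }
  # managed by terraform  (a trailing comment line like this tripped the HCL parser)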

@antonbabenko

I think such a bug is rather critical, and it would be great if a new version of Terraform were released (at least it should contain just this fix, if possible) - kind of a version 0.6.7.1 :)


landro commented Dec 1, 2015

So I removed the trailing # and set credentials using AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION - and everything started working, both on master and 0.6.6.
Setting AWS credentials using variables still hangs.


phinze commented Dec 1, 2015

@antonbabenko agreed! We're cutting a release today. :)


phinze commented Dec 1, 2015

@landro and @mlehner - can either of you share with us some repro config / steps on the "AWS credentials with variables" hang? I've so far been unable to reproduce based on your description of your setup.


landro commented Dec 1, 2015

I'll try to reproduce in a simple repo later today, @phinze


phinze commented Dec 1, 2015

Thanks @landro!


landro commented Dec 1, 2015

Managed to reproduce the issue with a dead simple tf file:
https://github.com/landro/terraform_aws_bug

HTH in tracking down the cause of this issue, @phinze


phinze commented Dec 1, 2015

This is super helpful, @landro. Will dig into this tomorrow.


phinze commented Dec 2, 2015

Hello again - unfortunately, I'm not having any luck reproducing this using your example quite yet.

I can plan, apply, refresh, and destroy using v0.6.8. I also tried v0.6.6 and v0.6.7 and was unable to reproduce any hang.

The r53 APIs are quite slow - often taking minutes to complete creation - but I've gotten no hangs.

My prereqs config, executed separately to get a valid zone ID:

variable "my_access_key" {}
variable "my_secret_key" {}

provider "aws" {
  access_key = "${var.my_access_key}"
  secret_key = "${var.my_secret_key}"
  region     = "eu-central-1"
}

resource "aws_vpc" "primary" {
  cidr_block = "10.1.2.0/24"
}

resource "aws_route53_zone" "primary" {
  vpc_id = "${aws_vpc.primary.id}"
  name = "example.com"
}

output "zone_id" {
  value = "${aws_route53_zone.primary.id}"
}

The repro config:

variable "my_access_key" {}
variable "my_secret_key" {}

provider "aws" {
  access_key = "${var.my_access_key}"
  secret_key = "${var.my_secret_key}"
  region     = "eu-central-1"
}

resource "aws_route53_record" "default" {
   zone_id = "Z3TP2LAVI4BAXL" # Output from prereqs config
   name = "example.com"
   type = "A"
   ttl = "300"
   records = [
    "10.0.0.1"
    ]
}

Other details:

  • No AWS_* env vars set.
  • TF_VAR_my_access_key and TF_VAR_my_secret_key set to valid values.
  • Using darwin_amd64 on a mac.

Questions for @landro and anybody else seeing hangs:

  • Are you still seeing hangs on v0.6.8?
  • Is there anything else in your environment that could be affecting this and would help me reproduce?

@jwadolowski

Hi guys,

I experienced the same for both v0.6.7 and v0.6.8. After a little bit of debugging and network tracing, it turned out that the reason it hangs and loops forever is that the AWS provider does some background checks and tries to connect to 169.254.169.254:80 (source code reference) before it moves on with further steps.

On AWS machines the response to a GET request to http://169.254.169.254/latest/meta-data/ is almost immediate. Surprisingly, on my machine it was sometimes possible to establish a TCP connection to this host (the varying factor was actually the network I was connected to).

As soon as I added a rule that drops all connections to 169.254.0.0/16, this issue was completely gone.
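
(The exact rule wasn't shared; on Linux, an equivalent would be something like the following iptables rule - illustration only:)

  # Block outbound traffic to the link-local metadata range, as described above.
  # A REJECT target would make the connection fail immediately instead of timing out.
  iptables -A OUTPUT -d 169.254.0.0/16 -j DROP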

@wirescrossed

After upgrading to 0.6.7 it works fine on OS X for me.


catsby commented Dec 3, 2015

Hey @landro – are you running this on your local machine, or from an EC2 instance?

@landro
Copy link
Author

landro commented Dec 4, 2015

Sorry for the long wait. Just to clarify, @phinze - I managed to reproduce the issue using

  TF_VAR_my_access_key=acc TF_VAR_my_secret_key=sec TF_LOG=TRACE terraform plan

without any prereqs. I ran the above command as is - literally using acc as my access key and sec as my secret key - and without modifying the bug.tf file in any way. I have run this from my laptop, @catsby, on different networks: home, two different company networks, and a mobile 4G network. Same hang all over.

I tried looking into what's going on with the 169.254.169.254:80 port that @jwadolowski mentioned. It turns out it accepts connections even when I'm disconnected from the internet (when I take down my wifi connection).

nmap -sT -p80 169.254.169.254
Nmap scan report for 169.254.169.254
Host is up (0.000065s latency).
PORT STATE SERVICE
80/tcp open http

Nmap done: 1 IP address (1 host up) scanned in 0.03 seconds

I noticed, however, that curling this service often hangs.

I believe it might be a good idea to find a better way to work out whether we're running terraform on Amazon or not. BTW, in my current project we're using a private cloud running cloudstack, and only using aws for dns.
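
(One quick way to see the behaviour outside Terraform - probing the metadata endpoint with an explicit timeout so it fails fast off EC2; illustration only, not Terraform's actual check:)

  curl --max-time 2 -s http://169.254.169.254/latest/meta-data/ \
    || echo "metadata service unreachable - probably not on EC2"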


phinze commented Dec 4, 2015

@landro this is very good information. We had an internal chat about this issue and #3243 + #2693 and decided to revamp AWS creds handling in the provider. @catsby explains more in his comment over here: #2693 (comment)

Stay tuned for a new PR that hopefully addresses all these issues! 👍


catsby commented Dec 16, 2015

Hello friends!
I just merged #4254, which refactored our AWS authentication chain a bit. I'm not sure it will help here; however, some bits have moved and it's worth checking out.

If any of you could be so kind as to build from master and evaluate, please let us know your findings!


catsby commented Feb 17, 2016

Hey all – I'm going to go ahead and close this; please let us know if you're still hitting it!

@catsby catsby closed this as completed Feb 17, 2016

ghost commented Apr 28, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 28, 2020