
File provisioning hangs with Terraform 12.* terraforming a Windows server with Powershell 5 installed #22006

Closed
ericblackburn opened this issue Jul 8, 2019 · 56 comments
Labels
bug · provisioner/file · provisioner/winrm · v0.12 (issues, primarily bugs, reported against v0.12 releases) · waiting for reproduction (unable to reproduce the issue without further information)

Comments

@ericblackburn

ericblackburn commented Jul 8, 2019

Terraform Version

Terraform v0.12.3
+ provider.aws v2.17.0

Terraform Configuration Files

data "aws_ami" "ami" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["TestPowershell5Ami*"]
  }
}
resource "aws_instance" "TestTerraform12" {
  ami                    = data.aws_ami.ami.id
  count                  = 1
  instance_type          = "t2.medium"
  availability_zone      = "us-east-1a"
  key_name               = "****"
  subnet_id              = "subnet-*********"
  vpc_security_group_ids = ["sg-*******"]
  root_block_device {
    volume_size = 80
  }
  provisioner "file" {
    source      = "TestFolder1"
    destination = "C:/Terraform/TestFolder1"
    connection {
      host     = coalesce(self.public_ip, self.private_ip)
      type     = "winrm"
      user     = "Test"
      password = "Password"
      timeout  = "15m"
      https    = true
      port     = "5986"
      insecure = true
    }
  }
  provisioner "file" {
    source      = "TestFolder2"
    destination = "C:/Terraform/TestFolder2"
    connection {
      host     = coalesce(self.public_ip, self.private_ip)
      type     = "winrm"
      user     = "Test"
      password = "Password"
      timeout  = "15m"
      https    = true
      port     = "5986"
      insecure = true
    }
  }
}

Debug Output

terraform.exe apply -auto-approve -no-color -target aws_instance.TestTerraform12
data.aws_ami.ami: Refreshing state...
aws_instance.TestTerraform12[0]: Creating...
aws_instance.TestTerraform12[0]: Still creating... [10s elapsed]
aws_instance.TestTerraform12[0]: Still creating... [20s elapsed]
aws_instance.TestTerraform12[0]: Still creating... [30s elapsed]
aws_instance.TestTerraform12[0]: Provisioning with 'file'...
aws_instance.TestTerraform12[0]: Still creating... [40s elapsed]
aws_instance.TestTerraform12[0]: Still creating... [50s elapsed]
aws_instance.TestTerraform12[0]: Still creating... [1m0s elapsed]
aws_instance.TestTerraform12[0]: Still creating... [1m10s elapsed]
aws_instance.TestTerraform12[0]: Still creating... [1m20s elapsed]
aws_instance.TestTerraform12[0]: Still creating... [1m30s elapsed]
...

aws_instance.TestTerraform12[0]: Still creating... [14m40s elapsed]
aws_instance.TestTerraform12[0]: Still creating... [14m50s elapsed]
aws_instance.TestTerraform12[0]: Still creating... [15m0s elapsed]
aws_instance.TestTerraform12[0]: Still creating... [15m10s elapsed]
aws_instance.TestTerraform12[0]: Still creating... [15m20s elapsed]
aws_instance.TestTerraform12[0]: Still creating... [15m30s elapsed]
aws_instance.TestTerraform12[0]: Still creating... [15m40s elapsed]
Interrupt received.
Please wait for Terraform to exit or data loss may occur.
Gracefully shutting down...
Stopping operation...
Two interrupts received. Exiting immediately. Note that data
loss may have occurred.

The interrupt received was me cancelling the command-line process.

Crash Output

N/A

Expected Behavior

On a Windows Server with PowerShell 5 installed, Terraform 0.12.* should be able to complete the file provisioning of TestFolder1, copying all content and subfolder content, and then continue to the next file provisioning step.

This works in Terraform 0.11.*:
I am using Terraform v0.11.8 successfully with provider.aws v1.35.0.

Actual Behavior

Since upgrading to Terraform 0.12.*, the file provisioning of TestFolder1 copies a couple of the 1 KB files over (roughly what can be transferred in one minute) and then stops copying content. The Terraform console logging continues to report "Still creating..." without end and does not obey the timeout. The Terraform process never completes.

Steps to Reproduce

Set up Windows Server 2012 with PowerShell 5

  1. Create a Windows Server 2012 R2 64-bit instance, manually or via the CLI.
     For AWS, I am using this source AMI filter:

"source_ami_filter": {
  "filters": {
    "virtualization-type": "hvm",
    "name": "Windows_Server-2012-R2_RTM-English-64Bit-Base-*",
    "root-device-type": "ebs"
  },
  "owners": ["amazon"],
  "most_recent": true
},

  2. Install PowerShell 5 directly via the MSU or through Chocolatey; both have the same effect.
     2.1 Via Chocolatey: "choco install powershell"
     2.2 Manually, by downloading Win8.1AndW2K12R2-KB3191564-x64.msu (Windows Management Framework 5.1, KB3191564) from https://www.microsoft.com/en-us/download/details.aspx?id=54616.
     2.2.1 Install via the GUI, or run "Win8.1AndW2K12R2-KB3191564-x64.msu /quiet" in an Administrator-elevated command prompt.
  3. Restart the system to complete the install.
  4. Save the image as TestPowershell5Ami.

Another option is to leverage a Windows Server 2016 instance, which natively uses Powershell 5.1. For AWS, I used the "name": "Windows_Server-2016-English-Full-Containers-*" source_ami_filter.
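
For anyone taking the 2016 route, the corresponding AMI lookup would look roughly like this (a sketch; the filter name is the one quoted above, and the rest mirrors the data block at the top of this issue):

data "aws_ami" "ami" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["Windows_Server-2016-English-Full-Containers-*"]
  }
}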

Attempt to terraform

  1. Set up a TestFolder1 folder containing more than ten 1 KB files, enough that file provisioning takes over a minute on your network.
  2. terraform init
  3. terraform apply

Additional Context

Terraform 0.11.* has never been an issue. The Terraform file isn't new, apart from being upgraded to Terraform 0.12's syntax using the automated upgrade command. Without PowerShell 5 installed, I can terraform the Windows Server system (which ships with PowerShell 4.0) just fine. After installing PowerShell 5, or against a Windows Server 2016 instance with PowerShell 5.1 natively installed, Terraform hangs and never completes or errors out.

The Windows system being terraformed has its PowerShell execution policy set to LocalMachine: Bypass.
"set-executionpolicy bypass -force"

The server I run Terraform from has Powershell 5.1 installed.

References

N/A

@ericblackburn ericblackburn changed the title File provisioning hangs in Terraform 12.* after installing Powershell 5 on Windows Server 2012 R2 File provisioning hangs in Terraform 12.* on a Windows server with Powershell 5 installed Jul 8, 2019
@ericblackburn ericblackburn changed the title File provisioning hangs in Terraform 12.* on a Windows server with Powershell 5 installed File provisioning hangs with Terraform 12.* terraforming a Windows server with Powershell 5 installed Jul 8, 2019
@acbreeze

acbreeze commented Jul 9, 2019

I have the exact same issue. I switched to 0.11.14 and now the file provisioner step completes with no problem.

@anmiles

anmiles commented Jul 17, 2019

Have the same issue on 0.12.3.
It reproduces with file provisioners when either an instance or a null_resource is created.
For example, here is what I see with a file provisioner when an aws_instance is created. It waits 5 minutes until the instance is created and then starts uploading the files. Eventually all the files are uploaded (I've double-checked this) but the provisioner never stops, although the timeout is set to 15m.

aws_instance.default: Still creating... [5m0s elapsed]
2019-07-17T15:51:35.754+0300 [DEBUG] plugin.terraform.exe: file-provisioner (internal) 2019/07/17 15:51:35 [DEBUG] connecting to remote shell using WinRM
2019-07-17T15:51:37.325+0300 [DEBUG] plugin.terraform.exe: file-provisioner (internal) 2019/07/17 15:51:37 [DEBUG] Uploading dir 'scripts/remote/' to 'C:/scripts/'
aws_instance.default: Still creating... [5m10s elapsed]
aws_instance.default: Still creating... [5m20s elapsed]
aws_instance.default: Still creating... [5m30s elapsed]

(still waiting and waiting)

aws_instance.default: Still creating... [30m20s elapsed]
aws_instance.default: Still creating... [30m30s elapsed]

(etc)

@mc-jared

I also have this problem with Terraform 0.12.5 and AWS provider 2.19, including the timeout failing to be hit.
I let it run through a meeting and came back to find it still waiting after more than 90 minutes.

aws_instance.DIR[0]: Still creating... [1h30m20s elapsed]
aws_instance.DIR[2]: Still creating... [1h30m20s elapsed]
aws_instance.DIR[0]: Still creating... [1h30m30s elapsed]
aws_instance.DIR[2]: Still creating... [1h30m30s elapsed]
aws_instance.DIR[0]: Still creating... [1h30m40s elapsed]
aws_instance.DIR[2]: Still creating... [1h30m40s elapsed]
aws_instance.DIR[0]: Still creating... [1h30m50s elapsed]
aws_instance.DIR[2]: Still creating... [1h30m50s elapsed]
aws_instance.DIR[0]: Still creating... [1h31m0s elapsed]
aws_instance.DIR[2]: Still creating... [1h31m0s elapsed]
aws_instance.DIR[0]: Still creating... [1h31m10s elapsed]
(etc)

My behavior is that occasionally 1-2 of the connections will succeed, so long as you only have 1 resource defined in the .tf file. If more than 1 resource has the file provisioner, none of them will connect using WinRM.

@zlilA

zlilA commented Aug 12, 2019

Having the same issue on Terraform 0.12 using the Google provider.
Trying to create a file using the file provisioner, with the contents generated by templatefile.

@mauri

mauri commented Aug 25, 2019

Having the same issue, using aws provider

Terraform v0.12.7
+ provider.aws v2.17.0

Creating an aws_instance with

provisioner "file" {
    destination = "/root/foo"
    content     = <<-EOF
    NODE_ENV=production
    APPLICATION_URL=foo
EOF
}

hangs on Still creating... for 5 minutes and then fails with timeout. The instance is created successfully but the file is never created.

@kchoudhury

Facing the same issue with Windows 10 over WinRM, using the vSphere provider and the file provisioner: only some of the files from the folder get copied, and the creation goes into an endless loop.

If the folder I am copying contains only 1 file, it always works as expected.

@mildwonkey
Contributor

Hi folks! I'm sorry that y'all are experiencing this behavior.

Please do not post "+1" comments here, since it creates noise for others watching the issue and ultimately doesn't influence our prioritization because we can't actually report on these. Instead, react to the original issue comment with 👍, which we can and do report on during prioritization.

That being said, if you have an example of this issue that includes new information, please do continue to share!

@hashibot hashibot added the v0.12 Issues (primarily bugs) reported against v0.12 releases label Aug 28, 2019
@Bandyman

Bandyman commented Aug 30, 2019

I saw this issue as well when upgrading to Terraform v0.12.3, as soon as it became viable for us to update, but I assumed it was something wrong with our Windows template or vSphere instance, as changes had been made to these at the same time.

I have noticed this behaviour on Windows 10 versions 1511, 1803 and 1903.
This has been an issue for us since 0.12.3, after upgrading from 0.11.14, and was still an issue when upgrading from 0.12.3 to 0.12.6.
Updated to 0.12.7 a couple of days ago and still have this issue.

During this version timeframe we have been using vSphere provider 1.11.0 and 1.12.0.

The issue seems inconsistent, however. Say I roll out with count=10: perhaps the first 3 make it OK and the rest just hang.

We install a bunch of things into the template, but the only things we touch with PowerShell are PowerCLI and the AD module.

Hope this info helps.

edit:
Forgot to mention, we see this both when provisioning a file and when trying to write text to a .txt file. For example:

  provisioner "file" {
    content     = var.local_admin_password
    destination = "C:\\scripts\\file_local.txt"
    connection {
      host     = self.default_ip_address
      type     = "winrm"
      user     = "Administrator"
      password = var.local_admin_password
    }
  }

  provisioner "file" {
    source      = "${var.project_dir}/scripts/"
    destination = "C:\\scripts"
    connection {
      host     = self.default_ip_address
      type     = "winrm"
      user     = "Administrator"
      password = var.local_admin_password
    }
  }

@0xVox

0xVox commented Sep 6, 2019

This issue appears to occur with remote-exec provisioners as well. I replaced all my file provisioners with a combination of local-exec and remote-exec that transfers the files to my instance(s) via AWS S3; now the unzip stage (of very small files, <1 MB) hangs in this same way.

Experienced this on Windows Server 2019 DC Edition

@danielcbright

Experiencing the same issue on W2K16; going to downgrade to 0.11.8 for now to see if it solves it.

@mcascone

mcascone commented Sep 19, 2019

Having this same issue with the Chef provisioner. Please see the issue I logged for all my details: #22722

@zghafari

zghafari commented Oct 4, 2019

Seeing the same issue via remote-exec on 0.12.7.

@ghost

ghost commented Oct 17, 2019

I am getting the same issue with

$ terraform --version
Terraform v0.12.9
+ provider.azurerm v1.35.0
+ provider.null v2.1.2
+ provider.random v2.2.1

The target machine is Windows Server 2019 Datacenter with PowerShell 5.

Trace logs:

2019-10-17T21:51:37.644+0900 [DEBUG] plugin.terraform.exe: file-provisioner (internal) 2019/10/17 21:51:37 [DEBUG] connecting to remote shell using WinRM
2019-10-17T21:51:37.934+0900 [DEBUG] plugin.terraform.exe: file-provisioner (internal) 2019/10/17 21:51:37 [DEBUG] Uploading dir './setup_script.stage' to 'c:/terraform/.'
2019-10-17T21:51:37.971+0900 [DEBUG] plugin.terraform.exe: file-provisioner (internal) 2019/10/17 21:51:37 [DEBUG] Uploading dir './setup_script.stage' to 'c:/terraform/.'
2019/10/17 21:51:40 [TRACE] dag/walk: vertex "module.windowsservers.provider.azurerm (close)" is waiting for "module.windowsservers.azurerm_virtual_machine.vm-windows-with-datadisk-and-provision[1]"
2019/10/17 21:51:40 [TRACE] dag/walk: vertex "meta.count-boundary (EachMode fixup)" is waiting for "module.windowsservers.output.vm_ids"
2019/10/17 21:51:42 [TRACE] dag/walk: vertex "provisioner.file (close)" is waiting for "module.windowsservers.azurerm_virtual_machine.vm-windows-with-datadisk-and-provision[1]"
2019/10/17 21:51:42 [TRACE] dag/walk: vertex "provisioner.remote-exec (close)" is waiting for "module.windowsservers.azurerm_virtual_machine.vm-windows-with-datadisk-and-provision[0]"
2019/10/17 21:51:42 [TRACE] dag/walk: vertex "root" is waiting for "provisioner.remote-exec (close)"
2019/10/17 21:51:42 [TRACE] dag/walk: vertex "provisioner.local-exec (close)" is waiting for "module.windowsservers.azurerm_virtual_machine.vm-windows-with-datadisk-and-provision[1]"
2019/10/17 21:51:42 [TRACE] dag/walk: vertex "module.windowsservers.output.vm_ids" is waiting for "module.windowsservers.azurerm_virtual_machine.vm-windows-with-datadisk-and-provision[1]"
2019/10/17 21:51:45 [TRACE] dag/walk: vertex "module.windowsservers.provider.azurerm (close)" is waiting for "module.windowsservers.azurerm_virtual_machine.vm-windows-with-datadisk-and-provision[1]"
2019/10/17 21:51:45 [TRACE] dag/walk: vertex "meta.count-boundary (EachMode fixup)" is waiting for "module.windowsservers.output.vm_ids"
2019/10/17 21:51:47 [TRACE] dag/walk: vertex "provisioner.file (close)" is waiting for "module.windowsservers.azurerm_virtual_machine.vm-windows-with-datadisk-and-provision[1]"
2019/10/17 21:51:47 [TRACE] dag/walk: vertex "provisioner.remote-exec (close)" is waiting for "module.windowsservers.azurerm_virtual_machine.vm-windows-with-datadisk-and-provision[0]"
2019/10/17 21:51:47 [TRACE] dag/walk: vertex "root" is waiting for "provisioner.remote-exec (close)" 
(repeats indefinitely)

@desdic

desdic commented Nov 5, 2019

I have the same issue
Terraform v0.12.12

  • provider.aws v2.34.0

@aaronsteers

aaronsteers commented Nov 5, 2019

I've posted a workaround to Stack Overflow which encodes extra files as base64 strings and then decodes them inline on the instance under the desired file name.

Posting here in case it is helpful for others as well.

E.g. in userdata text:

echo ${base64encode(file("${path.module}/config.json"))} | base64 --decode > config.json

Ref: https://stackoverflow.com/questions/58631004/deploy-local-files-to-instances-without-using-terraform-file-provisioners
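
Spelled out as a full sketch (the /opt/app path and config.json file name are placeholders; any file works as long as the encoded result stays within the user data size limit):

resource "aws_instance" "example" {
  # ami, instance_type, etc. omitted

  user_data = <<-EOF
    #!/bin/bash
    # Re-create the local file on the instance at boot, bypassing the file provisioner
    echo ${base64encode(file("${path.module}/config.json"))} | base64 --decode > /opt/app/config.json
  EOF
}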

@mawinter69

We see the same issue on OpenStack when creating Windows Server 2016 instances.
$ terraform --version
Terraform v0.12.10

  • provider.external: version = "~> 1.2"
  • provider.null: version = "~> 2.1"
  • provider.openstack: version = "~> 1.23"

What we usually see is that it works when we create just one instance, but when creating multiple instances in parallel it usually hangs on all of them. Occasionally it works for the first and hangs on the rest.
We see that the files are copied to the target machine.

@jpatigny

jpatigny commented Nov 7, 2019

Same issue with the vSphere provider:

Version:

$ terraform --version
Terraform v0.12.13

  • provider.vsphere v1.13.0

Apply output:

module.srv6.vsphere_virtual_machine.Windows[0]: Creating...
module.srv9.vsphere_virtual_machine.Windows[0]: Creating...
module.srv8.vsphere_virtual_machine.Windows[0]: Creating...
module.srv5.vsphere_virtual_machine.Windows[0]: Creating...
(all four VMs print "Still creating..." every 10s)
...
module.srv5.vsphere_virtual_machine.Windows[0]: Provisioning with 'file'... [~3m20s elapsed]
module.srv5.vsphere_virtual_machine.Windows[0]: Provisioning with 'remote-exec'...
module.srv5.vsphere_virtual_machine.Windows[0] (remote-exec): Connecting to remote host via WinRM...
module.srv5.vsphere_virtual_machine.Windows[0] (remote-exec):   Host: 10.7.19.146
module.srv5.vsphere_virtual_machine.Windows[0] (remote-exec):   Port: 5986
module.srv5.vsphere_virtual_machine.Windows[0] (remote-exec):   User: Administrator
module.srv5.vsphere_virtual_machine.Windows[0] (remote-exec):   Password: true
module.srv5.vsphere_virtual_machine.Windows[0] (remote-exec):   HTTPS: true
module.srv5.vsphere_virtual_machine.Windows[0] (remote-exec):   Insecure: true
module.srv5.vsphere_virtual_machine.Windows[0] (remote-exec):   NTLM: false
module.srv5.vsphere_virtual_machine.Windows[0] (remote-exec):   CACert: false
module.srv5.vsphere_virtual_machine.Windows[0] (remote-exec): Connected!
module.srv5.vsphere_virtual_machine.Windows[0] (remote-exec): C:\Users\Administrator>powershell.exe -ExecutionPolicy bypass -File C:\InstallBinaries\Add-RoleToReg.ps1 -role "role1" -envir "envir1"
module.srv5.vsphere_virtual_machine.Windows[0] (remote-exec): C:\Users\Administrator>powershell.exe -ExecutionPolicy bypass -File C:\InstallBinaries\Install-SCCMClient.ps1
module.srv5.vsphere_virtual_machine.Windows[0]: Creation complete after 3m51s [id=4236dbd1-2970-826e-0a7f-c11ff102d10a]
module.srv6.vsphere_virtual_machine.Windows[0]: Provisioning with 'file'... [~4m10s elapsed]
module.srv9.vsphere_virtual_machine.Windows[0]: Provisioning with 'file'... [~4m20s elapsed]
module.srv8.vsphere_virtual_machine.Windows[0]: Provisioning with 'file'... [~4m20s elapsed]
module.srv6.vsphere_virtual_machine.Windows[0]: Provisioning with 'remote-exec'... (connects over WinRM and runs the same two scripts)
module.srv6.vsphere_virtual_machine.Windows[0]: Creation complete after 5m0s [id=42360ef8-a3c0-1f8d-a056-a8d5ff9699cc]
module.srv9.vsphere_virtual_machine.Windows[0]: Still creating... [5m10s elapsed]
module.srv8.vsphere_virtual_machine.Windows[0]: Still creating... [5m10s elapsed]
...
module.srv8.vsphere_virtual_machine.Windows[0]: Still creating... [8m50s elapsed]
module.srv9.vsphere_virtual_machine.Windows[0]: Still creating... [9m0s elapsed]
(srv8 and srv9 keep printing "Still creating..." and never complete)

Provisioner code:

  connection {
    host     = "${self.default_ip_address}"
    type     = "winrm"
    port     = 5986
    https    = true
    timeout  = "4m"
    user     = "Administrator"
    password = "${var.local_adminpass}"
    insecure = true
  }

  provisioner "file" {
    source      = "${path.module}/scripts/remote"
    destination = "C:/InstallBinaries"
  }

  provisioner "remote-exec" {
    inline = [
      "powershell.exe -ExecutionPolicy bypass -File C:\\InstallBinaries\\Add-RoleToReg.ps1 -role \"${var.role}\" -envir \"${var.environment}\"",
      "powershell.exe -ExecutionPolicy bypass -File C:\\InstallBinaries\\Install-SCCMClient.ps1",
    ]
  }

Scenarios tested:

  • Moved the provisioner part (file and remote-exec) to a null_resource executed after the
    vsphere_virtual_machine resource.
  • Changed the parallelism of the apply command to 2
    --> the first batch of servers works, the second fails

Workaround:

None... I'm deploying VMs two by two by adding them to my plan.

@mcascone

mcascone commented Nov 7, 2019

@jpatigny this may not be an option for you, and it’s a hassle, but I believe we’ve all found that parallel provisioning works if you downgrade to v0.11. The syntax reversion is the most time-consuming part, and you’ll have to determine if you’re using 0.12 features that aren’t supported in v0.11.

@pixelicous

pixelicous commented Nov 13, 2019

Same issue with the latest azurerm provider and Terraform 0.12.9: file and remote-exec provisioners have stopped working altogether. I am trying to use WinRM with a certificate on port 5986.

Is this prioritized? I really can't understand how such a major problem has been open since July. It seems like a long-overdue bug that many people are experiencing across providers. We were about to move to 0.12; I am so glad that we didn't. We only converted around 5 of our 30 modules, and we are not going to convert any others.

From output:

azurerm_virtual_machine.rls_agent[0]: Creating...
azurerm_virtual_machine.rls_agent[0]: Provisioning with 'file'...
azurerm_virtual_machine.rls_agent[0]: Still creating... [10s elapsed]
azurerm_virtual_machine.rls_agent[0]: Still creating... [20s elapsed]
azurerm_virtual_machine.rls_agent[0]: Still creating... [30s elapsed]
azurerm_virtual_machine.rls_agent[0]: Still creating... [40s elapsed]
azurerm_virtual_machine.rls_agent[0]: Still creating... [50s elapsed]
azurerm_virtual_machine.rls_agent[0]: Still creating... [1m0s elapsed]
azurerm_virtual_machine.rls_agent[0]: Still creating... [1m10s elapsed]
azurerm_virtual_machine.rls_agent[0]: Still creating... [1m20s elapsed]
azurerm_virtual_machine.rls_agent[0]: Still creating... [1m30s elapsed]

From Log:

2019-11-13T17:09:27.558+0200 [DEBUG] plugin.terraform.exe: file-provisioner (internal) 2019/11/13 17:09:27 [DEBUG] connecting to remote shell using WinRM
2019/11/13 17:09:28 [TRACE] dag/walk: vertex "provisioner.remote-exec (close)" is waiting for "azurerm_virtual_machine.myname[0]"
2019/11/13 17:09:28 [TRACE] dag/walk: vertex "provisioner.file (close)" is waiting for "azurerm_virtual_machine.myname[0]"
2019/11/13 17:09:28 [TRACE] dag/walk: vertex "root" is waiting for "meta.count-boundary (EachMode fixup)"
2019-11-13T17:09:29.274+0200 [DEBUG] plugin.terraform.exe: file-provisioner (internal) 2019/11/13 17:09:29 [DEBUG] Uploading dir './files/' to 'C:/bootstrap'
2019/11/13 17:09:32 [TRACE] dag/walk: vertex "provider.azurerm (close)" is waiting for "azurerm_virtual_machine.myname[0]"
2019/11/13 17:09:32 [TRACE] dag/walk: vertex "meta.count-boundary (EachMode fixup)" is waiting for "azurerm_virtual_machine.myname[0]"
2019/11/13 17:09:33 [TRACE] dag/walk: vertex "provisioner.remote-exec (close)" is waiting for "azurerm_virtual_machine.myname[0]"
2019/11/13 17:09:33 [TRACE] dag/walk: vertex "provisioner.file (close)" is waiting for "azurerm_virtual_machine.myname[0]"
2019/11/13 17:09:33 [TRACE] dag/walk: vertex "root" is waiting for "meta.count-boundary (EachMode fixup)"
2019/11/13 17:09:37 [TRACE] dag/walk: vertex "provider.azurerm (close)" is waiting for "azurerm_virtual_machine.myname[0]"
2019/11/13 17:09:37 [TRACE] dag/walk: vertex "meta.count-boundary (EachMode fixup)" is waiting for "azurerm_virtual_machine.myname[0]"
2019/11/13 17:09:38 [TRACE] dag/walk: vertex "provisioner.file (close)" is waiting for "azurerm_virtual_machine.myname[0]"
2019/11/13 17:09:38 [TRACE] dag/walk: vertex "provisioner.remote-exec (close)" is waiting for "azurerm_virtual_machine.myname[0]"
2019/11/13 17:09:38 [TRACE] dag/walk: vertex "root" is waiting for "meta.count-boundary (EachMode fixup)"

@mcascone

I certainly don't mean to cast aspersions, but I get the sense that this is being given low priority in part due to HashiCorp/Terraform's negative position on provisioners in general: https://www.terraform.io/docs/provisioners/index.html#provisioners-are-a-last-resort

@pixelicous

pixelicous commented Nov 14, 2019

@mcascone interesting read, wow. It really doesn't seem like this is going to be fixed, so once again we need to go ahead and start modifying lots of Terraform files. I have no idea how I am going to cope with some of the issues, such as running a remote-exec provisioner only during destroy.

Thank you for that link; I didn't know about it. Following the history of the docs, I see this was only inserted recently, on September 9th. So I wonder how we are supposed to copy files and execute scripts now; I really don't get why this product-supported feature was removed, leaving us to rely solely on the OS's own options. So many other tools implement this kind of technology. I would accept such a requirement in the problematic Windows world, but not on Linux. I also don't get the page's advice to switch to an imaging process; that just isn't logical, since so many artifacts and items cannot be baked into images because they need to be generated dynamically after the server is up. I can't believe we had to change so much to get to 0.12, and now within 0.12 so much more just to get things working; I thought 0.12 would make these issues disappear. This really makes me think about alternatives to Terraform that can handle this kind of authentication.

anyhow, thanks @mcascone

@mcascone

@pixelicous, parallel provisioning works in v0.11, so if you aren't using v0.12-exclusive features, you can roll back the syntax. It's a hassle, for sure.
I'm using the Chef provisioner, and if they really are going this route, I'll be forced to look into alternatives as well. They are pushing people toward immutable containers/images, whether via their own Packer product or otherwise. I can't say that's wrong (it's their business, they can do what they want) but it is frustrating to impose that limitation when it worked in an older version.

@mrtristan

mrtristan commented Nov 21, 2019

Rolling back isn't an option for me, as I'm using a good deal of semi-complex 0.12-only features. I had to kill the process yesterday, which yielded a corrupt state file with a handle on dozens of resources, so this is a pretty painful bug.

@algh

algh commented Dec 3, 2019

Managed to get past this by changing my file destination path to use double backslashes ("\\") instead of forward slashes ("/"). I recognize others on this thread already used backslashes, so it may not be the real solution; it's something to try if you're stuck for ideas.
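
For illustration, the change amounts to something like this (hypothetical paths):

provisioner "file" {
  source      = "files/app.conf"
  # destination = "C:/Terraform/app.conf"   # forward slashes: hung for me
  destination = "C:\\Terraform\\app.conf"   # double backslashes: worked
}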

@philthynz

@dhekimian No I haven't tested it with an older version, nor have I been able to replicate the issue here.

@Joohansson

Joohansson commented Apr 14, 2020

I have been struggling for a whole day getting WinRM to work with AWS and Windows Server 2019. I couldn't get the file provisioner to work. However, the latest implementation by philthynz using HTTPS worked for me. I got the file over, but now there is something strange with access rights over WinRM! That may be outside the scope of this issue, but maybe someone knows.

The first line in mount_efs.ps1 gives this:
CMDKEY: Credentials cannot be saved from this logon session.

The second line gives this:
System error 58 has occurred. The specified server cannot perform the requested operation.

Running the same commands in the user_data PowerShell together with the WinRM setup (without using remote-exec) works though, using the same admin account. I found that a bit strange! Why does the WinRM session not allow me to create credentials?

Terraform script:

# Prepare the instance with WinRM support and set admin password
user_data = <<EOF
  <powershell>
  	$Cert = New-SelfSignedCertificate -CertStoreLocation Cert:\LocalMachine\My -DnsName $env:COMPUTERNAME -Verbose
  	$CertThumbprint = $Cert.Thumbprint
  	Set-NetConnectionProfile -NetworkCategory Private
  	Enable-PSRemoting -Force -Verbose
  	Set-Item WSMan:\localhost\Client\TrustedHosts * -Force -Verbose
  	New-Item -Path WSMan:\localhost\Listener -Transport HTTPS -Address * -CertificateThumbPrint $CertThumbprint -Force -Verbose
  	Restart-Service WinRM -Verbose
  	New-NetFirewallRule -DisplayName "Windows Remote Management (HTTPS-In)" -Name "WinRMHTTPSIn" -Profile Any -LocalPort 5986 -Protocol TCP -Verbose
  	winrm delete winrm/config/Listener?Address=*+Transport=HTTP
  	$users = Get-LocalGroupMember -Group "Administrators"
  	Add-LocalGroupMember -Group "Remote Management Users" -Member $users.Name
  	# Set Administrator password
  	$admin = [adsi]("WinNT://./administrator, user")
  	$admin.psbase.invoke("SetPassword", "${var.admin_password}")
  </powershell>
  EOF
  
connection {
  host     = self.public_ip
  type     = "winrm"
  port     = 5986
  https    = true
  timeout  = "4m"
  user     = "Administrator"
  password = var.admin_password
  insecure = true
  use_ntlm = true
}

# Set up EFS to mount on boot using the Linux IP
provisioner "file" {
  source      = "scripts/win/mount_efs.ps1"
  destination = "C:/mount_efs.ps1"
}

provisioner "remote-exec" {
  inline = [
    "powershell.exe write-host \"Mounting EFS on Z:\"",
    "powershell \"c:\\mount_efs.ps1 ${aws_instance.linux_main.private_ip}\""
  ]
}

mount_efs.ps1:

$ip=$args[0]
cmdkey /add:$ip /user:$ip\ubuntu /pass:xxxx
net use Z: \\$ip\efs /savecred /p:yes /persistent:yes

The combination that worked for me for mounting an SMB drive from a Linux host (which in turn is an EFS drive, because AWS refuses to make EFS work directly on Windows) was to not use remote-exec for this at all, only for other scripts:

# Prepare the instance with WinRM support and set admin password
  user_data = <<EOF
	<powershell>
		$Cert = New-SelfSignedCertificate -CertStoreLocation Cert:\LocalMachine\My -DnsName $env:COMPUTERNAME -Verbose
		$CertThumbprint = $Cert.Thumbprint
		Set-NetConnectionProfile -NetworkCategory Private
		Enable-PSRemoting -Force -Verbose
		Set-Item WSMan:\localhost\Client\TrustedHosts * -Force -Verbose
		New-Item -Path WSMan:\localhost\Listener -Transport HTTPS -Address * -CertificateThumbPrint $CertThumbprint -Force -Verbose
		Restart-Service WinRM -Verbose
		New-NetFirewallRule -DisplayName "Windows Remote Management (HTTPS-In)" -Name "WinRMHTTPSIn" -Profile Any -LocalPort 5986 -Protocol TCP -Verbose
		winrm delete winrm/config/Listener?Address=*+Transport=HTTP
		$users = Get-LocalGroupMember -Group "Administrators"
		Add-LocalGroupMember -Group "Remote Management Users" -Member $users.Name
		
		# Set Administrator password
		$admin = [adsi]("WinNT://./administrator, user")
		$admin.psbase.invoke("SetPassword", "${var.admin_password}")
		
		# Mount EFS
		cmdkey /add:"${aws_instance.linux_main.private_ip}" /user:"${aws_instance.linux_main.private_ip}"\ubuntu /pass:xxxxxxxxx
		net use Z: \\"${aws_instance.linux_main.private_ip}"\efs /savecred /p:yes /persistent:yes
	</powershell>
	EOF

@danieldreier
Contributor

I wanted to give an update on this, because I know that the lack of a response has created the impression that this is deprecated. WinRM is supported with the file provisioner, and we intend to fix this.

We talked about this issue as a team today and decided to work on it. I don't have an ETA for fixing it yet: the next step will be for an engineer to dig into it, and assess how involved it is.

I wanted to give an update so that people who are trying to make decisions based on whether this is supported or not know that it is indeed supported.

@apparentlymart
Contributor

Hi all,

We're still not totally sure what's going on here, but I did some investigation today and wanted to share what I learned, both as a starting point for someone possibly picking up this bug to work on later and in case anything I've learned sparks theories for anyone reading this who is more familiar with WinRM than I am. (I am essentially totally unfamiliar, so I'm not hard to beat!)

I understand from the discussion above that uploading files over WinRM was working in 0.11.14 but stopped working somewhere before 0.12.3. Given how early 0.12.3 was within the 0.12 series I'm going to assume for the moment that this regression occurred in 0.12.0, as part of the broader internal refactoring that came with that release.

Terraform's WinRM communicator is mainly just a thin wrapper around a third-party library github.com/masterzen/winrm, and so my initial instinct was to see if our version of that library had changed between 0.11.14 and 0.12.0. Sadly, it appears not: the last upgrade happened a few months before 0.11.14 was released and so both 0.11.14 and 0.12.0 seem to be including the same upstream version.

For file copying in particular (which is what the file provisioner does), Terraform uses github.com/packer-community/winrmcp/winrmcp. Our most recent update of that library seems to be over a year before 0.11.14 was released, and it too seems to be a wrapper around github.com/masterzen/winrm.

Based on the above, my sense is that this change in behavior wasn't caused directly by a change to the WinRM components in Terraform, but rather a change to something in the broader system that has changed some assumptions that the WinRM support was relying on.


One possibly-relevant thing that changed in 0.12.0 was switching from the old plugin protocol to the new one based on gRPC. A key implication of that shift is in how it handles "instances" of plugins: the old protocol separated the idea of starting up the child process from the idea of creating an instance of the provisioner type inside it, which I believe meant that each separate provisioner block was handled by its own instance of the file provisioner.

The new protocol changed the model so that each plugin process contains exactly one "instance" of the plugin's main type, which directly answers incoming gRPC requests. Consequently, from Terraform 0.12.0 onwards I believe (though have so far only confirmed by reading code, not active testing) that there is only one instance of the file provisioner being shared across all calls.

With that said, at first look I've been unable to find a reason why that change in model should have a negative effect. Unlike provider plugins, provisioner plugins are intended to retain no state between calls, and indeed the file provisioner in particular creates a new instance of the WinRM communicator for each call:

func applyFn(ctx context.Context) error {
    connState := ctx.Value(schema.ProvRawStateKey).(*terraform.InstanceState)
    data := ctx.Value(schema.ProvConfigDataKey).(*schema.ResourceData)

    // Get a new communicator
    comm, err := communicator.New(connState)
    if err != nil {
        return err
    }

The file provisioner code is also the same whether it's using ssh or winrm to communicate; all that varies is the result of that communicator.New call and thus, further down, exactly which code runs when the provisioner calls Connect and Upload or UploadDir on that object.

So with that said, I've not been able yet to find a specific link between the plugin model change and the behavior change described in this bug. Further detailed debugging of both versions can hopefully confirm that it is indeed creating an entirely new communicator instance per provisioner call.


Another thing that changed between Terraform 0.11.14 and 0.12.0 is that we began building against a different version of Go. Unfortunately we were not yet tracking specific Go versions in the repository during the 0.11.14 line, but I believe that Terraform 0.11.14 was built with one of the Go 1.11 releases, while Terraform 0.12.0 was built with Go 1.12.4.

I don't have any immediate ideas about any specific things that changed between Go 1.11 and Go 1.12, but I just wanted to note that in case it spurs any thoughts from others who might know of some changes to how Go standard library functions behave on Windows between those two releases.


Finally, I considered that Packer is another similar program which has a "file" provisioner that can work over WinRM. I had a look in the Packer repository to see if they'd encountered any similar issues but so far I wasn't able to find anything similar to what's reported here in the issues for their file provisioner.

Packer has quite a few issues relating to WinRM though, and timeouts are a common symptom of misconfigured provisioners, so I can't be sure I saw everything.

If any readers of this encountered any similar problems with Packer at around the same time as Terraform 0.12.0 was released (I see several mentions of Packer in the comments above, so I assume some of you use it), please let me know! It would be very useful to be able to find a correlated change in Packer to help narrow down what's going on here.


This is all I was able to get from some initial research here. If anything above causes any ideas for anyone else reading, please let me know. Otherwise, someone on the Terraform team will dig into this deeper in the future and see if we can figure out what's changed here.

@mildwonkey
Contributor

Hi everyone!

I'd like to take the next steps digging into this issue, and it looks like I'm going to need some help. I would like to get a working Terraform 0.11 config that fails after upgrading to the latest version of Terraform. It took me all day to get the configuration right for a Windows instance provisioned with the file provisioner in Terraform v0.11, and it works just fine in 0.13. I did spend 6 hours today looking at timeouts, but they were all caused by configuration issues.

Here's a gist with the configuration I used; if anyone here has a reproduction case that works in 0.11 and not 0.12/0.13 I would really appreciate it if you could share. I would prefer AWS, but I believe I can get access to azure as well for testing.

Thanks!

@mildwonkey mildwonkey self-assigned this Jun 16, 2020
@dandunckelman

@mildwonkey This issue appears when we use a file provisioner to copy a folder of many files/folders to the remote host.

For example, we clone https://github.com/dsccommunity/SqlServerDsc to the local filesystem and then use a file provisioner to copy it to the remote node. The process takes a really long time and eventually just stops working (on specific Terraform versions).

This works on 0.11.14. When we updated to 0.12.23, that's when we first found this issue.

Yesterday, I did some testing and found the following:

  • Success on 0.11.14
  • Failure on 0.12.23
  • Failure on 0.12.24
  • Failure on 0.12.25
  • Success on 0.12.26

I couldn't identify anything obvious here: v0.12.25...v0.12.26, but it seems like something definitely changed from 0.12.25 to 0.12.26. We'll do more testing w/ 0.12.26 to see if we uncover other issues.

For reference, here's the provisioner content we use:

{
  "file": {
    "connection": {
      "host": "CHANGEME",
      "https": false,
      "insecure": true,
      "password": "CHANGEME",
      "port": 7770,
      "type": "winrm",
      "user": "Administrator"
    },
    "destination": "c:\\tmp\\SqlServerDsc",
    "source": "/tmp/SqlServerDsc/"
  }
}

We're trying to use 0.13.0-beta2, but we've had issues with our providers loading properly with terraform init. We'll revisit that again soon.

@mildwonkey
Contributor

Thank you @dandunckelman , that's helpful extra info! I'll use that same repository so I have a more realistic set of files to transfer.

Can I ask what backend you are using? There may be some changes to the backends worth looking into.

@dandunckelman

@mildwonkey
Contributor

That's good to know too, but what backend are you using for your state storage? (local, remote, etc?)

@dandunckelman

@mildwonkey right, that was the provider.

Backend is local.

@darrens280

darrens280 commented Jun 24, 2020

Thanks @dandunckelman

I've also been suffering from this issue. I've just done some tests with Terraform 0.12.26 and can confirm it now works to deploy multiple Windows VMs, each with multiple file provisioners, into Azure. Tests with the same code using Terraform 0.12.24 fail.

Code extract below for anyone else

Thank you

connection {
    host     = azurerm_network_interface.this[count.index].private_ip_address
    user     = var.admin_user
    password = var.admin_pw
    type     = "winrm"
    port     = "5985"
    timeout  = "300s"
    insecure = "true"
}

provisioner "file" {
    source      = "${path.module}/files/"
    destination = "C:/AzureData"
}

@mildwonkey
Contributor

I'm still working on this issue, but I have merged a PR that slightly helps matters by resulting in a timeout instead of an endless hanging run. This will be included in the next 0.13 release.

The underlying winrmcp library uses filepath.Walk to grab all files recursively, and in the example repository (thanks again for that suggestion!) if I don't remove the .git directory, a lot of time gets used up moving .git files that I suspect you don't need (look at the end of the last line):

file-provisioner (internal) 2020/06/25 08:41:36 Copying file to $env:TEMP\winrmcp-113d897d-b781-4c27-7c1d-423d2a2be590.tmp
file-provisioner (internal) 2020/06/25 08:41:36 Moving file from $env:TEMP\winrmcp-113d897d-b781-4c27-7c1d-423d2a2be590.tmp to C:\Terraform\TestFolder2\.git\logs\refs\remotes\origin\HEAD

While I absolutely still want to figure out why this was working for you and then stopped working, it's important to know that the file provisioner is not very efficient at uploading larger directories. One possible workaround could be to upload an archive and use remote-exec to extract the archive.
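
For example, something along these lines (a sketch, assuming a payload.zip built locally beforehand; Expand-Archive ships with PowerShell 5, which matches the environments discussed in this issue):

provisioner "file" {
  source      = "${path.module}/payload.zip"
  destination = "C:/Terraform/payload.zip"
}

provisioner "remote-exec" {
  inline = [
    "powershell.exe -ExecutionPolicy Bypass -Command \"Expand-Archive -Path C:/Terraform/payload.zip -DestinationPath C:/Terraform/payload -Force\"",
  ]
}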

@alastairtree

alastairtree commented Jul 24, 2020

Just hit this bug on up-to-date Windows 10 with PowerShell installed and Terraform 0.12.28. Even some trivial Terraform that writes a file locally just hangs forever:

resource "null_resource" "writer" {
    provisioner "file" {
      source      = "some data in a file"
      destination = "output.txt"
    }
}

I managed to work around it using a very hacky local-exec and some base64 round-tripping (borrowed from SO). Ugly but effective while we wait for a proper fix.

resource "null_resource" "writer" {
    provisioner "local-exec" {
      command = "del output-file.* && echo ${base64encode(file("${path.module}/input-file.txt"))} > output-file.b64 && certutil -decode output-file.b64 output-file.xml && del output-file.b64"
    }
}

@danieldreier
Contributor

@alastairtree I have confirmed the hangs-forever issue with 0.12.26 and the code you showed. This specific example no longer works in 0.13 because, about 8 months ago, the ability to use the file provisioner locally without a remote host was removed, so I suspect the issue is still there but this is no longer a usable reproduction case.

@danieldreier
Contributor

I've tried to reproduce this using a reproduction case @mildwonkey put together and linked in her gist above. Using the current 0.14.0 alpha release, the Windows EC2 instance took 6m20s to provision and accept a WinRM connection, and then provisioning ran fairly quickly: not at all the long hang I've previously observed when trying to reproduce this. Based on this test result and @darrens280 reporting similar success, I think that this issue is resolved. I've re-published the reproduction case in https://github.com/danieldreier/terraform-issue-reproductions/tree/master/22006.

I want to be clear that this is not a catch-all for all Windows provisioning issues: the specific problem that was reported was windows file provisioning hanging entirely, or being unusable, even with tiny files. I was able to reproduce that in earlier 0.12.x versions, and I'm not able to reproduce that failure mode anymore in 0.14.0 alpha, so I would like to close this issue.

If you've been encountering these problems, please test again with a recent 0.13.x release or an 0.14.0 pre-release, and if you're able to reproduce it, please share a clear reproduction case or contribute to the one I linked above with a pull request. If we don't have a reproduction case by the time the second 0.14 beta ships, I'm going to consider this fixed and close the issue. Please feel free to reach out and ask for help if you're seeing the issue in practice but are struggling to make a clear reproduction case - if this is still a problem I want to help.

@jbardin
Member

jbardin commented Mar 23, 2021

Closing as we have not seen any more recent reproductions of the issue. If anyone encounters a similar case, please verify against the latest release (which is 0.15-beta2 at this time) and file a new issue.

Thanks!

@jbardin jbardin closed this as completed Mar 23, 2021
@surfd4wg

surfd4wg commented Mar 24, 2021

Well, I see you closed this. I'm still having the problem with 0.15-beta2. Now I can't destroy the environment; it gets hung up on the route table association.
It turns out that if I use the Terraform 0.13 executable, destroy works...

@surfd4wg

Still having the issue with 0.15-beta2:

#28331

@ghost

ghost commented Apr 23, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked as resolved and limited conversation to collaborators Apr 23, 2021