Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vagrant 1.8.4 and Win10 - shell provisioner error: "but stderr present" #7476

Closed
gtirloni opened this issue Jun 17, 2016 · 18 comments
Closed

Comments

@gtirloni
Copy link

Vagrant version

1.8.4

Host operating system

Fedora 22 x86_64

Guest operating system

Windows 10 64-bit

Vagrantfile

Vagrant.require_version ">= 1.8.0"

# By default this VM will use 2 processor cores and 2GB of RAM. The 'VM_CPUS' and
# "VM_RAM" environment variables can be used to change that behaviour.
cpus = ENV["VM_CPUS"] || 2
ram = ENV["VM_RAM"] || 2048

Vagrant.configure(2) do |config|

  config.vm.box = "inclusivedesign/windows10-eval"
  config.vm.guest = :windows

  config.vm.communicator = "winrm"
  config.winrm.username = "vagrant"
  config.winrm.password = "vagrant"
  config.vm.network :forwarded_port, guest: 3389, host: 3389, id: "rdp", auto_correct:true
  config.vm.network :forwarded_port, guest: 5985, host: 5985, id: "rdp", auto_correct:true

  config.vm.provider :virtualbox do |vm|
    vm.gui = true
    vm.customize ["modifyvm", :id, "--memory", ram]
    vm.customize ["modifyvm", :id, "--cpus", cpus]
    vm.customize ["modifyvm", :id, "--vram", "128"]
    vm.customize ["modifyvm", :id, "--accelerate3d", "off"]
    vm.customize ["modifyvm", :id, "--audio", "null", "--audiocontroller", "hda"]
    vm.customize ["modifyvm", :id, "--ioapic", "on"]
    vm.customize ["setextradata", "global", "GUI/SuppressMessages", "all"]
  end

  config.vm.provision "shell", inline: <<-SHELL
    choco upgrade firefox googlechrome -y
  SHELL
end

Debug output

https://gist.github.com/gtirloni/24791d9de1a1da4b782f0ff30e5eeaf1

Expected behavior

vagrant up run the shell provisioner without errors (upload script and executes)

Actual behavior

A box created on 5/28 was working fine with Vagrant 1.8.1 and 1.8.4 but it seems something may have changed with recent Windows 10 updates that it now fails with Vagrant 1.8.4 (not tested with 1.8.1).

When trying to run the shell provisioner, WinRM will upload the file but complain that the stderr stream contains data, even though the exit code was 0.

DEBUG winrmshell: Output: {:data=>[{:stderr=>"#< CLIXML\r\n"}, {:stdout=>"New connections will be remembered.\r\n"}, {:stdout=>"\r\nThere are no entries in the list.\r\n\r\n"}, {:stderr=>"<Objs Version=\"1.1.0.1\" xmlns=\"http://schemas.microsoft.com/powershell/2004/04\"><Obj S=\"progress\" RefId=\"0\"><TN RefId=\"0\"><T>System.Management.Automation.PSCustomObject</T><T>System.Object</T></TN><MS><I64 N=\"SourceId\">1</I64><PR N=\"Record\"><AV>Preparing m"}, {:stderr=>"odules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj><Obj S=\"progress\" RefId=\"1\"><TNRef RefId=\"0\" /><MS><I64 N=\"SourceId\">2</I64><PR N=\"Record\"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj></Objs>"}], :exitcode=>0}
 INFO winrm: Uploading: /tmp/vagrant-shell20160617-7308-1ft0jgs.ps1 to /tmp/vagrant-shell.ps1
DEBUG winrmshell: [WinRM] opening remote shell on http://127.0.0.1:5985/wsman
DEBUG winrmshell: [WinRM] remote shell F1D2DCEA-7A82-4354-937F-C0E1289785CF is open on http://127.0.0.1:5985/wsman
DEBUG winrmshell: creating hash for file /tmp/vagrant-shell.ps1
DEBUG winrmshell: Running check_files.ps1
DEBUG winrmshell: @{
DEBUG winrmshell:   "/tmp/vagrant-shell.ps1" = "3edc07c95704956392c55ca1e8150706"
DEBUG winrmshell: }
DEBUG winrmshell: [WinRM] closing remote shell F1D2DCEA-7A82-4354-937F-C0E1289785CF on http://127.0.0.1:5985/wsman
DEBUG winrmshell: [WinRM] remote shell F1D2DCEA-7A82-4354-937F-C0E1289785CF closed
ERROR warden: Error occurred: [WinRM::FS::Core::FileTransporter] Upload failed (exitcode: 0), but stderr present

 INFO warden: Beginning recovery process...
 INFO warden: Recovery complete.
ERROR warden: Error occurred: [WinRM::FS::Core::FileTransporter] Upload failed (exitcode: 0), but stderr present

Steps to reproduce

  1. Create a Win10 box using https://github.com/idi-ops/packer-windows (or the boxcutter-windows10 template)
  2. Use a Vagrantfile that has at least one shell provisioner
  3. vagrant up

Workaround

The error is actually triggered in winrm-fs (file_transporter.rb). If lines 396-398 are commented, vagrant works as expected and fully provisions the box as expected.

This is the output with the workaround in place:
https://gist.github.com/gtirloni/24791d9de1a1da4b782f0ff30e5eeaf1

The file that was edited was: /opt/vagrant/embedded/gems/gems/winrm-fs-0.3.2/lib/winrm-fs/core/file_transporter.rb

References

Submitting this against Vagrant even though it might be an issue in winrm-fs or the win10 because 1) Vagrant is not using the latest winrm-fs version which may not have this issue and 2) the win10 has not had any changes to its packer configuration besides re-running it to create a new box a month later since the old box had been created.

We're trying to understand what may have changed in Win10 that is causing this to fail though. It seems stderr shouldn't have some data that it now has.

@gtirloni
Copy link
Author

Since this issue does not happen with a boxcutter/windows10 box, I'm assuming it's an issue with the box we are using.

If the shell provisioners are removed from the Vagrantfile, vagrant up, shell provisioners added back and then finally vagrant provision, this is the error (which isn't showing during a vagrant up):

ERROR warden: Error occurred: [WinRM::FS::Core::FileTransporter] Upload failed (exitcode: 0), but stderr present
Remove-Item : Cannot remove item 
C:\Users\vagrant\AppData\Local\Temp\hash-9dab26b3-1013-4d37-96e3-457b98ddd0c8.txt.ps1: The process cannot access the 
file 'C:\Users\vagrant\AppData\Local\Temp\hash-9dab26b3-1013-4d37-96e3-457b98ddd0c8.txt.ps1' because it is being used 
by another process.
At line:42 char:3
+   Remove-Item $in,$decoded -Force
+   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : WriteError: (C:\Users\vagran...8ddd0c8.txt.ps1:FileInfo) [Remove-Item], IOException
    + FullyQualifiedErrorId : RemoveFileSystemItemIOError,Microsoft.PowerShell.Commands.RemoveItemCommand

@StefanScherer
Copy link
Contributor

I have the same problem with Windows Server 2016 TP5, worked fine with Vagrant 1.8.1.
Thanks @gtirloni for the workaround, I'll try that.

BTW: The Ruby winrm-fs gem is already at 0.4.3

@StefanScherer
Copy link
Contributor

StefanScherer commented Jul 1, 2016

Editing the

sudo vi /opt/vagrant/embedded/gems/specifications/vagrant-1.8.4.gemspec

and changing all three places

      s.add_runtime_dependency(%q<winrm-fs>, ["~> 0.3.0"])
      s.add_dependency(%q<winrm-fs>, ["~> 0.3.0"])
    s.add_dependency(%q<winrm-fs>, ["~> 0.3.0"])

by removing the tilde with

      s.add_runtime_dependency(%q<winrm-fs>, ["> 0.3.0"])
      s.add_dependency(%q<winrm-fs>, ["> 0.3.0"])
    s.add_dependency(%q<winrm-fs>, ["> 0.3.0"])

and then

vagrant plugin install winrm-fs

It installs winrm-fs 0.4.3. After that a vagrant up works again.

@sethvargo
Copy link
Contributor

Hi there,

I am going to strongly discourage manually editing the contents in /opt/vagrant. This is not supported and will result in unexpected behavior.

@StefanScherer
Copy link
Contributor

I totally understand that is at my own risk, so I have to downgrade to 1.8.1 and will be very cautious making the next update.

@dragon788
Copy link
Contributor

@gtirloni I would guess the boxcutter Windows 10 box you are using might not have the same Windows Updates applied that newer custom boxes have that could be causing this issue. I am running into this same failure after installing Boxstarter/Chocolatey/PowerShell 5. I'm not sure if it's simply PowerShell 5, since I can use Boxstarter/Chocolatey without to install other packages and it is fine, or if it is the combination of all of them.

@dragon788
Copy link
Contributor

Weird that this seems to be a regression since #6060 was fixed.

@gtirloni
Copy link
Author

gtirloni commented Aug 3, 2016

The issue is present on a box built using the Win10 Anniversary Update.

Tested with:

Windows 10 (AU) host
Windows 10 (AU) vagrant box
Vagrant 1.8.5
Vmware Workstation 12.1.1

@MarkSummers
Copy link

@gtirloni This is not resolved. Upgrading winrm-fs to 0.4.3 seemed to fix it but this didn't last for long. I have a Windows 10 box that fails to vagrant up every time with the Remove-Item error you mentioned.

@MarkSummers
Copy link

I am starting to suspect that this is somehow related to the size of the Powershell script being uploaded. I added a function to check_files.ps1.erb to find which process is locking the file that Remove-Item is trying to remove. This initially produced an error "The command line is too long (powershell script is too long)". I then removed some unnecessary code and whitespace from the function. Now the bug has gone way, even though the script is still longer than it originally was.

@dragon788
Copy link
Contributor

And if I understand correctly typically the win RM is just passing a base64
encoded Powershell command, if it's using cmd.exe to interpret it on the
other side that could be where it's too long or has too many arguments.

I've actually seen it where poorly coded interpreters can handle extremely
long command lines but they are limited in the number of parameters and
values as arguments that can be input to a single executable.

Can you post what the win RM max memory per threads and Max threads are
inside your VM?

On Sep 22, 2016 6:58 AM, "MarkSummers" notifications@github.com wrote:

I am starting to suspect that this is somehow related to the size of the
Powershell script being uploaded. I added a function to check_files.ps1.erb
to find which process is locking the file that Remove-Item is trying to
remove. This initially produced an error "The command line is too long
(powershell script is too long)". I then removed some unnecessary code and
whitespace from the function. Now the bug has gone way, even though the
script is still longer than it originally was.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#7476 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAdxXkJVYPJJlCLiB7oKKhFV2YQtHJK4ks5qsm12gaJpZM4I4qOl
.

@MarkSummers
Copy link

MarkSummers commented Sep 22, 2016

Max threads: 25
Max memory per thread: 1024M

If I make the slightest change to the code in check_files.ps1.erb then the problem goes away (even if I add code, which would surely aggravate the "command line too long" issue). So, I have to conclude that this is a race condition and I am affecting the timing by adding or removing code.

@MarkSummers
Copy link

MarkSummers commented Sep 23, 2016

I patched the relevant section of check_files.ps1.erb as follows...

  C:\handle64.exe Users\packer\AppData\Local\Temp >> C:\handle.log
  Remove-Item $in,$decoded -Force
  C:\handle64.exe Users\packer\AppData\Local\Temp >> C:\handle.log

The purpose of this was to try to discover the identity of the process with a handle on the file we are trying to remove (by using the Handle utility from Sysinternals).

We have a suspect...

Nthandle v4.1 - Handle viewer
Copyright (C) 1997-2016 Mark Russinovich
Sysinternals - www.sysinternals.com

MsMpEng.exe        pid: 1032   type: File           B38: C:\Users\packer\AppData\Local\Temp\hash-9818ae26-d9f0-4d67-83f1-be0c713241ac.txt.ps1

MsMpEng.exe        pid: 1032   type: File           B4C: C:\Users\packer\AppData\Local\Temp\hash-9818ae26-d9f0-4d67-83f1-be0c713241ac.txt.ps1

MsMpEng.exe        pid: 1032   type: File           B50: C:\Users\packer\AppData\Local\Temp\hash-9818ae26-d9f0-4d67-83f1-be0c713241ac.txt.ps1

This is a Windows Defender process. Evidently, it does not quite finish scanning in time before winrm-fs attempts to Remove-Item.

@MarkSummers
Copy link

MarkSummers commented Sep 23, 2016

I seem to have worked around the issue by running the following Powershell command to prevent scanning of files in the temp directory:

Add-MpPreference -ExclusionPath "$env:TEMP"

@seanamosw
Copy link

I ran into this immediately after upgrading the guest OS' Windows Management Framework to 5.0 (Powershell 5.0). The guest is a Windows Server 2012 R2 machine.

I can reliably reproduce this. Start with a Windows Server 2012 R2 box with PS 4.0. Provisioning files should work. Upgrade to PS 5.0, it will start breaking with the error:
Error occurred: [WinRM::FS::Core::FileTransporter] Upload failed (exitcode: 0), but stderr present

In the very short term, I can work around this by not installing PS 5.0 just yet into the guest OS.

@gtirloni gtirloni reopened this Oct 14, 2016
@dragon788
Copy link
Contributor

The cases where I've seen this are if something creates the Powershell
profile for the user, which was happening from chocolatey up until
recently. As soon as they disabled the creation the problem went away,
something to do with the loading of the profile triggering a leak to
standard out. This may be resolved if a new WinRM Gem cuts out or handles
the five Powershell data streams appropriately and prevents the leaks.

On Oct 14, 2016 1:47 PM, "Giovanni Tirloni" notifications@github.com
wrote:

Reopened #7476 #7476.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#7476 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAdxXg2j4k3iZq2DtjhuYXuTwW2lazS9ks5qz848gaJpZM4I4qOl
.

@chrisroberts
Copy link
Member

Hi there,

It looks like this has been resolved within a previously shipped version of Vagrant so I am now closing this issue. If the original issue was not fully resolved, please reopen this issue or create a new one.

Cheers!

@ghost
Copy link

ghost commented Apr 2, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 2, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants