v2.0 - installations hang during "Setup" #119
Comments
Further debugging: this morning I spotted this error for Windows 7: "Windows cannot apply the DiskConfiguration in Autounattend.xml".
|
Well, I spent the entire day on this, and I probably should have come here first. I am seeing the same situation with vSphere 6.5 w/DRS and the release version of the packer-builder-vsphere-iso.exe. At first I thought it was because in the release version the disk size is in MB and not GB, so I went from an 80GB partition to an 80MB partition. But after I realized that wasn't what was going on, I spent pretty much all day today trying to figure out what I had done wrong. The part that threw me off is that the vSphere GUI is almost totally unresponsive when working with the VM that packer has created. Shutting down the VM usually fails or errors out a few times. Getting to the console will hang or not connect at all. The RAM usage generally tries to consume all of the available RAM for the VM as well. CPU doesn't spike. I/O doesn't spike. Nothing. What I found interesting though is that if I left the process running and did a "reset" on the VM through the VMRC, the VM booted normally, sped through the setup, and completed quickly. So, there's some interaction between the release version of the plug-in and vSphere during VM creation that wasn't there in prior versions. Initially I thought it was because I was running packer from a different server than I was before. And then I realized that on the new server (running Jenkins) I had downloaded a newer version of the plug-in. As soon as I swapped the plug-in for the pre-release version and fixed the disk size (as it was now trying to create an 80000GB drive) it worked as expected. I don't know the version number of the plugin that works for me, but it says it is from 4/12/18. If there is additional logging or anything else I can do to help you troubleshoot this, please let me know. It is very easy to reproduce. |
@embusalacchi looking like it. We'll need some insight on what might have changed between the 2.0beta4 release and the 2.0 release. I've been trying to go build by build from the public teamcity server located here but I don't really have the time to do so and keep these hung VMs in my inventory. I also don't know which build corresponds to the 2.0beta4 release so I can work up from there. |
Last commit for 2.0-beta4 was 15th of March to add Cluster Support. |
I did a build from 25th April after commit #82 and the issue isn't present then. Hope that is of some help @sudomateo ? |
@chris-david-taylor thank you sir. I'll check that build out. |
are you using the boot_cmd in packer?? my VMs lock up after this?? |
I'm not @kempy007. It's not the boot command that is the issue. @embusalacchi suggested these might be all related, possibly to floppy_media. First of all, does 2.0beta4 work for you, and what OS are you templating? I'm not an expert in Golang and currently working 12 hour days, otherwise I'd learn a bit and investigate, but as long as you aren't desperate for winrm, then building from #82 should be OK. Are you comfortable doing that? |
I don't mind trying other builds but I don't have the means to build them on my own. Is there a link somewhere? I don't mind trying them as I have time to nail down when it went bad. |
RHEL6, packer is now 1.2.4. I think it may be related, which is why I wanted to know if you have boot_cmd in your packer file. |
I am not using the boot_cmd in my packer file. |
@embusalacchi Builds can be found here: https://teamcity.jetbrains.com/viewType.html?buildTypeId=PackerVSphere_Build&branch_PackerVSphere=%3Cdefault%3E&tab=buildTypeStatusDiv Just log in as guest and download the build you want. Also I am using |
I'm not using the boot_cmd parameter @kempy007 Thanks @sudomateo :) |
found a line in esxi logs:
seems to be invalid parameter in the vmx file:
=> https://kb.vmware.com/s/article/2085907 maybe this is the issue, if i remove the lines above, the vm is not hanging :) |
Here's the generated .vmx from the not-working version of the plugin (vsphere-iso):
|
What's strange is if you do a RESET on the VM it will boot perfectly and go through the install. If you compare the .vmx when "broken" and the .vmx after the reset it is identical. So, it's not entirely clear where the issue is unless it boots with a bad value but vSphere writes out a good value that it uses when it boots after the reset? |
Hi @schmandforke, Can you possibly try and get the vmx file generated by 2.0beta4 and then do a “diff” and post it here please? I think there could be a workaround; if we know what the specific invalid parameters are, we might be able to set them in the vmx config of the packerfile. |
@chris-david-taylor here you go - this is from beta4 - |
@chris-david-taylor I was looking through the go code (I don't really know go) to see if I could figure out what's being set wrong. If you look at my previous comments the VM works after a reset (and not changing anything else). The .vmx from the slow version and the fast version after the reset seems identical so it's almost like it starts up with a bad param but vSphere fixes it? Not sure - I might just be missing something. |
Hi @embusalacchi, "vmx_data": { |
Yeah I think you're onto something here. When you look at the settings in vCenter, "CPU Limit" is set to "0MHz" instead of what it would normally be which is "Unlimited" |
Following that suspicion, I think I now have a viable workaround:
In your |
confirmed that
worked for me ! |
Yay! Let’s leave this open as it will hopefully help with debugging. |
Can confirm |
"CPU_limit": -1 worked for me too in ESXi 6.5. Thanks. |
Can someone update readme.md to add the above workaround as strongly recommended to avoid this issue? |
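For anyone landing here later, the workaround confirmed in the comments above can be sketched roughly like this. This is a minimal fragment, not a complete template: only the `vsphere-iso` builder type and the `"CPU_limit": -1` setting come from this thread, every other required builder option (connection details, ISO, disk and so on) is omitted, and the `_comment` key is just an annotation that assumes Packer's handling of underscore-prefixed root-level keys as comments.

```json
{
  "_comment": "Sketch only: add your usual connection, ISO and hardware options to the builder.",
  "builders": [
    {
      "type": "vsphere-iso",
      "CPU_limit": -1
    }
  ]
}
```

As reported above, without this setting vCenter showed "CPU Limit" as "0MHz" rather than "Unlimited", and commenters found that setting it to -1 stopped the hang.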
I'd argue instead that this needs to be fixed so that "Unlimited" is the default... or perhaps there's a way to consume an object from the API that actually reveals the cluster defaults? I'd really hope that this doesn't just become an obligatory setting. |
@xenithorb I agree. This needs to be addressed in the code, whether upstream or in this plugin. |
@sudomateo - I’ll file a bug upstream with VMware at some point today. :) |
Works for me. ESXi 6.0, 2.0 vsphere-iso plugin, CentOs 7 Minimal |
FWIW I'm seeing behaviour like this regularly, especially on Win-10 machines when I apply the cumulative updates. Resetting through VMRC helps, but the machine then hangs again. Researching it with our infrastructure folks to see if there are any issues with our vCenter. |
Confirming that adding
|
The vcenter builds hang either shortly after boot, or in windows-10 during the cumulative update. This appears to be the behaviour noted in jetbrains-infra/packer-builder-vsphere#119 so applying the recommended fix there. Also added svga parameters to help correct the console connect issues associated with the above problem.
The |
The fix belongs in VMware’s upstream libraries. I’ve submitted a bug which I should check up on, as I’m starting to write my own code that depends on the upstream. |
Got it, thanks @chris-david-taylor. Is there a link to the upstream bug? I'd like to follow if possible (maybe other people on this would as well.) |
just wanted to say thanks for this, |
Just another nudge to @chris-david-taylor in linking to the upstream bug, so that we could follow it to the extent possible. I couldn't find the issue in govmomi, but I could easily have been searching for the wrong thing. |
Sorry, I've been away @thor - Darn it, is this still a problem? I'll dig it out later today, and if I can't find it, I'll refile. |
@chris-david-taylor I can do a quick check with a build from the latest govmomi sources, if that's what you had in mind? :) |
If you could please @thor that would be great. If the issue persists I'll pass it up on to the govmomi maintainers. :) |
Thanks Guys. CPU Limit is the issue |
I'm sorry this took so much time. |
The v2.0 plugin seems to have a bug when installing Windows (I haven't tried other operating systems yet). My present lab runs on vSphere 6.5.
Steps to reproduce;
Confirmed on Windows 2012_r2, and Windows 7.
I'll try and get some logging out of our environment tomorrow, my permissions are too locked down for me to look. Part of me thinks this may be related to #112
I have also tried updating Packer to 1.2.3.