-
Notifications
You must be signed in to change notification settings - Fork 338
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
create multiple azure vms with async and some vms get error "Failed to determine PowerState of virtual machine" ? #1365
Comments
@dgiesselbach Sorry for replying you so late! From your error log, you can see that this error occurred because the VM's 'PowerState' was not correct when creating the end-of-VM sequence number VM information. I have repeatedly tested in the local, have not encountered your problem! In addition, PR #1354 has been submitted, after the creation of the VM, the VM state will be detected, so as to avoid some problems caused by the VM state is not fixed. Could you try again? Is there still this problem? Thank you! |
@Fred-sun Unfortunately it's still stuck in a retry loop. It looks like he doesn't make another API request against Azure in the retries to capture the correct PowerState. Can this be? All retry attempts say "Failed to determine PowerState of virtual machine". Here is an example output of the 206th retry attempt:
|
@dgiesselbach Can you share the process and mistakes you made to get this error? Thank you! |
Hi @Fred-sun Using azure.azcollection version 2.1.1 to create 250 virtual machines in Azure in a loop: ansible code:
Almost 30 Vms (always a coincidence) are successfully deployed but then several fail due to PowerState not determined My guess is that if he has not received the PowerState through the API request and then goes into the retry part, he will no longer make any further API requests to Azure and will therefore never receive the "correct" PowerState again. Hence my question, could you check in the code whether new API requests are being made against Azure? Here is an example output of the 206th retry attempt:
|
@dgiesselbach This problem is very beggar, and I cannot copy it for the time being, so I have tested 'powerstate' in PR#1412. You can update to the branch of 1412, will you still encounter this problem? Thank you! |
@Fred-sun with branch "modify_vm_test" everything works fine :) He goes straight to the correct PowerState without going into retry. Could the fix be integrated into 2.1.1 so that we could use the official version directly? Nice one! |
@dgiesselbach Thanks for your feedback! I will push this PR merge forward as soon as possible and publish it in the latest version. Thank you! |
SUMMARY
We create several azure vms at the same time (via loop and loop_control: pause 2) with async 1800 and poll 0 (fire and forget) and then check with ansible.builtin.async_status whether the vms are ready to be deployed.
If we deploy 250 vms we get the error message "msg" from at least 3 vms: "Failed to determine PowerState of virtual machine vm name" . Unfortunately, automatic retry doesn't work this way because the error message is not output under "stderr_lines" or "stderr".
Is this a bug? Could this message move to stderr_lines or stderr so that the async retry would kick in and capture the vm status in the next api request?
ISSUE TYPE
COMPONENT NAME
azure.azcollection.azure_rm_virtualmachine
ANSIBLE VERSION
COLLECTION VERSION
CONFIGURATION
OS / ENVIRONMENT
STEPS TO REPRODUCE
Large deployment with 250 vms via {{ vmCollection }}
EXPECTED RESULTS
msg message move to stderr_lines or stderr?
{
"started": 1,
"finished": 1,
"stdout": "",
"stderr": "Failed to determine PowerState of virtual machine vm-test-vm",
"stdout_lines": [],
"stderr_lines": [Failed to determine PowerState of virtual machine vm-test-vm],
....
ACTUAL RESULTS
{
"started": 1,
"finished": 1,
"stdout": "",
"stderr": "",
"stdout_lines": [],
"stderr_lines": [],
"ansible_job_id": "j380889400388.25999",
"results_file": "/root/.ansible_async/j380889400388.25999",
"msg": "Failed to determine PowerState of virtual machine vm-test-vm",
"invocation": {
"module_args": {
"resource_group": "rg...",
"name": "vm-test-vm",
"short_hostname": "test-vm",
"vm_size": "Standard_D4s_v5",
"admin_username": "admin",
"admin_password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
"availability_set": "avail....",
"network_interfaces": "nic-test-vm",
"managed_disk_type": "Premium_LRS",
"os_disk_name": "disk.....",
"os_disk_caching": "ReadWrite",
"os_type": "Windows",
"license_type": "Windows_Client",
"image": {
"name": "img....",
"resource_group": "rg....."
},
"tags": {
},
"started": true,
"network_interface_names": [
"nic-test-vm"
],
"auth_source": "auto",
"cloud_environment": "AzureCloud",
"api_profile": "latest",
"append_tags": true,
"state": "present",
"max_price": -1,
"ssh_password_enabled": true,
"storage_container_name": "vhds",
"public_ip_allocation_method": "Static",
"remove_on_absent": [
"all"
],
"created_nsg": true,
"allocated": true,
"restarted": false,
"force": false,
"generalized": false,
"accept_terms": false,
"profile": null,
"subscription_id": null,
"client_id": null,
"secret": null,
"tenant": null,
"ad_user": null,
"password": null,
"cert_validation_mode": null,
"adfs_authority_url": null,
"log_mode": null,
"log_path": null,
"x509_certificate_path": null,
"thumbprint": null,
"custom_data": null,
"location": null,
"priority": null,
"eviction_policy": null,
"ssh_public_keys": null,
"storage_account_name": null,
"storage_blob_name": null,
"os_disk_size_gb": null,
"os_disk_encryption_set": null,
"proximity_placement_group": null,
"open_ports": null,
"virtual_network_resource_group": null,
"virtual_network_name": null,
"subnet_name": null,
"data_disks": null,
"plan": null,
"zones": null,
"vm_identity": null,
"winrm": null,
"boot_diagnostics": null,
"ephemeral_os_disk": null,
"windows_config": null,
"linux_config": null,
"security_profile": null
}
},
"_ansible_no_log": null,
"attempts": 1,
"changed": false,
"item": {
"failed": 0,
"started": 1,
"finished": 0,
"ansible_job_id": "j380889400388.25999",
"results_file": "/root/.ansible_async/j380889400388.25999",
"changed": true,
"item": {
"AvailSetNumber": 1,
"Name": "test-vm"
},
"ansible_loop_var": "item"
},
"ansible_loop_var": "item",
"_ansible_item_label": {
"failed": 0,
"started": 1,
"finished": 0,
"ansible_job_id": "j380889400388.25999",
"results_file": "/root/.ansible_async/j380889400388.25999",
"changed": true,
"item": {
"AvailSetNumber": 1,
"Name": "test-vm"
},
"ansible_loop_var": "item"
}
}
The text was updated successfully, but these errors were encountered: