Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Az cmdlets should allow any network operations to be timed out. #19256

Closed
heaths opened this issue Aug 18, 2021 · 6 comments
Closed

Az cmdlets should allow any network operations to be timed out. #19256

heaths opened this issue Aug 18, 2021 · 6 comments
Assignees
Labels
ARM az resource/group/lock/tag/deployment/policy/managementapp/account management-group Service Attention This issue is responsible by Azure service team.

Comments

@heaths
Copy link
Member

heaths commented Aug 18, 2021

Is your feature request related to a problem? Please describe.

The Azure SDK team has experienced since REST calls that never returned - even after an hour of waiting. No network app should really be without a way to cancel, and it so happens that the Azure SDKs - in all languages, but I'll focus on .NET here - have a way for apps to do that. This way, we're not blocked waiting for a network call to return even if the server is not responding correctly.

Describe the solution you'd like

All the track 2 SDKs take a CancellationToken as the last parameter. The track 1 SDKs should provide a similar mechanism. One thought it just to declare a -TimeoutSec parameter like Invoke-RestMethod, though there are other options here. New-PSSessionOption has a few parameters like -CancelTimeout that take milliseconds.

Whatever the decision, you can create a CancellationTokenSource to cancel our methods like so:

using var cts = new CancellationTokenSource(timeoutInMilliseconds);
Response<KeyVaultSecret> response = client.GetSecret(secretName, cancellationToken: cts.Token);

You could default to no timeout to maintain compatibility (it an integer, typically -1 means infinite).

Describe alternatives you've considered

Right now I'm starting a ThreadJob (cheaper than a BackgroundJob which creates a module and starts a new pwsh.exe process) as shown in Azure/azure-sdk-tools#1912; however, while I can stop waiting for a job, I can't kill it unless I use a BackgroundJob and kill the process, but starting a new process for each resource to purge would be very expensive time-wise. Ideally, the cmdlets - like all network clients - should have some sort of timeout mechanism.

@ghost ghost added the needs-triage This is a new issue that needs to be triaged to the appropriate team. label Aug 18, 2021
@heaths
Copy link
Member Author

heaths commented Aug 18, 2021

Or is this what is intended with the -AsJob. The original concern was that -AsJob typically returns a background process, but in testing these further it seems they are lightweight and can, in fact, be cancelled via the Cancel() method (unlike ThreadJob.Stop($true) currently to forcibly stop a job, since the force parameter is ignored).

@yonzhan yonzhan added the Network az network vnet/lb/nic/dns/etc... label Aug 18, 2021
@ghost ghost removed the needs-triage This is a new issue that needs to be triaged to the appropriate team. label Aug 18, 2021
@yonzhan yonzhan added ARM az resource/group/lock/tag/deployment/policy/managementapp/account management-group and removed Network az network vnet/lb/nic/dns/etc... labels Aug 18, 2021
@yonzhan
Copy link
Collaborator

yonzhan commented Aug 18, 2021

You could create an issue in powershell repo: https://github.com/Azure/azure-powershell

@yonzhan yonzhan added the Service Attention This issue is responsible by Azure service team. label Aug 18, 2021
@ghost
Copy link

ghost commented Aug 18, 2021

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @armleads-azure.

Issue Details

Is your feature request related to a problem? Please describe.

The Azure SDK team has experienced since REST calls that never returned - even after an hour of waiting. No network app should really be without a way to cancel, and it so happens that the Azure SDKs - in all languages, but I'll focus on .NET here - have a way for apps to do that. This way, we're not blocked waiting for a network call to return even if the server is not responding correctly.

Describe the solution you'd like

All the track 2 SDKs take a CancellationToken as the last parameter. The track 1 SDKs should provide a similar mechanism. One thought it just to declare a -TimeoutSec parameter like Invoke-RestMethod, though there are other options here. New-PSSessionOption has a few parameters like -CancelTimeout that take milliseconds.

Whatever the decision, you can create a CancellationTokenSource to cancel our methods like so:

using var cts = new CancellationTokenSource(timeoutInMilliseconds);
Response<KeyVaultSecret> response = client.GetSecret(secretName, cancellationToken: cts.Token);

You could default to no timeout to maintain compatibility (it an integer, typically -1 means infinite).

Describe alternatives you've considered

Right now I'm starting a ThreadJob (cheaper than a BackgroundJob which creates a module and starts a new pwsh.exe process) as shown in Azure/azure-sdk-tools#1912; however, while I can stop waiting for a job, I can't kill it unless I use a BackgroundJob and kill the process, but starting a new process for each resource to purge would be very expensive time-wise. Ideally, the cmdlets - like all network clients - should have some sort of timeout mechanism.

Author: heaths
Assignees: zhoxing-ms
Labels:

Service Attention, ARM

Milestone: -

@heaths
Copy link
Member Author

heaths commented Aug 18, 2021

So how is cancellation handled in the az CLI? Is there a default timeout? I don't see anything obvious in general or command-specific help. There's -no-wait for kicking off a background job. Does this imply, then, that Ctrl+C against the CLI or just killing background jobs is the solution for the CLI? (Which is fine; Ctrl+C didn't work for the Az cmdlets in this state because of how PowerShell processes cancellation requests since you're not actually killing a separate process most times.)

@jiasli
Copy link
Member

jiasli commented Aug 19, 2021

Azure PowerShell vs Azure CLI

Az cmdlets are from Azure PowerShell, not Azure CLI. They are 2 different tools with completely different implementation mechanisms:

  • Azure PowerShell consists of native PowerShell cmdlets
  • Azure CLI consists of Python scripts and is invoked as python.exe -m azure.cli from an entry script az.cmd (on Windows)

Network timeout vs endless polling

I don't think the timeout is regarding network, but the endless polling on async ARM operations.

If an HTTP request times out, Azure CLI will retry several times and fail. If an az command hangs, most likely it is because it goes into endless polling because ARM never returns "status": "Succeeded" in the polling response. So to be more accurate, the ask here should be a polling timeout.

Indeed, there is no parameter to specify the polling timeout in Azure CLI.

--no-wait

--no-wait doesn't kicks off a background job. It simply means Azure CLI won't poll on async ARM operations and will return the response of the initial PUT request.

Ctrl+C in Azure CLI

All Azure CLI commands (az ...) can be cancelled with Ctrl+C, since Azure CLI is just a python.exe process and Ctrl+C will trigger a KeyboardInterrupt and CLI will stop polling:

> az group delete -n cli_test_azure_firewall_rules_with_ipgroupse4upfibqoujmdn4odvtgrbuvqqgdvwd7 --yes
# Press Ctrl+C
Long-running operation wait cancelled.
Terminate batch job (Y/N)? y
>

The poller in on a daemon thread:

https://github.com/Azure/azure-sdk-for-python/blob/96da2bf412f3c70b11755a10132c4860f5287292/sdk/core/azure-core/azure/core/polling/_poller.py#L181

            self._thread = threading.Thread(
                target=with_current_context(self._start),
                name="LROPoller({})".format(uuid.uuid4()))
            self._thread.daemon = True
            self._thread.start()

so KeyboardInterrupt can be handled on the main thread:

except KeyboardInterrupt:
self.progress_bar.stop()
logger.error('Long-running operation wait cancelled. %s', correlation_message)
raise

You may also press Ctrl+Break to forcibly kill it (https://stackoverflow.com/questions/1364173/stopping-python-using-ctrlc):

> az group delete -n cli_test_azure_firewall_rules_with_ipgroupse4upfibqoujmdn4odvtgrbuvqqgdvwd7 --yes
# Press Ctrl+Break
^CTerminate batch job (Y/N)? y
>

Ctrl+C in Azure PowerShell

I tested with Azure PowerShell, and indeed Ctrl+C is not able to cancel it:

Remove-AzResourceGroup -name test_rg -Force
# Not responding to Ctrl+C

This has been reported at Azure/azure-powershell#14867, Azure/azure-powershell#15213.

@heaths
Copy link
Member Author

heaths commented Aug 19, 2021

Yes, I know they are different. I asked about the Az cmdlets internally and someone told me to open an issue here. Will close.

No, Ctrl+C does not work for PowerShell cmdlets because they are not separate processes nor does PowerShell really rely on Ctrl+C cancellation. It uses jobs, which Az cmdlets do support but not correctly. The internal discussion continues.

@heaths heaths closed this as completed Aug 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ARM az resource/group/lock/tag/deployment/policy/managementapp/account management-group Service Attention This issue is responsible by Azure service team.
Projects
None yet
Development

No branches or pull requests

4 participants