Az commands failing intermittently with StackOverflowException when CheckForUpgrade is enabled #26623
Comments
Thanks for the analysis. We do indeed start a background thread to check for updates, although it is still unclear why that would result in a stack overflow.
I am reposting a post from this issue: microsoft/azure-pipelines-tasks#20156
Unfortunately, I don't have the call stack from my failed run, so I am extrapolating here. If we look at the call stack from the linked Az PowerShell issue, the stack overflow is around assembly resolution. The assembly resolver that's causing the stack overflow is a PowerShell ScriptBlock:
[d:\os\src\onecore\admin\monad\src\engine\lang\Scriptblock.cs @ 774]
System_Management_Automation_ni!System.Management.Automation.ErrorCategoryInfo.Ellipsize+0x56 [d:\os\src\onecore\admin\monad\src\engine\ErrorPackage.cs @ 511]
internal ExecutionContext GetContextFromTLS()
Searching azure-pipelines-tasks for cases where an assembly resolver is registered with a PowerShell script, we find two instances:
VstsAzureHelpers_/Utility.ps1#L216
So, to truly fix all the cases where this can happen, VstsAzureHelpers_ would need to be updated to use a C# assembly resolver instead of a PowerShell script block in VstsAzureHelpers.
cc @starkmsu who introduced this change
Call stack for reference:
unknown!DynamicClass.lambda_method+0x111
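To illustrate the suggestion above, here is a minimal sketch of registering a compiled C# AssemblyResolve handler from PowerShell instead of a script block. It is not the actual VstsAzureHelpers_ code: the class name SketchAssemblyResolver and the probe directory are hypothetical, and the lookup logic is only an assumption about what such a resolver might do. The idea is that a compiled handler does not need a PowerShell runspace on whichever thread raises the event, which seems to be the concern in the frames quoted above.

```powershell
# Hypothetical sketch: a compiled AssemblyResolve handler, registered via Add-Type.
# The class name and the probe directory are illustrative assumptions.
$source = @'
using System;
using System.IO;
using System.Reflection;

public static class SketchAssemblyResolver
{
    private static string _probeDirectory;

    // Remember the directory to probe and hook the AppDomain event once.
    public static void Register(string probeDirectory)
    {
        _probeDirectory = probeDirectory;
        AppDomain.CurrentDomain.AssemblyResolve += Resolve;
    }

    // Runs as plain compiled code, so it can be invoked on any thread.
    private static Assembly Resolve(object sender, ResolveEventArgs args)
    {
        string fileName = new AssemblyName(args.Name).Name + ".dll";
        string candidate = Path.Combine(_probeDirectory, fileName);
        // Returning null lets the runtime continue with its default behavior.
        return File.Exists(candidate) ? Assembly.LoadFrom(candidate) : null;
    }
}
'@

Add-Type -TypeDefinition $source

# Hypothetical location of the assemblies that need resolving.
$probeDirectory = 'C:\hypothetical\assemblies'
[SketchAssemblyResolver]::Register($probeDirectory)
```

Whether this removes the overflow in every affected task is not confirmed in this thread; it only shows the shape of the change being proposed.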
@YanaXu can you take a look and see whether we have action items to take?
Hi @onetocny, I understand what you described and how to work around this issue. I'm sorry for the inconvenience.
pr: none
trigger: none
steps:
- task: AzurePowerShell@5
  displayName: Test with default version
  inputs:
    azureSubscription: 'my-service-connection-name'
    ScriptPath: 'ps/test.ps1'
    azurePowerShellVersion: LatestVersion
- task: AzurePowerShell@5
  displayName: Test with latest version
  inputs:
    azureSubscription: 'my-service-connection-name'
    ScriptType: 'FilePath'
    ScriptPath: 'ps/test.ps1'
    preferredAzurePowerShellVersion: '13.0.0'
The
Write-Host "---- Step 1"
$module = Get-Module -Name "Az.Accounts" -ListAvailable | Sort-Object Version -Descending | Select-Object -First 1
Import-Module -Name $module.Path -Global -PassThru -Force
Get-AzConfig -AppliesTo Az -Scope CurrentUser
Write-Host "---- Step 2"
Get-AzConfig
Write-Host "---- Step 3"
In my build, the agent image is https://github.com/actions/runner-images/releases/tag/ubuntu22%2F20250105.1, the Azure PowerShell task version is
Could you try my pipeline and tell me if it fails? Or how to reproduce this issue?
Description
The Az modules used in Azure DevOps AzurePowerShell tasks are failing intermittently. We use the Az.Accounts module to authenticate the PowerShell scope against Azure resources. After importing Az.Accounts and running the very first Az command (usually Get-AzConfig), the PowerShell process intermittently exits with the following error output (see the related issues below for more details about the symptoms):
The complete code responsible for Az module initialization can be found here. Here is the shortened code:
The process does not terminate immediately after Get-AzConfig but exits at a random point in later code, probably because the time it takes to overflow the stack is not constant. We have noticed that the issue appears across all versions starting with 3.0.0.
Preliminary RCA
We were able to find these records in the event log on build agent machines where the issue happens. The error messages point to Azure Watson dump records. Here is an example. Looking at the call stack there, I was able to determine that the issue happens in UpgradeNotificationHelper.RefreshVersionInfo. We were able to mitigate the issue on our side by leveraging this code and disabling the whole feature with the following command:
However, Update-AzConfig has to be the very first Az command we call; otherwise the StackOverflowException occurs again. I believe that is because the version check is called automatically after every Az command completes, in AzurePSCmdlet.EndProcessing.
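The mitigation described above might look roughly like the following sketch. It assumes the feature being disabled is the CheckForUpgrade setting named in the issue title and that Az.Accounts is already installed on the agent; the choice of Process scope is an assumption made here so the change does not persist beyond the current session.

```powershell
# Sketch only: assumes Az.Accounts is installed and that "the whole feature"
# refers to the CheckForUpgrade config key from the issue title.
Import-Module -Name Az.Accounts

# Per the report, this has to be the first Az cmdlet that runs in the process.
# -Scope Process (an assumption) keeps the change local to this session.
Update-AzConfig -CheckForUpgrade $false -Scope Process | Out-Null

# Later Az commands then run with the upgrade check turned off.
Get-AzConfig -CheckForUpgrade
```

Placing Update-AzConfig first matters because, as noted above, the version check appears to run after every Az command completes, so any earlier Az call could already trigger it.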
Related issues