[BUG] Azure.Identity GetTokenAsync() sporadically timing out in Azure function #23713

Closed
erikumhoefer opened this issue Sep 1, 2021 · 10 comments
Assignees
Labels
Azure.Identity Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team question The issue doesn't require a change to the product in order to be resolved. Most issues start as that

Comments

@erikumhoefer

erikumhoefer commented Sep 1, 2021

Describe the bug
We have an Azure function that uses a system-assigned managed identity. When we request a token to access our SQL database in the same resource group, the request occasionally gets "stuck" and never returns the token. I have logging around this request: the log entry written before the token request appears, but the entry after it is never written, and the function times out after 5 minutes. This also makes debugging hard, as the request itself throws no exception; the only exception is the function timeout exception.

We are also noticing long request times (2s+, compared to a normal request length of < 500ms).

Expected behavior
The token request is successful and does not "hang", causing the function timeout.

Actual behavior (include Exception or Stack Trace)
Sporadically the function times out due to the request not completing.
See logs attached from insights (might not be super useful).

To Reproduce
Steps to reproduce the behavior (include a code snippet, screenshot, or any additional information that might help us reproduce the issue)

This is difficult to reproduce as it happens sporadically / inconsistently. It does not seem to be correlated with multiple concurrent requests or multiple requests in short succession. It has happened for an isolated request.

Here is the code snippet that is failing.

// Request a token for Azure SQL using the function's system-assigned managed identity.
ManagedIdentityCredential azureCredential = new ManagedIdentityCredential();
DateTime requestStart = DateTime.Now;
log.LogInformation("Token request created at {date}.", requestStart);
AccessToken accessToken = await azureCredential.GetTokenAsync(new TokenRequestContext(scopes: new string[] { "https://database.windows.net/.default" }));
// Log the elapsed time in milliseconds so the value matches the "ms" unit in the message.
log.LogInformation("Token request successful. Request length: {requestTime}ms.", (DateTime.Now - requestStart).TotalMilliseconds);
connection.AccessToken = accessToken.Token;

Environment:

  • Name and version of the Library package used: Azure.Identity 1.4.1
  • Hosting platform or OS and .NET runtime version: Azure Functions 3.1.3.0 / netcoreapp3.1
  • IDE and version: Visual Studio 16.5

I have seen some similar-looking bugs, both open and resolved, that seemed to be due to a deadlock. This could be a similar issue, as the token request hangs indefinitely.
#14691
#22314

Logs:
tokenTimeoutLogs.csv

@ghost ghost added needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. customer-reported Issues that are reported by GitHub users external to the Azure organization. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that labels Sep 1, 2021
@jsquire jsquire added Azure.Identity Client This issue points to a problem in the data-plane of the library. needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team labels Sep 1, 2021
@ghost ghost removed the needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. label Sep 1, 2021
@jsquire
Member

jsquire commented Sep 1, 2021

Thank you for your feedback. Tagging and routing to the team members best able to assist.

@christothes
Member

Hi @erikumhoefer -
Would you mind providing the logging output after reproducing this with logging enabled?

I looked at the logs attached, but I'm not sure what those are from.

@christothes christothes added the needs-author-feedback Workflow: More information is needed from author to address the issue. label Sep 1, 2021
@ghost ghost added needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team and removed needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team needs-author-feedback Workflow: More information is needed from author to address the issue. labels Sep 1, 2021
@erikumhoefer
Author

I have added the AzureEventSourceListener to my function and I see its logs in insights. I'm going to wait for another failure to occur and come back with those logs, thanks! 👍

@erikumhoefer
Author

erikumhoefer commented Sep 2, 2021

Looking at those new logs, I can see the token request timing out after 100 seconds and then failing with this exception:

Request [9abbb824-c2e1-4875-8238-426a88dd98f1] exception System.Threading.Tasks.TaskCanceledException: The operation was cancelled because it exceeded the configured timeout of 0:01:40. Network timeout can be adjusted in ClientOptions.Retry.NetworkTimeout.
 ---> System.Threading.Tasks.TaskCanceledException: The operation was canceled.
 ---> System.IO.IOException: Unable to read data from the transport connection: The I/O operation has been aborted because of either a thread exit or an application request..
 ---> System.Net.Sockets.SocketException (995): The I/O operation has been aborted because of either a thread exit or an application request.

It then retries the token request two more times, and both retries also time out. After 5 minutes, the function times out.

This is the info from the request being logged:

Request [9abbb824-c2e1-4875-8238-426a88dd98f1] GET http://127.0.0.1:41249/MSI/token/?api-version=2017-09-01&resource==https%3A%2F%2Fdatabase.windows.net
secret:REDACTED
x-ms-client-request-id:9abbb824-c2e1-4875-8238-426a88dd98f1
x-ms-return-client-request-id:true
User-Agent:azsdk-net-Identity/1.4.1,(.NET Core 3.1.16; Microsoft Windows 10.0.14393)
client assembly: Azure.Identity
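
The logs above show three attempts of 100 seconds each, which is why the 5-minute function timeout fires before any exception reaches the caller. One possible mitigation, sketched here under assumptions rather than taken from the thread, is to give the whole call an overall deadline shorter than the function timeout so that the failure surfaces as an exception that can be logged. The `BoundedCall` class and `WithDeadlineAsync` method are hypothetical names introduced only for this illustration; the sketch uses only standard .NET types and relies on the wrapped operation honoring the cancellation token it is given.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of Azure.Identity): runs an operation under an overall deadline.
public static class BoundedCall
{
    public static async Task<T> WithDeadlineAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        TimeSpan deadline)
    {
        // The token source cancels itself once the deadline elapses.
        using var cts = new CancellationTokenSource(deadline);
        try
        {
            // The operation must observe the token for the deadline to take effect.
            return await operation(cts.Token).ConfigureAwait(false);
        }
        catch (OperationCanceledException) when (cts.IsCancellationRequested)
        {
            // Translate the cancellation into an explicit timeout that callers can log.
            throw new TimeoutException($"Operation did not complete within {deadline}.");
        }
    }
}
```

With the snippet from the original report, the call could be wrapped roughly as `BoundedCall.WithDeadlineAsync(ct => azureCredential.GetTokenAsync(context, ct).AsTask(), TimeSpan.FromSeconds(30))`, where `context` stands for the `TokenRequestContext` and the 30-second value is an arbitrary choice. This only makes the hang visible sooner; it does not address whatever keeps the MSI endpoint from responding.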

@christothes
Member

Hi @erikumhoefer -
Because this appears to be an issue with the Managed Identity service itself, I'd recommend opening a support ticket at https://support.microsoft.com to get assistance on troubleshooting this further.

@christothes christothes added the needs-author-feedback Workflow: More information is needed from author to address the issue. label Sep 7, 2021
@Tealons

Tealons commented Sep 14, 2021

We are seeing the same issue. It happens with only some of our function projects, and the occurrence is very sporadic indeed. We opened a ticket, but the response from support is very slow... Any way you can escalate this @christothes?

@erikumhoefer
Author

@Tealons could you please link your ticket with Microsoft?

@ghost ghost removed the needs-author-feedback Workflow: More information is needed from author to address the issue. label Sep 14, 2021
@Tealons

Tealons commented Sep 15, 2021

The ticket number is 2109140050001157.

@Ethan0007

Hi everyone, is there any update on this issue?

@Tealons

Tealons commented Nov 1, 2021

We got a response from the support team that there is a bug in some of the clusters in Azure. Apparently, the MSI endpoint on some clusters has become stuck in a faulty state induced by a high number of token requests from a site running in the cluster. The product team is aware of this and is working on a fix, but no ETA has been given at this moment. The current advice is to restart your functions when this problem occurs.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 27, 2023

6 participants