Deadlock acquiring credentials when GlobalRuntimeDependencyRegistry used #3153
Comments
After a bit more digging, this might be more of a docs/footgun issue than a bug per se. I see that when not using native AoT, the client is created with explicit anonymous credentials: aws-sdk-net/sdk/src/Core/Amazon.Runtime/Credentials/AssumeRoleWithWebIdentityCredentials.cs Line 240 in f7a3a70
However, in my repro I'm just calling the constructor with a region, and it calls back into Line 90 in f7a3a70
I haven't tried changing my code yet to do the same, but it raises a few points
If this does turn out to be what is happening, then I guess the bug is less a deadlock and more a need for clearer documentation on what the user needs to do in this case.
Yep, that's what it was. If I change the code to this, the problem goes away:
using Amazon.Runtime;
using Amazon.RuntimeDependencies;
using Amazon.SecurityToken;
Console.WriteLine("GlobalRuntimeDependencyRegistry.RegisterSecurityTokenServiceClient()");
GlobalRuntimeDependencyRegistry.Instance.RegisterSecurityTokenServiceClient(
- (_) =>
- new AmazonSecurityTokenServiceClient());
+ (context) =>
+ new AmazonSecurityTokenServiceClient(
+ new AnonymousAWSCredentials(),
+ context.SecurityTokenServiceClientContextData.Region));
Console.WriteLine("AssumeRoleWithWebIdentityCredentials.FromEnvironmentVariables()");
var credentials = AssumeRoleWithWebIdentityCredentials.FromEnvironmentVariables();
Console.WriteLine("AssumeRoleWithWebIdentityCredentials.GetCredentialsAsync()");
_ = await credentials.GetCredentialsAsync();
Console.WriteLine("Credentials obtained");
Maybe it would help give a hint for the user in this case if
public class SecurityTokenServiceClientContext
{
public enum ActionContext { AssumeRoleAWSCredentials, AssumeRoleWithWebIdentityCredentials, FederatedAWSCredentials };
public ActionContext Action { get; set; }
public RegionEndpoint Region { get; set; }
public IWebProxy ProxySettings { get; set; }
+ public AWSCredentials Credentials { get; set; } = new AnonymousAWSCredentials();
}
The STS dependency is the hardest one to deal with in the SDK for AOT, or more specifically now the GlobalRuntimeDependencyRegistry. Even in the assume role case, which requires AWS credentials rather than anonymous credentials, you want to make sure you're not just using the default credentials, because that would go back to the profile that needs to assume a role instead of the profile that should make the STS assume role call. I agree this area needs some smoothing out and, at the very least, better documentation.
If you are using code that gets the role from the running instance, which has a role assumed in its execution context like Fargate or Lambda (which I believe is using assume role?), is the current correct action to keep
Side note, it's crazy to me that this GitHub issue is the only reference found on Google for
Describe the bug
While doing research into publishing a .NET 8 application using native AoT to EKS, I found at deploy-time that the application would fail on a code path we have in an internal library where the AWS SDK would attempt to get credentials using STS. The exception would be the following:
I added the code mentioned to the application and redeployed. At this point it appeared to stop responding and started crash-looping.
Through various trial and error, I've distilled the problematic code path down to the code snippet below.
My hunch is that the code path is somehow re-entrantly trying to access this lock, causing it to block indefinitely:
aws-sdk-net/sdk/src/Core/Amazon.Runtime/Credentials/RefreshingAWSCredentials.cs
Lines 144 to 146 in f7a3a70
That's just a hunch. The issue could be somewhere completely different, but it's the first bit of synchronization I could find on the method call that is blocking. It also blocks using the synchronous version, GetCredentials(), so I'm pretty sure it's not something to do with sync-over-async.
The code appears to deadlock regardless of whether native AoT is actually used at runtime or not.
Expected Behavior
The operation does not block.
With the repro, the following messages should be printed to the console:
Current Behavior
The application blocks on the call to AssumeRoleWithWebIdentityCredentials.GetCredentialsAsync().
Reproduction Steps
Run the following code within EKS where the instance credentials are available.
Possible Solution
No response
Additional Information/Context
No response
AWS .NET SDK and/or Package version used
AWSSDK.SecurityToken 3.7.300.38
Targeted .NET Platform
.NET 8
Operating System and version
Ubuntu 22.04.3 LTS