-
Notifications
You must be signed in to change notification settings - Fork 10k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
At random intervals the application fails to set RouteValues #41924
Comments
@okolvik-avento thanks for contacting us. Could you put the code in a minimal repro project instead? It is super hard to make sense on an issue with so much code. |
@Tratcher My understanding is that DefaultHttpRequest.RouteValues is failing (somehow). Do you know what can be the case for that? |
Thanks for contacting us. |
@javiercn The project itself is already quite small, so i can send you an invite to a repo without secrets. |
@javiercn I should note that i have yet to get the error in local development with IIS Express. I've only ever seen it on IIS. |
@javiercn Just noticed that i had not set the application pool to "No Managed Code", maybe that is the issue? I will see if it fixes the problem Update: |
Issues like these are always hard to track down. Does it repro if you remove application insights (I assume you're using it)? |
@davidfowl Yeah i use Application Insights. I've removed service.AddApplicationInsightsTelemetry() now and removed all traces of the telemetry client. Lets see how it goes |
@davidfowl No errors after i removed it. So it seems Application Insights is the issue here. I think this is related #1524 |
Hmmm, I expected this but the callstack does look different. I would be great a capture a process dump when this happens... |
@davidfowl What's the best way to do that? |
@davidfowl Here is the .net callstack:
Another similar one:
|
@davidfowl And one for get_Cookies
|
This stack is the app insights ones that I would expect to fail. |
Ok, i'll keep monitoring the exceptions. Tell me if you want a minidump |
Did you capture one? |
Yeah i'm running the debug diagnostic tool on the application pool |
Note that the set_RouteValues one also only happens with AI activated |
@davidfowl https://avento365-my.sharepoint.com/:f:/g/personal/ole_kolvik_avento_no/EnPB1kXz48RJjju2c45gqQwB-gpS6lyIKiYu3JsaG7v5_A?e=HeWMXw 1 for get_Cookies and 1 for set_RouteValues. Btw, the pdb for Application Insights isn't on the microsoft symbol servers or on the nuget ones. Had to grab it manually. |
I do have full userdumps as well, but i'm not sharing them publicly |
None of these errors happen on Debug builds, could it be something wrong with optimization for Release builds? |
Also here is a load test with JMeter on Debug vs Release build Debug:
Release:
|
Release without services.AddApplicationInsightsTelemetry();
|
@okolvik-avento can you get me access to the dump. My email is david.fowler at microsoft |
@davidfowl sent you an invite to a onedrive folder |
My current theory is that there's concurrent access to the HttpContext causing this corruption. I will admit that I haven't yet figured out why its concurrent access to the HttpContext this early. This method is returning null
|
@davidfowl i suspect it's the BasicAuthenticationHandler. It seems to be accessing a lot of headers in various async functions. That might be it? I'll try moving every header in the initial authentication function |
BasicAuthenticationHandler looks fine. Async functions are ok so long as you |
I see why it happens now. It's because the basic auth middleware calls into the telemetry client. That's why it fails at routing. I'm going to see if we can improve these issues but one short term fix would be to remove the telemetry initializer that read the http context. |
@davidfowl in the updated code that the dumps are from, there is no telemetry in BasicAuth (remember i removed everything about the telemetry client #41924 (comment)). I changed everything to use Logger.LogError where Logger is Microsoft.Extensions.Logging.ILoggerFactory which is the standard logger in AuthenticationHandler. Unless i'm misunderstanding you that is. |
Are you saying it still happens when you remove application insights? |
Thanks to these dumps I can visually see the race condition 😄: When the exception happens for cookies, there's an async Task continuation that is running and writing to app insights (where the null ref happens). Concurrently the auth middleware is currently running. Where are your SQL queries? In the user service? |
@davidfowl No, that's not what i mean. It only happens when i have services.AddApplicationInsightsTelemetry(), even when i don't actually use the telemetry client anywhere. This goes for both the get_Cookies and set_RouteValues I'm sending you an invite to the repo with the code that created the dumps. |
Got it, yea it seems like something is making SQL queries on another thread that is running concurrently with the request causing the race. |
I didn't even realize the problem was right in front of my eyes. @okolvik-avento Thanks for the dumps, the issue in the in SqlCommand implementation here When you run an async sql command, even when you await it properly, the implementation fires the diagnostics listener callback after running the user's continuation. This allows the telemetry initializers in app insights to run concurrently with the request, causing this chaos. This isn't an app insights issue (this time 😄), it turns out. cc @ajcvickers @roji |
@davidfowl I'm replacing System.Data.SqlClient with Microsoft.Data.SqlClient for more testing. |
First thing i see is a huge performance increase. While tests with System.Data.SqlClient had 20ms responsetime with max up to 100ms, i now see sub 5ms responsetimes with max up to 20ms |
@davidfowl Sadly i'm still seeing the get_Cookies exception |
Yes, the problematic code exists in both implementations 😄, see dotnet/SqlClient#1634 |
@David-Engel owns SqlClient. |
PR open here dotnet/SqlClient#1637 |
This is fixed! |
Thank you for the quick response @davidfowl! |
Thanks for the great debugging and dumps @okolvik-avento ! |
Is there an existing issue for this?
Describe the bug
At random intervals (6-24 hours) the application fails to set RouteValues, which raises an exception.
Parallel or subsequent requests completes successfully, it only happens once in a blue moon.
Technical details:
.NET 6.0.5
Windows Server 2019, IIS 10
YARP (Reverse Proxy) 1.1.0
Application Insights 2.20.0
The bug happened with Ocelot as well as YARP, so i don't think either of them is the reason.
appsettings.json
Startup.cs
BasicAuthenticationHandler:
Expected Behavior
No exception
Steps To Reproduce
Sadly i can't reproduce it, since i don't know what's actually wrong.
Exceptions (if any)
System.NullReferenceException: Object reference not set to an instance of an object.
at Microsoft.AspNetCore.Http.DefaultHttpRequest.set_RouteValues(RouteValueDictionary value)
at Microsoft.AspNetCore.Routing.Matching.DfaMatcher.MatchAsync(HttpContext httpContext)
at Microsoft.AspNetCore.Routing.EndpointRoutingMiddleware.Invoke(HttpContext httpContext)
at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
at Microsoft.AspNetCore.Server.IIS.Core.IISHttpContextOfT`1.ProcessRequestAsync()
.NET Version
6.0.5
Anything else?
.NET 6.0.5
Windows Server 2019, IIS 10
YARP (Reverse Proxy) 1.1.0
Application Insights 2.20.0
The text was updated successfully, but these errors were encountered: