Race condition in MemoryCache will crash the process #61032
Comments
Tagging subscribers to this area: @eerhardt, @maryamariyan, @michaelgsharp
During cache compaction, we are sorting entries based on their LastAccessed time. However, since the cache entries can still be used concurrently on other threads, the LastAccessed time may be updated in the middle of sorting the entries. This leads to exceptions in a background thread, crashing the process.

The fix is to cache the LastAccessed time outside of the entry when we are adding it to the list. This will ensure the time is stable during the compaction process.

Fix dotnet#61032
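The idea described above (copy each entry's LastAccessed value once, then sort on the copies) can be sketched roughly as follows. This is a minimal illustration, not the actual change to MemoryCache: the `Entry` class, the `CompactionSketch` class, and the `OrderByOldestAccess` method are hypothetical names, and the timestamp is modeled as a plain `long` tick count for simplicity.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for a cache entry; not the real CacheEntry type.
public sealed class Entry
{
    private long _lastAccessedTicks;

    public Entry(string key, long lastAccessedTicks)
    {
        Key = key;
        _lastAccessedTicks = lastAccessedTicks;
    }

    public string Key { get; }

    // Other threads may update this at any time (for example on a cache hit).
    public long LastAccessedTicks
    {
        get => Interlocked.Read(ref _lastAccessedTicks);
        set => Interlocked.Exchange(ref _lastAccessedTicks, value);
    }
}

public static class CompactionSketch
{
    // Returns the entries ordered oldest-first, using a snapshot of each timestamp.
    public static List<Entry> OrderByOldestAccess(IEnumerable<Entry> entries)
    {
        var snapshot = new List<(Entry Entry, long Ticks)>();
        foreach (Entry e in entries)
        {
            // Read the mutable value exactly once, before sorting starts.
            snapshot.Add((e, e.LastAccessedTicks));
        }

        // The comparison only sees the copied values, so it stays consistent
        // even if another thread updates an entry while the sort is running.
        snapshot.Sort((a, b) => a.Ticks.CompareTo(b.Ticks));

        var ordered = new List<Entry>(snapshot.Count);
        foreach (var item in snapshot)
        {
            ordered.Add(item.Entry);
        }
        return ordered;
    }
}
```

The resulting order may be slightly stale if an entry is touched after its timestamp was copied, but the comparison can no longer contradict itself partway through the sort.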
Thanks for the bug report, @ayende. With your above description, I was easily able to reproduce the problem myself and you were correct in your analysis. The issue is that we are sorting based on LastAccessed, which can be updated on another thread while the sort is running. I have created a fix for this in 7.0. This may be a candidate we can backport to 6.0. How urgent do you see this issue being?
I'm actually using a copy of the code, so I can patch this locally. However, I would estimate that this is likely to be causing application crashes in production (and nearly impossible to reproduce ones).
Yes, we need to backport this one.
Re-opening to track porting this fix to 6.0.
* Cache LastAccessed during MemoryCache compaction (#61187)

* Cache LastAccessed during MemoryCache compaction

During cache compaction, we are sorting entries based on their LastAccessed time. However, since the cache entries can still be used concurrently on other threads, the LastAccessed time may be updated in the middle of sorting the entries. This leads to exceptions in a background thread, crashing the process.

The fix is to cache the LastAccessed time outside of the entry when we are adding it to the list. This will ensure the time is stable during the compaction process.

Fix #61032

* Update fix for 6.0.x servicing.

1. Remove the dependency on ValueTuple and use a custom struct instead.
2. Add servicing package changes.
3. In the tests, move the DisableParallelization collection declaration in the test project, since it is only in "Common" in main.
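The servicing note about replacing ValueTuple with a custom struct could look something like the sketch below. The real struct is not shown in this thread, so every name here (`EntrySnapshot<T>`, `SnapshotSortSketch`, `OldestFirst`, the `readLastAccessed` delegate) is an assumption made for illustration only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical replacement for a (item, lastAccessed) tuple.
public readonly struct EntrySnapshot<T> : IComparable<EntrySnapshot<T>>
{
    public EntrySnapshot(T item, DateTimeOffset lastAccessed)
    {
        Item = item;
        LastAccessed = lastAccessed;
    }

    public T Item { get; }

    // Copied once from the entry; never changes after construction.
    public DateTimeOffset LastAccessed { get; }

    // Orders snapshots oldest-first by the copied timestamp.
    public int CompareTo(EntrySnapshot<T> other) =>
        LastAccessed.CompareTo(other.LastAccessed);
}

public static class SnapshotSortSketch
{
    // readLastAccessed is a caller-supplied way to read an item's current timestamp.
    public static List<T> OldestFirst<T>(
        IEnumerable<T> items,
        Func<T, DateTimeOffset> readLastAccessed)
    {
        var snapshots = new List<EntrySnapshot<T>>();
        foreach (T item in items)
        {
            // Each timestamp is read exactly once, before the sort begins.
            snapshots.Add(new EntrySnapshot<T>(item, readLastAccessed(item)));
        }

        // Uses EntrySnapshot<T>.CompareTo through the default comparer.
        snapshots.Sort();

        var ordered = new List<T>(snapshots.Count);
        foreach (EntrySnapshot<T> s in snapshots)
        {
            ordered.Add(s.Item);
        }
        return ordered;
    }
}
```

A small struct like this carries the same two values a tuple would, without requiring a reference to the ValueTuple package on older target frameworks.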
Is this issue still relevant (#62286 is merged)?
This can now be closed. The fix will ship with version
@eerhardt I just had that error with 6.0.200-preview.22055.15 - is that error going to be fixed in the next version?
Yes, it will be fixed in the next servicing release of .NET. The runtime version will be |
Description
I'm running some benchmarks on MemoryCache, and I got the following error from a background thread, which killed the process.
It looks like a race condition in the background compaction process, and I'm assuming that the following line is responsible for that:
https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.Extensions.Caching.Memory/src/MemoryCache.cs#L451
The issue here is that we are updating the LastAccessed value on TryGetValue, but if this is running concurrently with the Sort() call, this means that the sort order of an entry has changed, leading to this issue.
The error is:
Reproduction Steps
The following code will reproduce the behavior in about 5 or so minutes of runtime.
This is actually a reproduction for another issue, but I ran into the exception and it very much looks like a serious problem for consumers of MemoryCache.
Expected behavior
There should not be a crash.
Actual behavior
There is a crash and the process dies.
Regression?
No response
Known Workarounds
none
Configuration
No response
Other information
No response