-
Notifications
You must be signed in to change notification settings - Fork 849
Fix race condition in UnreliableL2Tests.WriteFailureInvisible #7075
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
… logs Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a flaky test (WriteFailureInvisible) that was intermittently failing in CI due to a race condition. The test verifies that L2 cache write failures are properly logged, but because these failures are logged asynchronously after a background write task completes, the original fixed 150ms delay was insufficient to reliably wait for logging to finish.
Key Changes:
- Added
WaitForLogsAsyncmethod toLogCollectorthat polls for expected log entries with a timeout instead of using fixed delays - Updated
WriteFailureInvisibletest to useWaitForLogsAsync(errorIds, TimeSpan.FromSeconds(5))instead ofTask.Delay(150)
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| test/Libraries/Microsoft.Extensions.Caching.Hybrid.Tests/LogCollector.cs | Added WaitForLogsAsync method that polls for log entries matching expected error IDs with configurable timeout, using thread-safe lock-based checking that mirrors the validation logic in AssertErrors |
| test/Libraries/Microsoft.Extensions.Caching.Hybrid.Tests/UnreliableL2Tests.cs | Replaced unreliable fixed 150ms delay with WaitForLogsAsync call that polls for up to 5 seconds, ensuring logs are present before assertion |
| var deadline = DateTime.UtcNow + timeout; | ||
| while (DateTime.UtcNow < deadline) | ||
| { | ||
| lock (_items) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it would be enough to just wait until this could be acquired once? Or are the items added asynchronously acquiring the lock potentially after this is called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logging itself is synchronous (line 120 adds to _items under lock), but it's called from the catch block (line 380 in DefaultHybridCache.StampedeStateT.cs) that executes asynchronously in the background task.
l2.LastWrite tracks when the write task completes, but doesn't guarantee the catch block has finished executing. The catch block could still be running after LastWrite is done, especially with thread scheduling variations.
So we need to poll: we're waiting for the logger to be called (which happens async), not just for the lock to be available. Acquiring the lock once would be a snapshot at a single point in time - we might check before the background catch block calls the logger.
ericstj
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just had a question that might simplify the method.
The test was intermittently failing in CI because L2 write failures are logged asynchronously after the background write task completes, but the test used a fixed 150ms delay that didn't reliably wait for logging to finish.
Changes
WaitForLogsAsyncthat polls for expected log entries with timeout instead of guessing timingWaitForLogsAsync(errorIds, TimeSpan.FromSeconds(5))The polling logic matches
AssertErrorsexactly (count and order must match), ensuring test assertions see the same log state that will be verified.Original prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.
Microsoft Reviewers: Open in CodeFlow