Fix deadlock in ServiceEndpointWatcher when disposing change token registration #7255
Merged
ReubenBond merged 1 commit into dotnet:main on Feb 3, 2026
Contributor
Pull request overview
This PR fixes a critical deadlock issue in ServiceEndpointWatcher that occurs when disposing change token registrations while holding a lock. The deadlock happens when one thread holds the lock and waits for a callback to complete, while the callback thread is blocked trying to acquire the same lock.
Changes:
- Moved _changeTokenRegistration?.Dispose() outside the lock in RefreshAsyncInternal() by capturing the registration before disposal
- Applied the same pattern to DisposeAsync() for both _changeTokenRegistration and _pollingTimer
- Added explanatory comments documenting the deadlock scenario and why disposal must occur outside the lock
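
The capture-then-dispose pattern described in these changes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual ServiceEndpointWatcher source: WatcherSketch, Watch, OnChange, Refresh, and RefreshCount are hypothetical names, and a plain CancellationToken stands in for the real change token.

```csharp
#nullable enable
using System;
using System.Threading;

// Hypothetical stand-in for ServiceEndpointWatcher; only the locking pattern is illustrated.
public sealed class WatcherSketch
{
    private readonly object _lock = new();
    private IDisposable? _changeTokenRegistration;
    private int _refreshCount;

    public int RefreshCount => Volatile.Read(ref _refreshCount);

    // Registers a callback that itself needs _lock, like the change callback in the PR.
    public void Watch(CancellationToken changeToken)
    {
        // Register outside the lock: if the token has already fired, the callback runs inline here.
        IDisposable registration = changeToken.Register(OnChange);
        IDisposable? previous;
        lock (_lock)
        {
            previous = _changeTokenRegistration;
            _changeTokenRegistration = registration;
        }

        // Dispose any earlier registration only after the lock has been released.
        previous?.Dispose();
    }

    private void OnChange()
    {
        lock (_lock)
        {
            _refreshCount++;
        }
    }

    public void Refresh()
    {
        IDisposable? toDispose;
        lock (_lock)
        {
            // Capture the registration while holding the lock...
            toDispose = _changeTokenRegistration;
            _changeTokenRegistration = null;
        }

        // ...and dispose it after releasing the lock. Dispose() may block until an
        // in-flight OnChange finishes, and OnChange needs _lock to finish.
        toDispose?.Dispose();
    }
}
```

Calling Dispose() inside the lock block of Refresh() would recreate the hazard: a concurrent OnChange could be waiting for _lock while Dispose() waits for that same callback to return.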
rzikm approved these changes on Feb 3, 2026
src/Libraries/Microsoft.Extensions.ServiceDiscovery/ServiceEndpointWatcher.cs
Member
/backport to release/10.2
Contributor
Started backporting to release/10.2: https://github.com/dotnet/extensions/actions/runs/21645137209
Contributor
@joperezr backporting to "release/10.2" failed, the patch most likely resulted in conflicts:
$ git am --3way --empty=keep --ignore-whitespace --keep-non-patch changes.patch
Patch format detection failed.
Error: The process '/usr/bin/git' failed with exit code 128
Please backport manually!
d957975 to f278292
…gistration

Move _changeTokenRegistration.Dispose() outside the lock to avoid deadlock. CancellationTokenRegistration.Dispose() blocks waiting for any in-flight callback to complete, but the callback (RefreshAsync) tries to acquire the same lock, causing a deadlock. The fix captures the registration reference while holding the lock, then disposes it after releasing the lock. Applied to both RefreshAsyncInternal and DisposeAsync methods.
f278292 to 730ee47
Member
Author
/backport to release/10.2
Contributor
Started backporting to release/10.2: https://github.com/dotnet/extensions/actions/runs/21650268580
This was referenced Feb 11, 2026
- Bump Microsoft.Extensions.ServiceDiscovery from 10.2.0 to 10.3.0, askpt/openfeature-aspire-sample#351 (Merged)
- chore: Bump Microsoft.Extensions.ServiceDiscovery from 10.2.0 to 10.3.0, JerrettDavis/PokManagerUI#23 (Open)
Summary
Fixes a deadlock in ServiceEndpointWatcher that occurs when disposing the change token registration while holding _lock.

The Problem
A deadlock can occur with the following sequence:
1. Thread A calls RefreshAsyncInternal(), acquires _lock, and calls _changeTokenRegistration?.Dispose()
2. CancellationTokenRegistration.Dispose() blocks waiting for any in-flight callback to complete
3. Thread B is running the change token callback, RefreshAsync(force: false), which tries to acquire _lock
4. Deadlock: Thread A holds _lock and waits for Thread B's callback → Thread B waits for _lock held by Thread A

The Fix
Move _changeTokenRegistration?.Dispose() outside the lock: capture the registration reference while holding the lock, then dispose it after releasing the lock.

Applied to both RefreshAsyncInternal() and DisposeAsync() methods.

Testing