@JamesNK JamesNK commented Feb 24, 2025

Fixes #2589

In connection manager and friends there are two important types of lock:

  • A lock in the connection manager
  • A lock for each subchannel

It's safe to take a subchannel lock inside the connection manager lock, but doing the opposite can lead to a deadlock.
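
The ordering rule can be illustrated with a small standalone sketch. It is not the grpc-dotnet code: `managerLock` and `subchannelLock` are hypothetical stand-ins for the two kinds of lock, and `Monitor.TryEnter` is used in place of a blocking `lock` so the demo reports the circular wait instead of hanging.

```csharp
using System;
using System.Threading;

// Hypothetical stand-ins for the two kinds of lock; not the real grpc-dotnet fields.
object managerLock = new();
object subchannelLock = new();

// Events used only to force the problematic interleaving deterministically.
using var subchannelHeld = new ManualResetEventSlim();
using var managerHeld = new ManualResetEventSlim();
using var orderedTried = new ManualResetEventSlim();
using var invertedTried = new ManualResetEventSlim();

bool invertedGotManager = true;

// Inverted order: subchannel lock first, then the manager lock.
var inverted = new Thread(() =>
{
    lock (subchannelLock)
    {
        subchannelHeld.Set();
        managerHeld.Wait();
        // Non-blocking attempt; a plain lock statement would wait here forever.
        invertedGotManager = Monitor.TryEnter(managerLock);
        if (invertedGotManager) { Monitor.Exit(managerLock); }
        invertedTried.Set();
        orderedTried.Wait();
    }
});
inverted.Start();

// Documented order: manager lock first, then the subchannel lock.
bool orderedGotSubchannel;
lock (managerLock)
{
    subchannelHeld.Wait();
    managerHeld.Set();
    orderedGotSubchannel = Monitor.TryEnter(subchannelLock);
    if (orderedGotSubchannel) { Monitor.Exit(subchannelLock); }
    orderedTried.Set();
    invertedTried.Wait();
}
inverted.Join();

// Both attempts fail: each thread holds the lock the other one needs.
Console.WriteLine($"manager -> subchannel acquired: {orderedGotSubchannel}");
Console.WriteLine($"subchannel -> manager acquired: {invertedGotManager}");
```

Neither thread gets its second lock, which is the circular wait that turns into a deadlock when both sides use blocking `lock` statements. If every code path takes the manager lock first, this interleaving cannot occur.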

There is one place where locks were taken in the incorrect order, which could cause a deadlock in rare circumstances: the channel uses pick first, it has an idle subchannel, and a connection request occurs at the same time as a resolver update.

Reproduced in a unit test:

[image]

The fix is to move the connectivity state update outside of the subchannel lock.
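
As a rough illustration of that pattern (decide under the lock, act after releasing it), here is a minimal sketch. `SubchannelSketch`, `SketchState`, `StateChanged`, and `IsLockHeldByCurrentThread` are hypothetical names invented for this example; only the `connectionRequested` flag and the `UpdateConnectivityState` method name come from the diff, and the real method has a different signature.

```csharp
using System;
using System.Threading;

var subchannel = new SubchannelSketch();
subchannel.StateChanged += state =>
    Console.WriteLine($"state -> {state}, subchannel lock held: {subchannel.IsLockHeldByCurrentThread}");

subchannel.RequestConnection(); // Idle -> Connecting, listener runs without the lock held
subchannel.RequestConnection(); // already Connecting, nothing to do

enum SketchState { Idle, Connecting }

// Hypothetical, simplified model of a subchannel; not the grpc-dotnet type.
sealed class SubchannelSketch
{
    private readonly object _lock = new();
    private SketchState _state = SketchState.Idle;

    // Stands in for whatever reacts to state changes (for example, code that
    // may take the connection manager lock).
    public event Action<SketchState> StateChanged = _ => { };

    public bool IsLockHeldByCurrentThread => Monitor.IsEntered(_lock);

    public void RequestConnection()
    {
        bool connectionRequested = false;

        lock (_lock)
        {
            // Only record the decision while the lock is held.
            if (_state == SketchState.Idle)
            {
                connectionRequested = true;
            }
        }

        // The state update, and any listeners it triggers, run after the lock is released.
        if (connectionRequested)
        {
            UpdateConnectivityState(SketchState.Connecting);
        }
    }

    private void UpdateConnectivityState(SketchState state)
    {
        lock (_lock)
        {
            if (_state == state) { return; }
            _state = state;
        }

        // Listeners are notified outside the lock so they can take other locks safely.
        StateChanged(state);
    }
}
```

In this sketch the listener always observes the subchannel lock as not held, so it is free to acquire a manager-level lock without inverting the order. The cost is a short window between the decision and the update in which another thread can change the state.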

@JamesNK JamesNK force-pushed the jamesnk/deadlock-20250224 branch from 81ad06f to 35edd39 February 24, 2025 15:38
@JamesNK JamesNK changed the title from "Remove subchannel update lock" to "Move updating connectivity state outside of subchannel lock" Feb 24, 2025
@JamesNK JamesNK marked this pull request as ready for review February 24, 2025 15:50

if (connectionRequested)
{
    UpdateConnectivityState(ConnectivityState.Connecting, "Connection requested.");
Contributor

Is this now going to cause races where someone else comes along and modifies the Subchannel state and we end up updating it to Connecting when it should stay as TransientFailure or Ready?

Member Author

I believe this is fine. The load balancing logic is designed so that state can change at any point, and other parts of the system then react to it. All other places update the connectivity state outside the subchannel lock; this one was the outlier.

@JamesNK JamesNK merged commit 852a118 into grpc:master Feb 25, 2025
5 checks passed
@JamesNK JamesNK deleted the jamesnk/deadlock-20250224 branch February 25, 2025 00:19
@dannyheard7
Would this be the cause of this reported issue? #2612

CurtHagenlocher pushed a commit to apache/arrow-dotnet that referenced this pull request Sep 27, 2025
Updated [Grpc.Net.Client](https://github.com/grpc/grpc-dotnet) from
2.65.0 to 2.71.0.

<details>
<summary>Release notes</summary>

_Sourced from [Grpc.Net.Client's
releases](https://github.com/grpc/grpc-dotnet/releases)._

## 2.71.0

## What's Changed
* Remove old dotnet-core feed by @​JamesNK in
grpc/grpc-dotnet#2611
* Bump axios from 1.7.4 to 1.8.4 in
/testassets/InteropTestsGrpcWebWebsite/Tests by @​dependabot in
grpc/grpc-dotnet#2615
* Bump @​babel/helpers from 7.25.0 to 7.27.0 in
/testassets/InteropTestsGrpcWebWebsite/Tests by @​dependabot in
grpc/grpc-dotnet#2616
* Bump tar-fs from 3.0.6 to 3.0.8 in
/testassets/InteropTestsGrpcWebWebsite/Tests by @​dependabot in
grpc/grpc-dotnet#2619
* Fix race condition that caused inprogress connect to be canceled by
@​JamesNK in grpc/grpc-dotnet#2618
* Bump tools package to 2.71 by @​apolcyn in
grpc/grpc-dotnet#2621
* Update NuGet package versions by @​JamesNK in
grpc/grpc-dotnet#2620
* bump version to 2.71.0-pre1 by @​apolcyn in
grpc/grpc-dotnet#2622
* Bump version on 2.71 for final release by @​apolcyn in
grpc/grpc-dotnet#2627


**Full Changelog**:
grpc/grpc-dotnet@v2.70.0...v2.71.0

## 2.71.0-pre1

## What's Changed
* Remove old dotnet-core feed by @​JamesNK in
grpc/grpc-dotnet#2611
* Bump axios from 1.7.4 to 1.8.4 in
/testassets/InteropTestsGrpcWebWebsite/Tests by @​dependabot in
grpc/grpc-dotnet#2615
* Bump @​babel/helpers from 7.25.0 to 7.27.0 in
/testassets/InteropTestsGrpcWebWebsite/Tests by @​dependabot in
grpc/grpc-dotnet#2616
* Bump tar-fs from 3.0.6 to 3.0.8 in
/testassets/InteropTestsGrpcWebWebsite/Tests by @​dependabot in
grpc/grpc-dotnet#2619
* Fix race condition that caused inprogress connect to be canceled by
@​JamesNK in grpc/grpc-dotnet#2618
* Bump tools package to 2.71 by @​apolcyn in
grpc/grpc-dotnet#2621
* Update NuGet package versions by @​JamesNK in
grpc/grpc-dotnet#2620
* bump version to 2.71.0-pre1 by @​apolcyn in
grpc/grpc-dotnet#2622


**Full Changelog**:
grpc/grpc-dotnet@v2.70.0...v2.71.0-pre1

## 2.70.0

## What's Changed
* update ArgumentNullException.ThrowIfNull usage by @​WeihanLi in
grpc/grpc-dotnet#2563
* use nameof for CallerArgumentExpression by @​WeihanLi in
grpc/grpc-dotnet#2562
* Correctness: Make some private & internal classes sealed where
possible by @​Henr1k80 in grpc/grpc-dotnet#2559
* Bump vue from 2.6.14 to 3.0.0 in /examples/Spar/Server/ClientApp by
@​dependabot in grpc/grpc-dotnet#2565
* Bump cross-spawn from 7.0.3 to 7.0.6 in
/testassets/InteropTestsGrpcWebWebsite/Tests by @​dependabot in
grpc/grpc-dotnet#2574
* [vote]Added Active maintainers into MAINTAINERS.md. by @​subhraOffGit
in grpc/grpc-dotnet#2449
* Refactor: Use `await using` for `packageVersionStream` to ensure
proper disposal of async resources by @​dexcompiler in
grpc/grpc-dotnet#2521
* Performance microoptimizations by @​Henr1k80 in
grpc/grpc-dotnet#2558
* Complete health checks watch service on server shutting down by
@​JamesNK in grpc/grpc-dotnet#2582
* Avoid using ConcurrentDictionary for channels with few methods by
@​JamesNK in grpc/grpc-dotnet#2597
* Bump elliptic from 6.6.0 to 6.6.1 in /examples/Spar/Server/ClientApp
by @​dependabot in grpc/grpc-dotnet#2599
* Move updating connectivity state outside of subchannel lock by
@​JamesNK in grpc/grpc-dotnet#2601
* Bump Grpc.Tools dependency by @​apolcyn in
grpc/grpc-dotnet#2603
* bump version on v2.70.x branch by @​apolcyn in
grpc/grpc-dotnet#2604
* Change version to 2.70.0 by @​JamesNK in
grpc/grpc-dotnet#2610

## New Contributors
* @​Henr1k80 made their first contribution in
grpc/grpc-dotnet#2559
* @​subhraOffGit made their first contribution in
grpc/grpc-dotnet#2449
* @​dexcompiler made their first contribution in
grpc/grpc-dotnet#2521

**Full Changelog**:
grpc/grpc-dotnet@v2.67.0...v2.70.0

## 2.67.0

## What's Changed
* precompile condition clean by @​Varorbc in
grpc/grpc-dotnet#2528
* Log server cancellation errors at info level by @​JamesNK in
grpc/grpc-dotnet#2527
* Update logging to use generated logs by @​wabalubdub in
grpc/grpc-dotnet#2531
* Bump serve-static from 1.14.2 to 1.16.2 in
/examples/Spar/Server/ClientApp by @​dependabot in
grpc/grpc-dotnet#2536
* Update to Grpc.Tools 2.67.0-pre1 by @​JamesNK in
grpc/grpc-dotnet#2547
* Cleanup gRPC unit testing helpers in tester sample by @​JamesNK in
grpc/grpc-dotnet#2548
* Fix UpdateBalancingState not called when address attributes are
modified by @​JamesNK in grpc/grpc-dotnet#2553
* Update Grpc.Tools to 2.67.0 by @​JamesNK in
grpc/grpc-dotnet#2554
* Fix System.Text.Json vulnerability warning by @​JamesNK in
grpc/grpc-dotnet#2556
* Update package dependencies to 9.0 RC2 by @​JamesNK in
grpc/grpc-dotnet#2560
* Bump elliptic from 6.5.7 to 6.6.0 in /examples/Spar/Server/ClientApp
by @​dependabot in grpc/grpc-dotnet#2567
* Update to .NET 9 RTM by @​JamesNK in
grpc/grpc-dotnet#2571

## New Contributors
* @​wabalubdub made their first contribution in
grpc/grpc-dotnet#2531

**Full Changelog**:
grpc/grpc-dotnet@v2.66.0...v2.67.0

## 2.67.0-pre1

## What's Changed
* precompile condition clean by @​Varorbc in
grpc/grpc-dotnet#2528
* Log server cancellation errors at info level by @​JamesNK in
grpc/grpc-dotnet#2527
* Update logging to use generated logs by @​wabalubdub in
grpc/grpc-dotnet#2531
* Bump serve-static from 1.14.2 to 1.16.2 in
/examples/Spar/Server/ClientApp by @​dependabot in
grpc/grpc-dotnet#2536
* Update to Grpc.Tools 2.67.0-pre1 by @​JamesNK in
grpc/grpc-dotnet#2547
* Cleanup gRPC unit testing helpers in tester sample by @​JamesNK in
grpc/grpc-dotnet#2548
* Fix UpdateBalancingState not called when address attributes are
modified by @​JamesNK in grpc/grpc-dotnet#2553
* Update Grpc.Tools to 2.67.0 by @​JamesNK in
grpc/grpc-dotnet#2554
* Bump version for 2.67 RC by @​apolcyn in
grpc/grpc-dotnet#2555

## New Contributors
* @​wabalubdub made their first contribution in
grpc/grpc-dotnet#2531

**Full Changelog**:
grpc/grpc-dotnet@v2.66.0...v2.67.0-pre1

## 2.66.0

## What's Changed
* Bump version on master to 2.66.0-dev by @​stanley-cheung in
grpc/grpc-dotnet#2491
* Fix failure to create GrpcChannel under Wine compatibility layer
(including Steam Proton and Apple Game Porting Toolkit) by @​mayuki in
grpc/grpc-dotnet#2496
* Update .NET 9 SDK and resolve warnings by @​sebastienros in
grpc/grpc-dotnet#2502
* Bump braces from 3.0.2 to 3.0.3 in
/testassets/InteropTestsGrpcWebWebsite/Tests by @​dependabot in
grpc/grpc-dotnet#2504
* Bump axios from 1.6.2 to 1.7.4 in
/testassets/InteropTestsGrpcWebWebsite/Tests by @​dependabot in
grpc/grpc-dotnet#2505
* Update puppeteer by @​JamesNK in
grpc/grpc-dotnet#2507
* Remove internal_ci flag from interop test script by @​JamesNK in
grpc/grpc-dotnet#2509
* Fix Google auth interop test by @​JamesNK in
grpc/grpc-dotnet#2512
* [testing] improve sanity check in jwt_token_creds interop test by
@​apolcyn in grpc/grpc-dotnet#2513
* Add HTTP version configuration to GrpcChannelOptions by @​JamesNK in
grpc/grpc-dotnet#2514
* Bump grpc.tools version to 2.66 by @​apolcyn in
grpc/grpc-dotnet#2523
* Bump webpack from 5.76.0 to 5.94.0 in /examples/Browser/Server/wwwroot
by @​dependabot in grpc/grpc-dotnet#2522
* Bump elliptic from 6.5.4 to 6.5.7 in /examples/Spar/Server/ClientApp
by @​dependabot in grpc/grpc-dotnet#2525
* Bump micromatch from 4.0.7 to 4.0.8 in
/testassets/InteropTestsGrpcWebWebsite/Tests by @​dependabot in
grpc/grpc-dotnet#2524
* Bump v2.66.x branch to 2.66.0.pre1 by @​apolcyn in
grpc/grpc-dotnet#2526
* Bump v2.66.x to v2.66.0 by @​apolcyn in
grpc/grpc-dotnet#2539


**Full Changelog**:
grpc/grpc-dotnet@v2.65.0...v2.66.0

## 2.66.0-pre1

## What's Changed
* Bump version on master to 2.66.0-dev by @​stanley-cheung in
grpc/grpc-dotnet#2491
* Fix failure to create GrpcChannel under Wine compatibility layer
(including Steam Proton and Apple Game Porting Toolkit) by @​mayuki in
grpc/grpc-dotnet#2496
* Update .NET 9 SDK and resolve warnings by @​sebastienros in
grpc/grpc-dotnet#2502
* Bump braces from 3.0.2 to 3.0.3 in
/testassets/InteropTestsGrpcWebWebsite/Tests by @​dependabot in
grpc/grpc-dotnet#2504
* Bump axios from 1.6.2 to 1.7.4 in
/testassets/InteropTestsGrpcWebWebsite/Tests by @​dependabot in
grpc/grpc-dotnet#2505
* Update puppeteer by @​JamesNK in
grpc/grpc-dotnet#2507
* Remove internal_ci flag from interop test script by @​JamesNK in
grpc/grpc-dotnet#2509
* Fix Google auth interop test by @​JamesNK in
grpc/grpc-dotnet#2512
* [testing] improve sanity check in jwt_token_creds interop test by
@​apolcyn in grpc/grpc-dotnet#2513
* Add HTTP version configuration to GrpcChannelOptions by @​JamesNK in
grpc/grpc-dotnet#2514
* Bump grpc.tools version to 2.66 by @​apolcyn in
grpc/grpc-dotnet#2523
* Bump webpack from 5.76.0 to 5.94.0 in /examples/Browser/Server/wwwroot
by @​dependabot in grpc/grpc-dotnet#2522
* Bump elliptic from 6.5.4 to 6.5.7 in /examples/Spar/Server/ClientApp
by @​dependabot in grpc/grpc-dotnet#2525
* Bump micromatch from 4.0.7 to 4.0.8 in
/testassets/InteropTestsGrpcWebWebsite/Tests by @​dependabot in
grpc/grpc-dotnet#2524


**Full Changelog**:
grpc/grpc-dotnet@v2.65.0...v2.66.0-pre1

Commits viewable in [compare
view](grpc/grpc-dotnet@v2.65.0...v2.71.0).
</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Potential deadlock in subchannel state update
