
Restore blocking mode after successful ConnectAsync on Unix #124200

Open
liveans wants to merge 1 commit into dotnet:main from liveans:socket-blocking-mode-optimization

Conversation

@liveans
Member

liveans commented Feb 9, 2026

Summary

On Unix, after a successful ConnectAsync completion, this PR restores the underlying socket to blocking mode if the user hasn't explicitly set Socket.Blocking = false. This
optimizes subsequent synchronous operations (Send/Receive) by using native blocking syscalls instead of emulating blocking on top of epoll/kqueue.

When SendAsync/ReceiveAsync (or other async I/O operations) are called later, the socket is switched back to non-blocking mode via the existing SetHandleNonBlocking()
mechanism.

Motivation

Currently, once any async operation is performed on a Unix socket, the underlying file descriptor is permanently set to non-blocking mode via fcntl(O_NONBLOCK). Subsequent
synchronous operations must then emulate blocking behavior in managed code using epoll/kqueue, which has performance overhead compared to native blocking syscalls.

This change benefits the common usage pattern:

await socket.ConnectAsync(endpoint);
// Subsequent sync operations now use native blocking syscalls (more efficient)
socket.Receive(buffer);
socket.Send(data);
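
For reference, the state in question is the O_NONBLOCK file status flag on the file descriptor, which the runtime toggles through Interop.Sys.Fcntl.SetIsNonBlocking. A small self-contained probe (hypothetical helper, not part of this PR; Linux flag values assumed) shows the toggle involved:

// Hypothetical diagnostic helper, not part of this PR. Reads and clears the
// O_NONBLOCK file status flag directly via fcntl(2). Command/flag values are
// the Linux ones; they differ on other Unix flavors (e.g. O_NONBLOCK is 0x4 on macOS).
using System.Net.Sockets;
using System.Runtime.InteropServices;

static class NonBlockingProbe
{
    private const int F_GETFL = 3;
    private const int F_SETFL = 4;
    private const int O_NONBLOCK = 0x800;

    [DllImport("libc", SetLastError = true)]
    private static extern int fcntl(int fd, int cmd, int arg);

    // True if the kernel-level handle is currently in non-blocking mode.
    public static bool IsNonBlocking(Socket s) =>
        (fcntl((int)s.Handle, F_GETFL, 0) & O_NONBLOCK) != 0;

    // Clears O_NONBLOCK, i.e. restores native blocking semantics for
    // subsequent synchronous Send/Receive calls.
    public static void RestoreBlocking(Socket s)
    {
        int flags = fcntl((int)s.Handle, F_GETFL, 0);
        fcntl((int)s.Handle, F_SETFL, flags & ~O_NONBLOCK);
    }
}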

@dotnet-policy-service
Contributor

Tagging subscribers to this area: @karelz, @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Copilot AI left a comment (Contributor)
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot AI left a comment (Contributor)

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

@stephentoub
Member

🤖 Copilot Code Review — PR #124200

Holistic Assessment

Motivation: The optimization goal is valid. After ConnectAsync completes, users commonly switch to synchronous Send/Receive operations. Currently these must emulate blocking behavior via epoll/kqueue when the socket is in non-blocking mode. Restoring native blocking mode could improve performance for this pattern.

Approach: The implementation adds SetHandleBlocking() to toggle the socket back to blocking mode and calls RestoreBlocking() from multiple Connect/Accept completion paths. The design respects user preferences (only restores blocking if Socket.Blocking is true).

Summary: ⚠️ Needs Human Review. The optimization concept is sound, but this change reintroduces exactly the synchronization complexity the original design explicitly avoided. The key findings below require human judgment on whether the race conditions are acceptable in practice.


Detailed Findings

⚠️ Thread Safety — No Synchronization on Blocking Mode Toggle

Location: SocketAsyncContext.Unix.cs, SetHandleBlocking() and the _isHandleNonBlocking field

The original code (lines 1350-1354) explicitly warned:

"We never transition back to blocking mode, to avoid problems synchronizing that transition with the async infrastructure."

While the PR updates this comment, it doesn't address the underlying race condition. Consider:

  1. Thread A: ConnectAsync completes → calls RestoreBlocking() → sets socket to blocking
  2. Thread B: Simultaneously starts ReceiveAsync → calls SetHandleNonBlocking()

The _isHandleNonBlocking field is read/written without any locking in both methods. Additionally, ShouldRetrySyncOperation() (line 1432) reads _isHandleNonBlocking to determine behavior—if stale, it could misclassify EAGAIN as a timeout.

The PR claims safety because ConnectAsync/AcceptAsync completion is "guaranteed by construction to not be used concurrently with any other operation", but this guarantee is not enforced by the Socket API. A user can start a ReceiveAsync on one thread while ConnectAsync is completing on another.

Questions for maintainers:

  • Is the race actually benign in practice (i.e., worst case is an extra fcntl syscall)?
  • Should Volatile.Read/Write be used for _isHandleNonBlocking?
  • Should the restoration be best-effort (catch and ignore fcntl failures)?

(Flagged by: Claude + GPT-5.1)
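
If the maintainers conclude the toggle needs hardening, the Volatile option raised above could take roughly this shape (hypothetical sketch against the existing field, not code from the PR):

// Hypothetical hardening sketch, not the PR's code: publish the flag with
// volatile semantics (System.Threading.Volatile) so readers such as
// ShouldRetrySyncOperation() cannot observe a stale value when
// RestoreBlocking() races with SetHandleNonBlocking().
private bool _isHandleNonBlockingValue;   // illustrative backing field

private bool IsHandleNonBlocking
{
    get => Volatile.Read(ref _isHandleNonBlockingValue);
    set => Volatile.Write(ref _isHandleNonBlockingValue, value);
}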

⚠️ AcceptAsync Restores Blocking on Listener Socket

Location: SocketAsyncContext.Unix.cs — lines 1509, 1524 (per diff line numbers)

RestoreBlocking() is called on the listener socket after AcceptAsync completes. Unlike ConnectAsync (which can only succeed once per socket), a listener socket may:

  • Have multiple AcceptAsync operations pending
  • Be used concurrently with other accept operations

If one accept completes and restores blocking mode while another async accept is pending, could this cause issues with the epoll/kqueue infrastructure?

Suggestion: Verify whether restoring blocking mode on listener sockets is safe given concurrent accept operations, or limit this optimization to connect sockets only.

⚠️ Exception After Successful Operation

Location: SocketAsyncContext.Unix.cs, SetHandleBlocking()

if (Interop.Sys.Fcntl.SetIsNonBlocking(_socket, 0) != 0)
{
    throw new SocketException(...);
}

If the fcntl call fails (e.g., socket is in an unexpected state), this throws after a logically successful connect/accept. The user's await ConnectAsync() would throw even though the connection succeeded.

Suggestion: Consider making this best-effort (catch and ignore failures, or return a bool and let RestoreBlocking be truly non-throwing). The blocking mode is an optimization; failing it shouldn't fail the operation.

(Flagged by: GPT-5.1)
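
One possible best-effort shape (hypothetical sketch, not the PR's code; reuses the existing Interop.Sys.Fcntl.SetIsNonBlocking wrapper and _isHandleNonBlocking field):

// Hypothetical sketch of a non-throwing variant; not the PR's code.
private void TrySetHandleBlocking()
{
    // Best effort: if fcntl fails we simply keep the non-blocking handle and
    // the existing epoll/kqueue emulation path, rather than surfacing an
    // exception from an otherwise successful connect/accept.
    if (Interop.Sys.Fcntl.SetIsNonBlocking(_socket, 0) == 0)
    {
        _isHandleNonBlocking = false;
    }
}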

💡 Multiple RestoreBlocking Call Sites

Location: SocketAsyncContext.Unix.cs — 6+ call sites

RestoreBlocking() is called from many places:

  • AcceptOperation.InvokeCallback()
  • ConnectOperation.InvokeCallback()
  • AcceptAsync synchronous completion (2 paths)
  • ConnectAsync synchronous completion (2 paths)

This is correct for covering all paths, but fragile—future changes to async completion could forget to call RestoreBlocking(). Consider whether there's a single choke point where restoration could happen.

✅ WASI Handling

The early return in SetHandleBlocking() for WASI is correct—WASI sockets are always non-blocking.

✅ Test Coverage

The tests cover the key scenarios:

  • Normal connect with blocking restoration
  • User-set non-blocking mode preserved
  • Subsequent async ops restore non-blocking
  • Failed connect behavior
  • AcceptAsync blocking restoration

However, the tests don't cover the race scenarios (concurrent operations), which is where the risk lies.


Cross-Cutting Analysis

Threading Model: The existing socket code has careful threading considerations. The original design deliberately avoided blocking-mode transitions during async operations. This PR re-introduces that complexity. Human review is needed to determine if the races are acceptable.

Related Sibling Code: The Windows implementation doesn't have this concept (Windows sockets work differently). This is a Unix-only optimization.

Performance: No benchmark data was provided. Given the syscall overhead of calling fcntl on every connect/accept completion, it would be valuable to measure whether this actually improves the target scenario (ConnectAsync followed by sync Send/Receive).
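
A micro-benchmark for that scenario could be shaped like the following BenchmarkDotNet sketch (illustrative only; not part of the PR, loopback setup is an assumption made for the example):

// Illustrative BenchmarkDotNet sketch of the target scenario: ConnectAsync
// followed by a synchronous Receive. Not part of the PR.
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;
using BenchmarkDotNet.Attributes;

public class ConnectThenSyncReceiveBenchmark
{
    private Socket _listener = null!;
    private readonly byte[] _buffer = new byte[1];

    [GlobalSetup]
    public void Setup()
    {
        _listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        _listener.Bind(new IPEndPoint(IPAddress.Loopback, 0));
        _listener.Listen(16);

        // Accept loop: send one byte to each client so the sync Receive below completes.
        _ = Task.Run(async () =>
        {
            while (true)
            {
                Socket s = await _listener.AcceptAsync();
                s.Send(new byte[] { 1 });
                s.Dispose();
            }
        });
    }

    [Benchmark]
    public async Task ConnectAsync_Then_SyncReceive()
    {
        using var client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        await client.ConnectAsync(_listener.LocalEndPoint!);
        client.Receive(_buffer); // the synchronous path this PR aims to speed up
    }

    [GlobalCleanup]
    public void Cleanup() => _listener.Dispose();
}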


Summary

Verdict: ⚠️ Needs Human Review

The optimization motivation is valid, but this change reintroduces synchronization complexity that was explicitly avoided in the original design. Key questions for maintainers:

  1. Thread safety: Is the unsynchronized access to _isHandleNonBlocking acceptable given the new bidirectional transitions?
  2. Listener sockets: Is it safe to restore blocking mode on listener sockets that may have concurrent accept operations?
  3. Exception behavior: Should SetHandleBlocking failures be silently ignored rather than throwing?
  4. Performance evidence: Are there benchmarks showing this optimization provides meaningful improvement?

The code itself is consistent and tests cover the intended behavior. The uncertainty is about the design tradeoffs, not the implementation.


Review generated by Copilot code-review skill. Models contributing: Claude Sonnet 4, GPT-5.1

Copilot AI left a comment (Contributor)

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncContext.Unix.cs:688

  • The InvokeCallback override only invokes the callback and calls RestoreBlocking when buffer.Length is 0. However, when a connect operation fails (ErrorCode != Success) and a buffer was provided (via DoOperationConnectEx), the callback will never be invoked and RestoreBlocking won't be called.

This can occur when using ConnectEx with a buffer on Unix systems. The callback must always be invoked when the operation is complete, regardless of whether there's a buffer or whether the operation succeeded or failed. RestoreBlocking should also be called on error paths to restore the socket to blocking mode after a failed connection attempt.

Consider updating the logic to:

  1. Always call RestoreBlocking when the connect fails (ErrorCode != Success), regardless of buffer.Length
  2. Always invoke the callback when the operation is complete and no follow-up Send is needed
            public override void InvokeCallback(bool allowPooling)
            {
                var cb = Callback!;
                int bt = BytesTransferred;
                Memory<byte> sa = SocketAddress;
                SocketError ec = ErrorCode;
                Memory<byte> buffer = Buffer;

                if (buffer.Length == 0)
                {
                    AssociatedContext._socket.RestoreBlocking();

                    // Invoke callback only when we are completely done.
                    // In case data were provided for Connect we may or may not send them all.
                    // If we did not we will need follow-up with Send operation
                    cb(bt, sa, SocketFlags.None, ec);
                }
            }
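
A hypothetical restructuring along the lines suggested above (sketch only, not code from the PR; assumes a follow-up Send is pending only when the connect succeeded with a non-empty buffer):

// Hypothetical sketch of the suggested restructuring; not the PR's code.
public override void InvokeCallback(bool allowPooling)
{
    var cb = Callback!;
    int bt = BytesTransferred;
    Memory<byte> sa = SocketAddress;
    SocketError ec = ErrorCode;
    Memory<byte> buffer = Buffer;

    // Only a successful connect with a data buffer needs a follow-up Send;
    // every other outcome means this operation is completely done.
    bool done = buffer.Length == 0 || ec != SocketError.Success;

    if (done)
    {
        // Restore blocking on failure paths too, so a failed ConnectEx-with-buffer
        // attempt doesn't leave the handle stuck in non-blocking mode.
        AssociatedContext._socket.RestoreBlocking();
        cb(bt, sa, SocketFlags.None, ec);
    }
}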

Comment on lines +100 to +109
[Fact]
public async Task ConnectAsync_Failure_SocketIsRestoredToBlocking()
{
    using Socket client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

    await Assert.ThrowsAsync<SocketException>(async () =>
        await client.ConnectAsync(new IPEndPoint(IPAddress.Loopback, 1)));

    Assert.False(IsSocketNonBlocking(client));
}
Copilot AI commented Feb 13, 2026

The test suite is missing coverage for a failed ConnectAsync with a data buffer. When using ConnectEx with a buffer (via SocketAsyncEventArgs.SetBuffer), if the connection fails, the callback may not be invoked due to the bug in ConnectOperation.InvokeCallback (lines 671-688 in SocketAsyncContext.Unix.cs).

Consider adding a test that:

  1. Creates a SocketAsyncEventArgs with a buffer via SetBuffer
  2. Attempts to connect to an unreachable endpoint (e.g., port 1)
  3. Verifies that the Completed callback is invoked with an error
  4. Verifies that the socket is restored to blocking mode after the failed connect
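
A test along those lines might look like this sketch (hypothetical test name; IsSocketNonBlocking is the helper already used by the surrounding tests):

// Hypothetical test sketch following the suggestion above; not part of the PR.
[Fact]
public async Task ConnectAsync_WithBuffer_Failure_CallbackInvoked_SocketRestoredToBlocking()
{
    using Socket client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

    var completed = new TaskCompletionSource<SocketError>(TaskCreationOptions.RunContinuationsAsynchronously);
    using var args = new SocketAsyncEventArgs();
    args.RemoteEndPoint = new IPEndPoint(IPAddress.Loopback, 1); // expected to be unreachable
    args.SetBuffer(new byte[16], 0, 16);                         // forces the connect-with-buffer path
    args.Completed += (_, e) => completed.SetResult(e.SocketError);

    if (!client.ConnectAsync(args))
    {
        completed.SetResult(args.SocketError); // completed synchronously
    }

    SocketError error = await completed.Task;
    Assert.NotEqual(SocketError.Success, error);
    Assert.False(IsSocketNonBlocking(client));
}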

liveans force-pushed the socket-blocking-mode-optimization branch from 49a3f53 to b8c9054 on February 13, 2026 at 16:09
@liveans
Member Author

liveans commented Feb 13, 2026

/azp run runtime-libraries-coreclr outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).
