Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unhandled exception from AcceptCompletionCallback #108026

Closed
pepone opened this issue Sep 19, 2024 · 17 comments · Fixed by #108616
Closed

Unhandled exception from AcceptCompletionCallback #108026

pepone opened this issue Sep 19, 2024 · 17 comments · Fixed by #108616

Comments

@pepone
Copy link
Contributor

pepone commented Sep 19, 2024

2024-09-18T17:42:06.9787720Z Unhandled exception. System.ArgumentException: The supplied System.Net.SocketAddress is an invalid size for the System.Net.IPEndPoint end point. (Parameter 'socketAddress')
2024-09-18T17:42:06.9788630Z    at System.Net.IPEndPoint.Create(SocketAddress socketAddress)
2024-09-18T17:42:06.9789430Z    at System.Net.Sockets.SocketAsyncEventArgs.FinishOperationAccept(SocketAddress remoteSocketAddress)
2024-09-18T17:42:06.9790370Z    at System.Net.Sockets.SocketAsyncEventArgs.FinishOperationSyncSuccess(Int32 bytesTransferred, SocketFlags flags)
2024-09-18T17:42:06.9791310Z    at System.Net.Sockets.SocketAsyncEventArgs.FinishOperationAsyncSuccess(Int32 bytesTransferred, SocketFlags flags)
2024-09-18T17:42:06.9793120Z    at System.Net.Sockets.SocketAsyncEventArgs.AcceptCompletionCallback(IntPtr acceptedFileDescriptor, Memory`1 socketAddress, SocketError socketError)
2024-09-18T17:42:06.9794100Z    at System.Net.Sockets.SocketAsyncEngine.System.Threading.IThreadPoolWorkItem.Execute()
2024-09-18T17:42:06.9795710Z    at System.Threading.ThreadPoolWorkQueue.Dispatch()
2024-09-18T17:42:06.9796250Z    at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
2024-09-18T17:42:06.9798780Z    at System.Threading.Thread.StartCallback()

The exception is similar to the one from #81437, but this might indicate a different problem, exceptions from FinishOperationAccept are not handled.

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Sep 19, 2024
@rzikm
Copy link
Member

rzikm commented Sep 20, 2024

This report is not actionable for us without knowing more about the scenario when the exception occurred. Does this exception happen frequently? Are you able to reproduce it with some small code example you could share with us?

@rzikm rzikm added the needs-author-action An issue or pull request that requires more info or actions from the author. label Sep 20, 2024
@pepone
Copy link
Contributor Author

pepone commented Sep 20, 2024

This is happening on macOS, when accepting IPv6 TCP connections. I thought there is a bug here because .NET runtime not handling the exception and as a result the callback passed to BeginAccept is not called.

The underlying cause is probably the same as https://github.com/dotnet/runtime/issues/81437 unreliable OS call, and it is not easy to reproduce.

@dotnet-policy-service dotnet-policy-service bot removed the needs-author-action An issue or pull request that requires more info or actions from the author. label Sep 20, 2024
@antonfirsov
Copy link
Member

We haven't understood the root cause of #81437. In fact, it has likely stopped occurring in our CI or the frequency is so low, it's below of our radar.

You haven't answered the questions from #108026 (comment). Does the error happen sporadically or predictably? Any code you can possibly share? Any chance for creating a self-contained repro?

This is happening on macOS, when accepting IPv6 TCP connections.

Are you using DualMode listener sockets (created by new Socket(SocketType.Stream, ProtocolType.Tcp)? Does the issue disappear if you change it to be IPv6-only (created by new Socket(AddressFamily.InterNetworkV6, SocketType.Stream, ProtocolType.Tcp)?

@wfurt
Copy link
Member

wfurt commented Sep 23, 2024

I think the underlying issue is same as #102663. Would you be able to run custom code @pepone ? Perhaps we can come up with more instrumentation and/or improved error handling. If that is feasible you can grab binaries from PR or you could rebuild the code changes yourself.

@pepone
Copy link
Contributor Author

pepone commented Sep 23, 2024

Would you be able to run custom code @pepone ?

Sure, I can try with some custom code.

@wfurt
Copy link
Member

wfurt commented Sep 27, 2024

can you rebuild #108334 and give it shot @pepone ? The PR should also create binaries if that is easier....

@pepone
Copy link
Contributor Author

pepone commented Sep 30, 2024

@wfurt I hit #104105 when trying to build your branch with Xcode 16, Where can I download the binaries from your PR for macOS ARM64?

@rzikm
Copy link
Member

rzikm commented Oct 1, 2024

@pepone you should be able to download the OSX ARM64 build artifacts (Debug) from https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_apis/build/builds/820467/artifacts?artifactName=libraries_bin_osx_x64_Debug&api-version=7.1&%24format=zip. you need only the System.Net.Sockets.dll from the testhost/shared/... directory.

In case you have problems accessing the above url, I attached the relevant files here directly
macAccept.zip

Publish app as self-contained and overwrite the System.Net.Sockets.dll in the publish output to test the change.

@pepone
Copy link
Contributor Author

pepone commented Oct 1, 2024

With this PR I hit an assert instead of getting the unhandled exception:

testing connection closure... Process terminated. Assertion Failed
   at System.Net.Sockets.SocketAsyncEventArgs.CompleteAcceptOperation(IntPtr acceptedFileDescriptor, Memory`1 socketAddress, SocketError socketError)
   at System.Net.Sockets.SocketAsyncEventArgs.AcceptCompletionCallback(IntPtr acceptedFileDescriptor, Memory`1 socketAddress, SocketError socketError)
   at System.Net.Sockets.SocketAsyncContext.AcceptOperation.InvokeCallback(Boolean allowPooling)
   at System.Net.Sockets.SocketAsyncContext.OperationQueue`1.ProcessAsyncOperation(TOperation op)
   at System.Net.Sockets.SocketAsyncContext.ProcessAsyncReadOperation(ReadOperation op)
   at System.Net.Sockets.SocketAsyncContext.ReadOperation.System.Threading.IThreadPoolWorkItem.Execute()
   at System.Net.Sockets.SocketAsyncContext.AsyncOperation.Process()
   at System.Net.Sockets.SocketAsyncContext.HandleEvents(SocketEvents events)
   at System.Net.Sockets.SocketAsyncEngine.System.Threading.IThreadPoolWorkItem.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()

To make this easy to reproduce I run my test while I also stress the CPU with stress -c 10.

@wfurt wfurt removed the untriaged New issue has not been triaged by the area owner label Oct 1, 2024
@wfurt wfurt added this to the 10.0.0 milestone Oct 1, 2024
@wfurt
Copy link
Member

wfurt commented Oct 1, 2024

I guess that make sense

That one is not in release builds and blows up later. I'll update the PR to pass through that.

@pepone
Copy link
Contributor Author

pepone commented Oct 2, 2024

No more asserts with the latest build.

I think we still need a workaround as proposed in #102663, otherwise when this happens the socket wont be able to accept the connection.

@wfurt
Copy link
Member

wfurt commented Oct 2, 2024

I don't have repro in my hands ... but I don't think we need to. From what I saw the Accept only assigns the address to _remoteEndPoint e.g. it does not need it to create new socket. And if somebody access RemoteEndPoint and _remoteEndPoint is null we would query underlying socket via getsockopt to get it at that time.

If you observer something else please let me know

@pepone
Copy link
Contributor Author

pepone commented Oct 2, 2024

I will try to get more details on this part a comment on the other issue or open a new one.

The unhandled exception part seems to be fixed with your PR.

@wfurt
Copy link
Member

wfurt commented Oct 2, 2024

I'll get it to main 10.0 branch so more people have chance to test it. If that looks good we can start working on servicing permission.

@antonfirsov
Copy link
Member

After reading #102663 (comment), I'm wondering if getsockname will actually return the address succesfully in this weird case #108334 is addressing. @pepone can you check the what happens if you log acceptSocket.RemoteEndpoint?

@antonfirsov
Copy link
Member

This should be fixed by the PR-s linked above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants