-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eliminate initial connection race condition in win32PipeListener (addresses #83) #84
base: main
Are you sure you want to change the base?
Conversation
Previously, win32PipeListener would create an initial pipe instance, open a client connection on it, then immediately close that connection. The reasoning behind this was unclear, but it appears to have been a mechanism for claiming the pipe name in the pipe namespace. If the listener-side single client could not be created, the listener would abort. This created a race condition where a legitimate client could attach itself to the named pipe before the listener could create its own client, causing the listener to incorrectly abort due to a legitimate client connection. This commit alters the process for listening for and accepting new connections: when the win32PipeListener is constructed, the first pipe is still created as before, but it is now also used as the first client connection as well. In order to maintain the namespace stake, the listenerRoutine() creates another named pipe instance before returning the connected instance to a pending Accept(), so that there is at least one pending named pipe instance at all times. There are some caveats to this approach: transient errors from CreateNamedPipe will not be returned until the _next_ Accept() call, so there may be instances where calling clients may correct the cause of the error, but still receive the same error across two Accept() calls. Additionally, if transient errors from CreateNamedPipe begin occurring and all other named pipe instances are closed, the listener loses its namespace stake and _may_ accidentally clobber other processes if the transient errors resolve themselves. It may be worth investingating closing the listener "from the inside" if transient errors begin to occur in CreateNamedPipe, so that the calling client is forced to create a new Listener and detect whether its stake in the pipe namespace has been lost.
This is a work in progress meant to address issue #83. I'm still not terribly happy with the treatment of errors in additional calls to |
This commit enforces the invariant that a win32PipeListener guarantees its stake in the pipe namespace by having instances of win32Pipe "donate" their disconnected pipe instances to the listener if the listener failed to create a new instance. Currently, this is done with heavy use of sync.Mutex. The "best case" of the listener successfully creating a successor instance elides one locking call with the use of atomic.LoadPointer; it may be possible to use atomic.CompareAndSwapPointer to achieve the same effect in general. This is worth pursuing further. This commit also fixes a race condition in win32PipeListener.Close, where the close channel send may have been swallowed by connectServerPipe if a client successfully connected before its pipe instance was shut down.
I made some long comments about this PR in #80 -- submitter and others please read what I wrote there. I don't think these changes address #67 completely, although they do address one element of the problem, and I think still aren't there 100%. For the record, I'd also be willing to look at refactoring #80 to incorporate the approach here (borrowing this code, or working with @djsweet to do so) as part of the listener side fixes (though we need to make some changes I think), and then lifting the client side of my work to provide the complete solution. This is quite a bit more work than just band aiding the one problem my code has (potential race), but would probably be architecturally cleaner, without wasting two extra handles per listener instance. I'm not interested in doing that work unless it seems pretty clear that this would be the direction we all agree we should take, and that it would lead to ultimately landing in master. |
I have a separate branch in which I've been trying out That branch had to treat As a result, replacing the mutex usage will take some thought and create incompatibilities that we probably don't want, but I'm not sure if they're avoidable while fixing the initial listener race. |
The Named Pipe API is quite inconsistent. Full stop. Thanks for the heads up about ERROR_PIPE_NOT_CONNECTED. ERROR_WINDOWS_HAS_TOO_MANY_ERROR_CODES |
I'm seeing some odd behavior that I think might be related to this? My listener doesn't seem to accept the first connection to the named pipe. If I restart my client, everything works and every connection is accepted as expected. |
@jstarks - PTAL |
Yep @djsweet's PR is fixing the issue for me. I was curious and tried running all the pipe tests (on a fully patched Windows 7) from |
Previously, win32PipeListener would create an initial pipe instance,
open a client connection on it, then immediately close that connection.
The reasoning behind this was unclear, but it appears to have been a
mechanism for claiming the pipe name in the pipe namespace. If the
listener-side single client could not be created, the listener would
abort. This created a race condition where a legitimate client could
attach itself to the named pipe before the listener could create its own
client, causing the listener to incorrectly abort due to a legitimate
client connection.
This commit alters the process for listening for and accepting new
connections: when the win32PipeListener is constructed, the first
pipe is still created as before, but it is now also used as the first
client connection as well. In order to maintain the namespace stake,
the listenerRoutine() creates another named pipe instance before
returning the connected instance to a pending Accept(), so that there is
at least one pending named pipe instance at all times.
There are some caveats to this approach: transient errors from
CreateNamedPipe will not be returned until the next Accept() call, so
there may be instances where calling clients may correct the cause
of the error, but still receive the same error across two Accept()
calls.
Additionally, if transient errors from CreateNamedPipe beginoccurring and all other named pipe instances are closed, the listener
loses its namespace stake and may accidentally clobber other processes
if the transient errors resolve themselves.
It may be worth investingating closing the listener "from the inside"if transient errors begin to occur in CreateNamedPipe, so that the
calling client is forced to create a new Listener and detect whether
its stake in the pipe namespace has been lost.
EDIT: As of commit 1c4c364, the namespace stake is guaranteed by having instances of
win32Pipe
"donate" their open instances back to the listener if the listener could not create a new pipe instance. This ensures that at least one open pipe instance is bound to the expected name through the lifetime of the listener, which also ensures that the stake in the namespace is not lost. It is currently usingsync.Mutex
rather heavily, but it might be possible to replace that withatomic.CompareAndSwapPointer
instead.Commit 1c4c364 also contains a fix for #85. This might be worth breaking out into a new PR.