Fix race condition in PipeNetCon #8643
The race condition detector is being tripped by a concurrent `Write` and `Close` in the `PipeNetCon` in several integration tests. This is a naive fix to serialise the write and close operations to resolve the race condition. The affected tests were also not handling asynchronous error reporting correctly (i.e. it's not legal to call `require.XYZ()` from a goroutine other than the one executing the test function). This patch introduces some plumbing to marshal asynchronous errors back into the main test routine before failing the test.
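As a rough illustration of what "serialising the write and close operations" can mean, here is a minimal sketch. It is not the code from this patch: `guardedConn`, `sink`, and `errClosed` are hypothetical names invented for the example, and the real connection type wraps different fields.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errClosed is a hypothetical sentinel returned by writes after Close.
var errClosed = errors.New("connection closed")

// sink is a deliberately unsynchronized writer/closer, standing in for
// whatever a real pipe-backed connection would wrap.
type sink struct {
	data   []byte
	closed bool
}

func (s *sink) Write(p []byte) (int, error) {
	s.data = append(s.data, p...)
	return len(p), nil
}

func (s *sink) Close() error {
	s.closed = true
	return nil
}

// guardedConn is an illustrative type, not the PR's PipeNetCon. A single
// mutex ensures Write and Close never touch the sink at the same time.
type guardedConn struct {
	mu     sync.Mutex
	inner  *sink
	closed bool
}

func (g *guardedConn) Write(p []byte) (int, error) {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.closed {
		return 0, errClosed
	}
	return g.inner.Write(p)
}

func (g *guardedConn) Close() error {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.closed {
		return nil // closing twice is a no-op in this sketch
	}
	g.closed = true
	return g.inner.Close()
}

func main() {
	g := &guardedConn{inner: &sink{}}

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each write either completes fully or observes the close.
			_, _ = g.Write([]byte("ping"))
		}()
	}
	_ = g.Close() // runs concurrently with the writers
	wg.Wait()

	_, err := g.Write([]byte("late"))
	fmt.Println(err)
}
```

One trade-off of this pattern is that a `Write` that blocks while holding the lock also delays `Close` until the write returns, which may or may not be acceptable for a given connection.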
Any thoughts about changing to a `RWMutex` and acquiring a read lock for the `Read` call? Feels odd to synchronize `Close`/`Write` and not `Read`.
Hey Trent, sorry for the delay, I meant to get to this one yesterday. Thanks for once again fixing our tests.
I agree with Zac on the RWMutex for PipeNetConn - guarding all operations seems easier to reason about.
In regards to the assertions, I think we can do away with `asyncError` and `asyncAssertion` and use plain `error` to communicate what we need - see the suggestions and let me know your thoughts. Ideally we would "simply" run the assertions in the proper goroutine, but I'm assuming that is fairly difficult to achieve.
> Ideally we would "simply" run the assertions in the proper goroutine, but I'm assuming that is fairly difficult to achieve.

In a perfect world, yes. Unfortunately, the Go `testing` package doesn't help us. From the docs for `testing.T`:

> A test ends when its Test function returns or calls any of the methods FailNow, Fatal, Fatalf, SkipNow, Skip, or Skipf. Those methods, as well as the Parallel method, must be called only from the goroutine running the Test function.

Given that `testify` is a pretty thin layer over these functions, it doesn't help in this case either.
I tried, and it introduces a bunch of deadlocks, which is reasonable as I'd expect calls to |
Bot.
Part of this change is implementing a "no secrets" policy for CI. Given that 1. we have to support CI for arbitrary external contributors, and 2. it is easy to craft a malicious PR that exfiltrates secrets during a CI build, any test that runs under CI must be able to do so without any injected secrets. This means that several of the tests we currently run under Drone will not be run on GCB, at least as part of the regular CI. The plan is to create a separate task that periodically runs tests that require external credentials (e.g. Kube tests, various backend data stores, etc.) in a more secure way and reports failures asynchronously. And while these tests will not run under CI, they should still be built under CI so that required changes are caught during review. Note: this backport includes various data race fixes added separately in the master branch: See-Also: #8643 See-Also: #8888 See-Also: #9117 See-Also: #9119
Part of this change is implementing a "no secrets" policy for CI. Given that we have to support CI for arbitrary external contributors, and it is easy to craft a malicious PR that exfiltrates secrets during a CI build, any test that runs under CI must be able to do so without any injected secrets. This means that several of the tests we currently run under Drone will not be run on GCB, at least as part of the regular CI. The plan is to create a separate task that periodically runs tests that require external credentials (e.g. Kube tests, various backend data stores, etc.) in a more secure way and reports failures asynchronously. And while these tests will not run under CI, they should still be built under CI so that required changes are caught during review. See-Also: #8608 See-Also: #8643 See-Also: #8888 See-Also: #9117 See-Also: #9119
The race condition detector is being tripped by a concurrent `Write` and `Close` in the `PipeNetCon` in several integration tests.
Example race detector report:
This is a naive fix to serialise the write and close operations to resolve the race condition.
The affected tests were also not handling asynchronous error reporting correctly (i.e. it's not legal to call `require.XYZ()` from a goroutine other than the one executing the test function). This patch introduces some plumbing to marshal asynchronous errors back into the main test routine before failing the test.
This is an incredibly naive way to address the race condition, and I hope that the review process will help bring out a better solution.