-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Multi VU deadlock by isolating browser contexts #1219
Conversation
87aa32f
to
518806c
Compare
d2ec0ff
to
570bd88
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self-review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What an amazing outcome from this PR! 🎉 🥇
Since it touches some areas that are heavily asynchronous and quite complex I've asked some follow up questions to help me understand some of the changes and how they help improve the runtime stability.
Thank you for the enthusiastic feedback! 🙏 ❤️ As detailed in the discussion, the results of this PR were intentionally planned and are a direct outcome of those design choices. Seeing the team's efforts pushing our project forward is great. 😊 🥳 |
365d98e
to
01a1d06
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🚀 🌕
This allows us to detect whether a session that we're trying to close was in the connection's sessions. This way, we can skip. See the next commit for details.
The connection now skips doing session read and browser module event generation and management. This way, we prevent unnecessary communication overhead.
This callback will be called to manage the connection session handling from outside. Since Browser is the creator of Connection, we're adding this callback to let Browser hook into target attachment and tell Connection to stop attaching the target if Browser doesn't want it. This is a downside of the current design. There is a symbiotic relationship between Connection and Browser. Perhaps they should be together, not separate this way. But for the current design, this is the best we can do to improve Browser's session management.
This way, we can listen target attachment events from Connection, and veto Connection's decision of adding a Session to its internal list.
This isn't yet functional until we add other mechanics in later commits. For now, it allows Connection to attach a target if it's in the same BrowserContext (or in the default context). We don't use `defer` for the lock because we don't want introduce any lock overhead. Local tests showed us that holding the lock for the entire function is prone to unexpected deadlock issues. This is actually for a future code addition that can easily miss this fact.
This and the previous commit allow us to stop sharing browser contexts.
This method tells the browser to stop waiting for the attached target. Otherwise, the browser would indefinitely wait for the target that we don't want to be attached. This way, we prevent resource overuse.
When Browser doesn't want a target to be attached, Connection tells the browser to stop waiting for debugger. This prevents the browser from indefinitely waiting for a target (that we don't want) to be attached.
This will be useful to register the onTargetAttachedToTarget event. See the next commit for the reason.
NewConnection may start recv/send loop before the onTargetAttachedToTarget callback is attached. Extracting start() goroutine loops from NewConnection and calling it explicitly after registering the callback prevents this problem.
This method was returning +1 pages. This fix is necessary for the test in the following commit to work.
This test ensures that we will no longer share pages across browser contexts.
- Removes start() - Moves onTargetAttachedToTarget to the constructor. - Moves implicit goroutine run to the constructor. - Updates tests to work with a nil hook.
c8416b3
to
a6a8cb9
Compare
What?
With this fix, the browser module can now work with an unlimited number of VUs and iterations (limited by the machine power and the browser, of course). As explained in #1112, this PR separates browser contexts to make each iteration focus solely on its pages rather than those created by other iterations. This brings:
Tested with the script in #971 and saw no deadlocks.
Note: This PR aims to apply this fix with minimum code changes to avoid disrupting the module's working.
Why?
Please take a look at #1112 and #971 for more details.
We're adding a separate
onAttachedToTarget
hook to act precisely even before the module's event system is used to carry on theonAttachedToTarget
event. This way, we can precisely controlConnection
's attachment decisions at the moment that happens. This also prevents other related concurrency issues betweenBrowser
andConnection
event management.In an ideal world, the session management should be outside of
Connection
, andBrowser
would be managing sessions itself. Currently, this is the cleanest way (IMO) to do this.The effects of this PR on other open issues
#971: Deadlock: Multi-VU many iterations
There are no longer deadlocks, even with 1000 VUs. See test run ID: 167467.
#966: Creating new page in browser context sometimes times out
We no longer see these errors. See test run ID: 167467. However, there were plenty in test run ID 167336 (before this fix was applied).
#970: Uncaught (in promise) when navigating due to time out
This still occurs. But this is about the server(s) under the test unable to keep up with the load. When I manually loaded the page on my browser, it gave me 503/502 many times. Here's the test run ID 167467's log:
Failing here looks normal to me.
Checklist
Related PR(s)/Issue(s)
Updates #1112, #971, and #966 (even maybe #970).