Fix output stuttering using a ticket lock #10653
May I ask what exactly the "stuttering" you mentioned is? Also, how can I repro it locally?
Are there any before/after benchmarks on this?
Would something like https://github.com/AlexeyAB/object_threadsafe be a good alternative that also improves speed?
@WSLUser While that project shows very impressive benchmark numbers, it's an unfair lock design as well and would lead to the same issue that the
@zadjii-msft I've finished benchmarking the new lock. Before:
After:
The difference appears to be within the margin of error. I'd love to make a video of the stuttering/freezes, but they only occur when the stars align, since it depends heavily on how the scheduler feels on any given day. It's pretty spectacular when it happens though, because it seemingly freezes the screen for multiple seconds.
for (;;)
{
    const auto current = _now_serving.load(std::memory_order_acquire);
    if (current == ticket)
there is provably no case where now_serving will jump multiple steps above the expected ticket value, correct?
It is possible, if someone were to call unlock() multiple times.
But if you don't, it works exactly like the Wikipedia article describes the algorithm.
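For readers following along, here is a minimal sketch of the ticket lock algorithm under discussion. It is not the til::ticket_lock from this PR: the name ticket_lock_sketch, the yield-based wait, and the helper count_with_ticket_lock are assumptions made for illustration, and the real implementation may wait differently.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for til::ticket_lock; the real class may differ.
struct ticket_lock_sketch
{
    void lock() noexcept
    {
        // Take the next ticket. Tickets are handed out in arrival order.
        const auto ticket = _next_ticket.fetch_add(1, std::memory_order_relaxed);
        for (;;)
        {
            const auto current = _now_serving.load(std::memory_order_acquire);
            if (current == ticket)
            {
                return;
            }
            // Not our turn yet. Yielding is an assumption of this sketch.
            std::this_thread::yield();
        }
    }

    void unlock() noexcept
    {
        // Admit the next ticket holder. Calling this more often than lock()
        // makes _now_serving skip ahead of tickets that are still waiting.
        _now_serving.fetch_add(1, std::memory_order_release);
    }

private:
    std::atomic<uint32_t> _next_ticket{ 0 };
    std::atomic<uint32_t> _now_serving{ 0 };
};

// Increments a plain int from several threads under the lock and returns the total.
inline int count_with_ticket_lock(int threads, int iterations)
{
    ticket_lock_sketch lock;
    int counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t)
    {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i)
            {
                std::lock_guard<ticket_lock_sketch> guard{ lock };
                ++counter;
            }
        });
    }
    for (auto& worker : workers)
    {
        worker.join();
    }
    return counter;
}
```

As long as every lock() is paired with exactly one unlock(), _now_serving only ever advances by one at a time, so waiters are admitted strictly in the order they took their tickets.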
It's interesting to consider the work we've done to make sure we were using a reader/writer lock correctly, and possibly how little of an impact it actually had on our performance. Is there a possible net loss for operations that require reading, as they now contend with operations that require writing? Ideally, reading would be much more common and therefore "fast-pathable".
@DHowett A shared lock, like the one we used before this PR, would only help with concurrent reads, as locking the mutex as a reader prevents any writers from progressing.
We can't really build a lock that fast-paths reads, because that would lead to an unfair lock again.
I love this.
As a note, I edited your pull request body to match the commit message style of the repo.
//
// But we can abuse the fact that the surrounding members rarely change and are huge
// (std::function is like 64 bytes) to create some natural padding without wasting space.
til::ticket_lock _readWriteLock;
If you add the alignas now, would that make it safer (in case somebody changes the order) and not compromise the layout by adding wasted padding space?
I ask because... this is a micro-optimization somebody could easily accidentally remove or not fully understand and fail to maintain.
alignas would add 56 bytes of padding space and force _readWriteLock to be 64 bytes large.
I had hoped that the comment I added there would be enough to prevent others from accidentally changing the order. If someone moved it next to the later members in this class, it would double the overhead of the ticket lock (from 0.03% to 0.06% CPU usage), but I think the comment should be an okay-ish reminder not to do that. I hope?
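The 56-byte figure can be checked with a small sketch. It assumes the lock consists of two 32-bit atomic counters (8 bytes total) and a 64-byte cache line; both struct names are hypothetical and the layout is not copied from til::ticket_lock.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical two-counter layout: 8 bytes on mainstream platforms.
struct plain_lock
{
    std::atomic<uint32_t> next_ticket{ 0 };
    std::atomic<uint32_t> now_serving{ 0 };
};

// Same members, but forced onto its own 64-byte boundary.
struct alignas(64) padded_lock
{
    std::atomic<uint32_t> next_ticket{ 0 };
    std::atomic<uint32_t> now_serving{ 0 };
};

static_assert(sizeof(plain_lock) == 8, "two 32-bit counters");
static_assert(sizeof(padded_lock) == 64, "size is rounded up to the alignment");
static_assert(sizeof(padded_lock) - sizeof(plain_lock) == 56, "padding added by alignas");
```

The trade-off is the one described above: alignas makes the isolation explicit and robust against member reordering, at the cost of 56 otherwise unused bytes per instance.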
I think the comment is OK. I wonder if we'd ever be possessed to remove all the std::functions though and then the trick wouldn't work anymore.
@msftbot make sure @miniksa signs off
Hello @DHowett! Because you've given me some instructions on how to help merge this pull request, I'll be modifying my merge approach. Here's how I understand your requirements for merging this pull request:
If this doesn't seem right to you, you can tell me to cancel these instructions and use the auto-merge policy that has been configured for this repository. Try telling me "forget everything I just told you".
 {
-    return std::shared_lock<std::shared_mutex>(_readWriteLock);
+    return std::unique_lock{ _readWriteLock };
If there's no difference any more between Locking for Writing and Reading thanks to the mechanics of the ticket lock... wouldn't it make sense to have one call the other to make that super clear until the future revision where you build that more complicated fair read/write lock for funsies?
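A rough sketch of what that suggestion could look like. All names here are hypothetical, and a std::mutex stands in for til::ticket_lock so that the example is self-contained.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical class; std::mutex stands in for til::ticket_lock.
class render_data_sketch
{
public:
    // Exclusive lock; the only real locking path.
    [[nodiscard]] std::unique_lock<std::mutex> LockForWriting()
    {
        return std::unique_lock<std::mutex>{ _readWriteLock };
    }

    // Readers take the same exclusive lock, made explicit by delegating.
    [[nodiscard]] std::unique_lock<std::mutex> LockForReading()
    {
        return LockForWriting();
    }

private:
    std::mutex _readWriteLock;
};

// Takes each lock in turn and reports whether both calls returned an owning guard.
inline bool both_paths_lock(render_data_sketch& data)
{
    bool ok = true;
    {
        const auto guard = data.LockForReading();
        ok = ok && guard.owns_lock();
    }
    {
        const auto guard = data.LockForWriting();
        ok = ok && guard.owns_lock();
    }
    return ok;
}
```

Keeping both entry points preserves the call sites' intent, so a fair reader/writer lock could later be swapped in without touching callers.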
SRWLOCK, as used by std::shared_mutex, is an inherently unfair mutex and makes no guarantee whatsoever whether a thread may acquire the lock in a timely manner. This is problematic for our renderer which relies on being able to acquire the lock in a timely and predictable manner. Drawing stalls of up to one minute have been observed in tests.

This issue can be solved with a primitive ticket lock, which is 10x slower than a SRWLOCK but still sufficiently fast for our use case (10M locks per second per thread). It's likely that any non-trivial lock duration will diminish the difference to "negligible".
Validation Steps Performed