Update to use a more efficient power of 2 check. #274
@mjp41 this does solve the regression, and our test numbers (in the admittedly not very performance sensitive Debug build) are back to normal
44c03da to dd63343
Does this actually make a difference to codegen? Most of the modified code is in static asserts, and I'm curious whether the comparisons in the dynamic checks actually compile to something different after optimisation (they're interesting compiler test cases if they do).
The modified version is much more readable than the original though, so I'm happy to see this change whether it improves codegen or not.
@davidchisnall even without LVI mitigations, using the previous implementation (d3ecd66, between 0.5.0 and 0.5.1) caused a major slowdown in our builds. With LVI mitigations, some of the tests just time out. This is all in Debug, so this isn't critical by any means, but it's still quite nice.
The runtime asserts in pointer_align_down (and friends, but especially that one) land on the hottest path, as part of {Superslab,Mediumslab}::get(), for every dealloc(). On second thought, maybe that's not true with inlining, since the alignments are constant. Hm.
As @davidchisnall says, tho', this is also a good improvement to readability and the perf effects are a nice bonus. :)
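To make the point about constant alignments concrete, here is a minimal sketch of an align-down helper in two flavours. The names `align_down_sketch` and `align_down_dynamic` and their signatures are hypothetical and are not taken from snmalloc. With the alignment as a template parameter the power-of-two check can be a `static_assert` and costs nothing at run time; with a run-time alignment the check is an `assert` that is executed on every call in a Debug build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration only; not snmalloc's API.

// Alignment known at compile time: the power-of-two check is a
// static_assert, so no check is executed at run time.
template<std::size_t alignment>
inline void* align_down_sketch(void* p)
{
  static_assert(
    alignment != 0 && (alignment & (alignment - 1)) == 0,
    "alignment must be a power of two");
  auto addr = reinterpret_cast<std::uintptr_t>(p);
  // Clearing the low bits rounds the address down to the alignment.
  return reinterpret_cast<void*>(
    addr & ~(static_cast<std::uintptr_t>(alignment) - 1));
}

// Alignment known only at run time: the check is a runtime assert,
// which stays in Debug builds and runs on every call.
inline void* align_down_dynamic(void* p, std::size_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  auto addr = reinterpret_cast<std::uintptr_t>(p);
  return reinterpret_cast<void*>(
    addr & ~(static_cast<std::uintptr_t>(alignment) - 1));
}
```

Whether a real build folds the runtime check away depends on inlining and optimisation level, which is exactly the open question in the comment above.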
So in the commit @achamayou mentioned In
to
This means that the code is not using the intrinsic, and thus uses a loop to calculate the
The rest of the changes were from reviewing our usage of the pattern
and
I thought it was nicer to rewrite. I don't expect the majority of the changes to have any codegen impact.
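For readers following along, the general difference between a loop-based power-of-two check and a constant-time one can be sketched as below. This is an illustration under stated assumptions, not the code from this PR: `is_pow2_loop` and `is_pow2` are hypothetical names, and the loop shown is only one plausible shape for a non-intrinsic fallback.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration only; not the PR's code.

// Loop-based check: shift out trailing zero bits one at a time.
// In an unoptimised build this runs up to one iteration per bit.
constexpr bool is_pow2_loop(std::size_t x)
{
  if (x == 0)
    return false;
  while ((x & 1) == 0)
    x >>= 1;
  return x == 1;
}

// Constant-time check: a power of two has exactly one bit set,
// so clearing the lowest set bit with x & (x - 1) leaves zero.
constexpr bool is_pow2(std::size_t x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

// Both forms are usable in static asserts (C++14 or later).
static_assert(is_pow2(1) && is_pow2(64), "powers of two");
static_assert(!is_pow2(0) && !is_pow2(48), "not powers of two");
static_assert(is_pow2_loop(64) == is_pow2(64), "forms agree");
```

In an optimised build the two forms may well compile to similar code; the difference matters most in Debug, where the loop is executed as written.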
@achamayou can you test whether this shows the same regression in Debug performance?