Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to use a more efficient power of 2 check. #274

Merged
merged 1 commit into from
Jan 27, 2021

Conversation

mjp41
Copy link
Member

@mjp41 mjp41 commented Jan 27, 2021

@achamayou can you test if this shows the same regression in Debug performance.

@mjp41 mjp41 requested review from nwf and achamayou January 27, 2021 11:15
@achamayou
Copy link
Member

@mjp41 this does solve the regression, and our test numbers (in the admittedly not very performance sensitive Debug build) are back to normal

73:         ✓ creates numeric polls (824ms)
73:         ✓ creates string polls (825ms)
73:         ✓ rejects creating polls with an existing topic (1668ms)
73:         ✓ rejects creating polls without authorization (757ms)
73:       POST /
73:         ✓ creates multiple polls (814ms)
73:         ✓ rejects creating polls with an existing topic (1722ms)
73:         ✓ rejects creating polls without authorization (726ms)
73:       PUT /{topic}
73:         ✓ stores opinions to a topic (1581ms)
73:         ✓ rejects opinions with mismatching data type (1583ms)
73:         ✓ rejects opinions for unknown topics (770ms)
73:         ✓ rejects opinions without authorization (1447ms)
73:       PUT /
73:         ✓ stores opinions to multiple topics (1581ms)
73:         ✓ rejects opinions with mismatching data type (1535ms)
73:         ✓ rejects opinions for unknown topics (802ms)
73:         ✓ rejects opinions without authorization (1454ms)
73:       GET /{topic}
73:         ✓ returns aggregated numeric poll opinions (9383ms)
73:         ✓ returns aggregated string poll opinions (9359ms)
73:         ✓ rejects returning aggregated opinions below the required opinion count threshold (8525ms)
73:         ✓ rejects returning aggregated opinions for unknown topics (793ms)
73:     /csv
73:       GET|POST /
73:         ✓ stores and returns opinions of authenticated user as CSV (1326ms)

Copy link
Collaborator

@davidchisnall davidchisnall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this actually make a difference to codegen? Most of the modified code is in static asserts, and I'm curious whether the comparisons in the dynamic checks actually compile to something different after optimisation (they're interesting compiler test cases if they do).

The modified version is much more readable than the original though, so I'm happy to see this change whether it improves codegen or not.

@achamayou
Copy link
Member

@davidchisnall even without LVI mitigations, using the previous implementation (d3ecd66, between 0.5.0 and 0.5.1) caused a major slowdown in our builds. With LVI mitigations, some of the tests just time out.

This is all in Debug, so this isn't critical by any means, but it's still quite nice.

Copy link
Collaborator

@nwf nwf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runtime asserts in pointer_align_down (and friends, but especially that one) land on the hottest path, as part of {Superslab,Mediumslab}::get(), for every dealloc(). On second thought, maybe that's not true with inlining, since the alignments are constant. Hm.

As @davidchisnall says, tho', this is also a good improvement to readability and the perf effects are a nice bonus. :)

@mjp41
Copy link
Member Author

mjp41 commented Jan 27, 2021

So in the commit @achamayou mentioned

In align_up and align_down, there was a change from:

 SNMALLOC_ASSERT(next_pow2(alignment) == alignment);

to

 SNMALLOC_ASSERT(next_pow2_const(alignment) == alignment);

This means that the code is not using the intrinsic, and thus uses a loop to calculate the next_pow2. As the issue was in debug, I don't think the loop will be optimised away.

The rest of the changes were from reviewing our usage of the pattern

next_pow2(???) == ???

and

??? == next_pow2(???)

I thought it was nicer to rewrite. I don't expect the majority of the changes to have any codegen impact.

@mjp41 mjp41 merged commit a3660c4 into microsoft:master Jan 27, 2021
@mjp41 mjp41 deleted the check_power_of_2 branch January 27, 2021 11:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants