Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

<cmath>: Use intrinsics where possible #1336

Merged
merged 6 commits into from
Nov 6, 2020
Merged

<cmath>: Use intrinsics where possible #1336

merged 6 commits into from
Nov 6, 2020

Conversation

sylveon
Copy link
Contributor

@sylveon sylveon commented Oct 2, 2020

Partial fix for #1234. This uses intrinsics where possible to enhance codegen. Due to the STL relying on the UCRT for all functions here, this is the best we can do. These changes only affect users of the float and long double overloads.

@sylveon sylveon requested a review from a team as a code owner October 2, 2020 01:18
@CaseyCarter CaseyCarter added the performance Must go faster label Oct 2, 2020
@fsb4000
Copy link
Contributor

fsb4000 commented Oct 2, 2020

Starting since this PR it is better to use cmath and std::meow instead of math.h and meow, isnt it?

@sylveon
Copy link
Contributor Author

sylveon commented Oct 2, 2020

Right, but only for float and long double now.

Can bring the same to double (and std::signbit) by manually writing the overload rather than importing it from the UCRT into the std namespace, and same goes for std::meowf and std::meowl.

I'll look into doing that.

Doing the same for the int overloads might be trickier because of the macro machinery to generate them.

@sylveon
Copy link
Contributor Author

sylveon commented Oct 3, 2020

Can't do that because any solution would break overload resolution for consumers with using namespace std;

I can still do something about the integer overloads however.

@fsb4000
Copy link
Contributor

fsb4000 commented Oct 3, 2020

By the way from C standard:

https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf

paragraph 7.1.4:

Any function declared in a header may be additionally implemented as a function-like macro defined in the header

and below they ever wrote example: https://imgur.com/a/kqzW2dk

So UCRT team also can do:

#define meow(x) __meow(x)

@CaseyCarter CaseyCarter self-assigned this Oct 7, 2020
stl/inc/cmath Outdated Show resolved Hide resolved
stl/inc/cmath Outdated Show resolved Hide resolved
stl/inc/cmath Outdated Show resolved Hide resolved
stl/inc/cmath Outdated Show resolved Hide resolved
Copy link
Member

@StephanTLavavej StephanTLavavej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this optimization - I will explore/validate changes for Clang and hopefully push something soon.

stl/inc/cmath Show resolved Hide resolved
stl/inc/cmath Outdated Show resolved Hide resolved
stl/inc/cmath Outdated Show resolved Hide resolved
@StephanTLavavej StephanTLavavej removed their assignment Nov 5, 2020
StephanTLavavej added a commit to StephanTLavavej/STL that referenced this pull request Nov 5, 2020
@StephanTLavavej
Copy link
Member

@CaseyCarter FYI, I pushed a /clr:pure fix (verified with an MSVC-internal build) after you approved.

StephanTLavavej added a commit to StephanTLavavej/STL that referenced this pull request Nov 5, 2020
@StephanTLavavej
Copy link
Member

@CaseyCarter /clr needed the same fix (this is surprising and may indicate a compiler bug but I haven't reported it).

@StephanTLavavej StephanTLavavej merged commit 26bbe2a into microsoft:master Nov 6, 2020
@StephanTLavavej
Copy link
Member

Thanks again for this codegen improvement! 😺

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants