Optimize Math.Pow(x, c) where c is 2, 1, -1 or 0 #31978
If you can do it at import with consts, would it be worth going higher, e.g. to 5? In your linked example, 3-4 crops up. For 5 I was going to suggest smootherstep. However, you'd probably write it like x * x * x * (x * (x * 6 - 15) + 10) |
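For illustration, the factored smootherstep expression quoted above can be compared with a version that spells out the expanded polynomial 6x^5 - 15x^4 + 10x^3 using `Math.Pow`. This is only a sketch: the class and method names are hypothetical, and the expanded form is the standard smootherstep polynomial rather than code taken from this PR.

```csharp
using System;

static class SmootherStepSketch
{
    // Expanded polynomial 6x^5 - 15x^4 + 10x^3, written with Math.Pow calls.
    public static double WithPow(double x) =>
        6 * Math.Pow(x, 5) - 15 * Math.Pow(x, 4) + 10 * Math.Pow(x, 3);

    // The factored form from the comment: only multiplications and additions.
    public static double Factored(double x) =>
        x * x * x * (x * (x * 6 - 15) + 10);

    static void Main()
    {
        foreach (double x in new[] { 0.0, 0.25, 0.5, 0.75, 1.0 })
        {
            Console.WriteLine($"{x}: pow={WithPow(x)}, factored={Factored(x)}");
        }
    }
}
```

Code written in the factored style never calls `Pow` with exponents 3 to 5 in the first place, which is presumably why higher constants matter less in practice.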
@benaadams If I understand you correctly I can't optimize other constants in "safe math" mode, e.g. vmulsd xmm0, xmm0, xmm0
vmulsd xmm0, xmm0, xmm0 (a single xmm0 register!) but it might return a slightly different value (and violate the ieee754 spec) |
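A rough way to check the "slightly different value" concern is to compare a two-multiply evaluation of x^4 against `Math.Pow(x, 4)` bit for bit. The sketch below assumes exponent 4 as the example and uses hypothetical helper names; it only counts disagreements and makes no claim about how many there are on a given platform.

```csharp
using System;

static class PowVsMultiplySketch
{
    // x^4 as two dependent multiplications: square, then square again.
    public static double Pow4ByMultiply(double x)
    {
        double squared = x * x;
        return squared * squared;
    }

    // Counts samples in [1, 2) where the multiply chain and Math.Pow
    // produce results that differ in at least one bit.
    public static int CountMismatches(int samples, int seed)
    {
        var rng = new Random(seed);
        int mismatches = 0;
        for (int i = 0; i < samples; i++)
        {
            double x = 1.0 + rng.NextDouble();
            long viaMultiply = BitConverter.DoubleToInt64Bits(Pow4ByMultiply(x));
            long viaPow = BitConverter.DoubleToInt64Bits(Math.Pow(x, 4));
            if (viaMultiply != viaPow)
            {
                mismatches++;
            }
        }
        return mismatches;
    }

    static void Main()
    {
        Console.WriteLine($"mismatches: {CountMismatches(100000, 42)} of 100000");
    }
}
```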
cc: @tannergooding |
CI failures are unrelated (#31985) |
What about: |
Can be added I guess but should be careful with side-effects, I wanted to optimize |
For reference, the IEEE spec defines the following behavior for
A couple of the conditions aren't valid because we don't support signalling NaN nor do we support floating-point exceptions. The C Language Standard also matches this behavior in |
So I guess we better skip |
No, that is fine to optimize. The point of my comment is that we don't support |
@tannergooding any idea why on arm64 Should I remove the |
The result is |
Thanks for the explanation, so should I give up on |
You'll be likely to hit the same types of issues with
It would be good to make sure this isn't C# or the JIT doing constant folding on |
Will get back to it later (to keep the number of active PRs smaller) |
🤔 hm... looks like I have to do this optimization later since LICM is not
for (int i = 0; i < 1000; i++)
{
    Console.WriteLine(MathF.Pow(x + 2, 2));
}
Without this PR optimization, this Pow() is hoisted. |
Right, it can't hoist assignments (see #35735 for example). |
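To make the hoisting discussion concrete, here is a source-level sketch of the loop above with the invariant computation moved out by hand. It is not what the JIT emits; the method names are hypothetical, and summing the values instead of printing them is a change made only so the two versions can be compared.

```csharp
using System;

static class HoistSketch
{
    // Loop-invariant Pow call left inside the loop, as in the example above.
    public static float SumNaive(float x, int iterations)
    {
        float sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            sum += MathF.Pow(x + 2, 2);
        }
        return sum;
    }

    // Same computation with the invariant part evaluated once before the loop,
    // using the x * x form that the PR substitutes for Pow(x, 2).
    public static float SumHoisted(float x, int iterations)
    {
        float t = x + 2;        // temp for the argument
        float squared = t * t;  // computed once, outside the loop
        float sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            sum += squared;
        }
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(SumNaive(1f, 4));
        Console.WriteLine(SumHoisted(1f, 4));
    }
}
```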
Resurrects dotnet/coreclr#26552
Optimizes:
(same for MathF and float)
This time it's done in importer.cpp and handles all kinds of the first argument (introduces a temp variable if needed, e.g. for GT_CALL).
Example:
Current codegen:
New codegen:
It seems this pattern can be found in gamedev, e.g. Xenko (a game engine): https://github.com/xenko3d/xenko/search?q=Math.Pow&unscoped_q=Math.Pow
Also the dotnet/performance benchmarks use it: https://github.com/dotnet/performance/blob/8aed638c9ee65c034fe0cca4ea2bdc3a68d2a6b5/src/benchmarks/micro/runtime/Burgers/Burgers.cs
Jitdiff for bcl:
The optimization can be extended to handle more cases once some sort of fast-math mode appears in .NET Core.
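As a source-level illustration of the idea described in this PR, the sketch below shows what the four constant exponents from the title could be rewritten to, and why a temp is needed when the first argument is a call. The exact rewrites are an assumption (the obvious algebraic forms), all names are hypothetical, and the real transformation happens on JIT IR in the importer, not in C#.

```csharp
using System;

static class PowRewriteSketch
{
    public static int Calls;

    // Hypothetical side-effecting argument, standing in for a GT_CALL node.
    public static double Next()
    {
        Calls++;
        return 3.0;
    }

    // Assumed rewrites for the exponents named in the PR title.
    public static double Pow2(double x) => x * x;           // Pow(x, 2)
    public static double Pow1(double x) => x;               // Pow(x, 1)
    public static double PowMinus1(double x) => 1.0 / x;    // Pow(x, -1)
    public static double Pow0(double x) => 1.0;             // Pow(x, 0)

    // Pow(Next(), 2): the call is evaluated once into a temp and the temp
    // is multiplied by itself, so the side effect happens exactly once.
    public static double SquareOfNext()
    {
        double tmp = Next();
        return tmp * tmp;
    }

    static void Main()
    {
        Console.WriteLine(Pow2(3.0));
        Console.WriteLine(PowMinus1(4.0));
        Console.WriteLine(SquareOfNext());
        Console.WriteLine(Calls);
    }
}
```

Writing `Next() * Next()` instead of going through a temp would run the call twice, which is the side-effect hazard the temp variable avoids.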