-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
only use accurate powf function #19890
Conversation
Note that this reverts #2741 (comment) (e.g. restores f2b3192), since being correct is more important than being a few % faster. |
test/math.jl
Outdated
@@ -959,9 +959,13 @@ end | |||
end | |||
|
|||
@testset "issue #13748" begin | |||
let A = [1 2; 3 4]; B = [5 6; 7 8]; C = [9 10; 11 12] | |||
@test muladd(A,B,C) == A*B + C | |||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to delete this?
src/codegen.cpp
Outdated
@@ -5964,7 +5961,7 @@ static void init_julia_llvm_env(Module *m) | |||
&pow, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This breaks compilation here.
What's the reason again that we can't do the static_cast
unconditionally? Including <math.h>
in C++ code causes using std::pow;
which provides additional overloads for float
and long double
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More precisely what glibc
+ libstdc++
/libc++
do when <math.h>
is directly included by c++ code is roughly
double pow(double, double);
namespace std {
using ::pow;
float pow(float, float);
long double pow(long double, long double);
// template version for promotion...
}
using std::pow;
It shouldn't be possible to overwrite a previously defined ::pow(double, double)
so I expect MSVC to do something similar and it should be safe in general to specify the overload type using a static_cast
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main concern would be whether those are the same function. I would expect that the C++ version is getting name mangled might not end up picking up the openlibm version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that is an issue.
- These should be inline functions that calls the C
::pow
,::powf
,::powl
. - For C++ implementations that provide these it shouldn't be allowed to overwrite the existing
::pow(double, double)
symbol so the it should always be the C one. - At least in the generic linux binary
cglobal(:pow)
andccall(:pow, ..)
points to thelibm
version.
Are we moving ahead with this? If not, we should remove the 0.6.0 label. If so, it would be good to get the tests passing... |
That's the plan, just made an error in rebasing. But it also now made a good excuse to try out the new |
@nanosoldier |
replutil test failure on win32 is real - probably related to #17251, will see if locally running with edit: yep, get to choose either decent startup time or decent backtraces
|
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
Another failure in |
|
This failure is very specific to debug info in backtraces for the pow function. Not action at a distance at all, for this PR. |
looks like the name-demanger is getting confused here, nothing to do with #17251 |
Just a note that these performance reductions are both expected and intentional. It's good that we have a record of them though, and will probably want to link back here / note this on the nightly as well. |
We should probably revert exp back to using openlibm then, if changing the pow calculation used by julia is going to be this expensive. Or change its code to opt in to the other pow version, if it isn't needed for accuracy in the innards of the exp implementstion? cc @musm |
The julia code for |
could |
yes, |
It looks like |
great @nanosoldier |
base/math.jl
Outdated
^(x::Float32, y::Int32) = powi_llvm(x, y) | ||
^(x::Float16, y::Integer) = Float16(Float32(x)^y) | ||
@inline ^(x::Float64, y::Float64) = nan_dom_err(ccall("llvm.pow.f64", llvmcall, Float64, (Float64, Float64), x, y), x + y) | ||
@inline ^(x::Float32, y::Float32) = nan_dom_err(ccall("llvm.pow.f32", llvmcall, Float32, (Float32, Float32), x, y), x + y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might have missed something, but why now llvm for ^
when both arguments are floats? the libm pow function is accurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It resolves to the same function, but give an optimization hint for llvm that we mean that pow, and not just some random function named pow.
Line 508 in e577f8e
Also needs ^ = Base.FastMath.pow_fast for similar reason. Should trigger a regression.
|
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
It seems like there ought to be more efficient ways to compute |
Probably not for arbitrary |
The powi intrinsic optimization over calling powf is that it is inaccurate. We don't need that. When it is equally accurate (e.g. tiny constant powers), LLVM will already recognize and optimize any call to a function named `powf`, and produce the same speedup. fix #19872
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
so the simple version regressed - is that the Val literal lowering change making inlining or constant prop less predictable? |
Moving the value into the type domain can potentially thwart constant folding. In this case, it's this constant-value function that got broken:
|
… inlining of `^`
We should really stop dispatching on |
were you waiting for something before running @nanosoldier |
+100 #19890 (comment) The only real problem is that
which would allow it to be constant-folded and inferred just as well, so you could write |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels |
true. that then just has the problem that calling the I'll also just reiterate that the fact there's a number of methods |
Agreed; we should not have public APIs with |
anyways, enough OT - I think we can go ahead and merge this? |
Yes, seems like it's time. |
The constant-folding of powers is fixed by #20783, following the suggestion to use |
@eval exponent_max(::Type{$T}) = $(Int(exponent_mask(T) >> significand_bits(T)) - exponent_bias(T)) | ||
# maximum float exponent without bias | ||
@eval exponent_raw_max(::Type{$T}) = $(Int(exponent_mask(T) >> significand_bits(T))) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vtjnash Out of curiosity, what was the reason behind moving these functions from float.jl
to math.jl
? I had some code that imported Base.significand_bits
, and now it needs Base.Math.significand_bits
to work on master.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They were incorrectly marked @pure
, and only used here. You can import them from Base.Math
on old code also, to be compatible.
The powi intrinsic optimization over calling powf is that it is inaccurate.
We don't need that.
When it is equally accurate (e.g. tiny constant powers),
LLVM will already recognize and optimize any call to a function named
powf
,and produce the same speedup.
fix #19872