Faster Rational-like type #11522
One option, suggested by @StefanKarpinski in #8672, is to relax the normalization requirement. We could take this even further and get rid of the coprime requirement altogether. We could keep the current operations |
We could have a function that returns the same value in reduced form on demand. |
Another option (perhaps encompassing Stefan's proposal) would be to do fast, non-cancelling operations by default, and only fall back on cancelling operations when overflow is detected. |
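A rough sketch of that fallback scheme (the `FastRat` type and `add_fast` name are hypothetical; the `Base.Checked` overflow-bit helpers are from later Julia versions):

```julia
using Base.Checked: add_with_overflow, mul_with_overflow

struct FastRat{T<:Integer}
    num::T
    den::T
end

function add_fast(x::FastRat{T}, y::FastRat{T}) where {T}
    # Fast path: num = x.num*y.den + y.num*x.den, den = x.den*y.den, no gcd.
    a, f1 = mul_with_overflow(x.num, y.den)
    b, f2 = mul_with_overflow(y.num, x.den)
    n, f3 = add_with_overflow(a, b)
    d, f4 = mul_with_overflow(x.den, y.den)
    if f1 | f2 | f3 | f4
        # Slow path, only on overflow: reduce via the exact Rational type
        # (which itself errors if even the reduced result overflows T).
        r = (x.num // x.den) + (y.num // y.den)
        return FastRat{T}(numerator(r), denominator(r))
    end
    return FastRat{T}(n, d)
end
```

Note that the fast path never cancels, so results accumulate common factors until an overflow forces a reduction.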
@timholy Do you have some suggestions for useful benchmarks? |
I've posted https://github.com/timholy/Ratios.jl as a playground (and because I need this for Interpolations, and as @tlycken pointed out it's better not to bury it inside Interpolations). Feel free to play here or elsewhere. |
Operationally, I'd say anything fast enough so that it's not a bottleneck for Interpolations is currently the benchmark I care about 😄. |
This was the first thing that sprang to mind too. When coupled with simplification only when absolutely needed (i.e. display, or querying the numerator/denominator), it could be quite nice. I assume (LOL) that it'd be closer to Ratios.jl performance than the current Rationals, but... not sure. The Ratios code is so simple that it should be blisteringly fast (SIMD-able even) |
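For reference, the stripped-down approach can be sketched like this (illustrative names, not Ratios.jl's actual API):

```julia
# Minimal rational-like type: no gcd/div anywhere in the arithmetic,
# so + and * compile to a handful of integer ops (SIMD-friendly).
struct SimpleRatio{T<:Integer} <: Real
    num::T
    den::T
end

Base.:+(x::SimpleRatio, y::SimpleRatio) =
    SimpleRatio(x.num * y.den + y.num * x.den, x.den * y.den)
Base.:*(x::SimpleRatio, y::SimpleRatio) =
    SimpleRatio(x.num * y.num, x.den * y.den)

# Reduce only when the canonical form is actually needed
# (display, querying numerator/denominator).
canonical(x::SimpleRatio) = (g = gcd(x.num, x.den); SimpleRatio(x.num ÷ g, x.den ÷ g))
```

The trade-off is exactly the one discussed above: intermediate results silently accumulate common factors and can overflow much sooner.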
I tested out the idea. Unfortunately, the resulting performance is somewhat disappointing: using @timholy's test on JuliaMath/Interpolations.jl#37, it makes Interpolations slightly slower than Grid, though nowhere near as slow as using the Rational type (see here for the test script). |
I've also added a
Given that we still get an order of magnitude speedup, I think this is worth pursuing. We could also then add (possibly unexported) |
Pretty compelling to me. It looks like you are using exceptions, so is it plausible that an Int-specific version that checks for overflow without exceptions could get to something like 10x slower? |
+1 for the experiment, and the 10x speedup. Interpolations will still use the blisteringly-fast unchecked variants, but I agree this is quite promising. |
The main issue with the checked stuff is that we only expose it via exceptions, which are a total performance trap. We need to expose some way of doing operations and then checking the overflow bit. |
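Later Julia versions expose exactly this via `Base.Checked`: each operation returns the result together with the overflow bit, with no exception on the hot path. A sketch of the pattern (`sum_or_widen` is a made-up example):

```julia
using Base.Checked: add_with_overflow

# Sum that inspects the overflow bit directly instead of catching an
# exception; widening to Int128 is the illustrative slow-path fallback.
function sum_or_widen(xs::Vector{Int})
    acc = 0
    for x in xs
        acc, bad = add_with_overflow(acc, x)
        bad && return sum(widen.(xs))  # slow path only when the bit is set
    end
    return acc
end
```

(The return type is not stable here, Int vs. Int128; a real implementation would pick one representation, but the exception-free control flow is the point.)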
There are currently packages where EDIT: As an explicit example of potential subtle breakage, |
I don't really have time to play around with this much more at the moment, but I did arrive at this:

function null_checked_add(x::Int, y::Int)
    n, x = Base.llvmcall("""
        %3 = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %0, i64 %1)
        %4 = extractvalue { i64, i1 } %3, 1
        %5 = zext i1 %4 to i8
        %6 = extractvalue { i64, i1 } %3, 0
        %7 = insertvalue { i8, i64 } undef, i8 %5, 0
        %8 = insertvalue { i8, i64 } %7, i64 %6, 1
        ret { i8, i64 } %8""",
        Tuple{Bool,Int64}, Tuple{Int64,Int64}, x, y)
end

This returns a (Bool, Int64) tuple: the overflow flag and the sum. |
I was thinking along the same directions, minus all the |
Not that I know of: the current |
This is a great use of llvmcall. We could adjust the intrinsics to return the overflow bit, and then throw the exception in a julia-level definition, but we don't want to add many more intrinsics. |
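The split Jeff describes ("intrinsic returns the bit, Julia throws") can be sketched at the Julia level, using the overflow-bit helpers that later landed in `Base.Checked` (`my_checked_add` is a hypothetical name):

```julia
using Base.Checked: add_with_overflow

# Julia-level definition: the primitive just hands back the overflow
# bit; the exception is raised here, outside the intrinsic.
function my_checked_add(x::T, y::T) where {T<:Integer}
    n, overflowed = add_with_overflow(x, y)
    overflowed && throw(OverflowError("$x + $y overflows $T"))
    return n
end
```

Callers who want exception-free control flow use `add_with_overflow` directly; callers who want the old behavior use the throwing wrapper.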
+1 to this – I was thinking that as well. |
Is #11604 needed to use |
Apparently not: I assume because they've already been declared by |
sadd works, but when trying the above with smul I get an error.
It may be that I'm doing something else wrong! |
Ah, sorry I missed that. Yes, you're right (though if you run it a second time it does work correctly). |
@JeffBezanson This is one thing I have often wondered: do we actually need most of the intrinsics? Would there be any disadvantage to using llvmcall for a lot of those (once #11604 is ironed out)? |
A rough plan for this issue:
|
@simonbyrne any thoughts on the issue I raised above? Renaming the fields, perhaps? Also, I assume calling |
Could keep a flag of whether it's been reduced or not and have a reduce function that returns the same value in reduced form. |
Renaming the fields seems reasonable. The idea of a flag seems reasonable, though perhaps worth having some examples of where this might be a problem. |
Could use the sign of the denominator or something like that. We've also talked about having a separate powers of two field, which would give bigger range and make it possible to represent all floating-point values, which would be pretty useful. |
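Stefan's sign-bit idea could look roughly like this (hypothetical names; a canonical rational keeps `den > 0`, so the denominator's sign is free to carry a "known reduced" flag without growing the type):

```julia
struct FlaggedRat
    num::Int
    den::Int   # den < 0 means "known reduced"; abs(den) is the denominator
end

is_reduced(x::FlaggedRat) = signbit(x.den)
denom(x::FlaggedRat) = abs(x.den)

# Reduce at most once: a second call sees the flag and returns immediately.
function reduced(x::FlaggedRat)
    is_reduced(x) && return x
    g = gcd(x.num, x.den)
    FlaggedRat(x.num ÷ g, -(x.den ÷ g))   # negate den to mark "reduced"
end
```

Display and numerator/denominator queries would call `reduced` first; arithmetic would work on the unreduced form and clear the flag.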
A quick update: I've managed to get the llvm-checked operations working, (on the llvm-checked branch), and it's down to 6x slower than completely unchecked ops. |
Really nice! That's a heck of an improvement from 320x slower! Sounds like Base material to me (assuming we aren't planning on moving Rationals out of Base). |
What happened with this? A 20x performance increase would be a bad thing to lose to the sands of time. |
Note that |
I spent the afternoon playing with this. It seems to me that one may carry an unreduced rational if it becomes reduced on these occasions:
For all other calculation processing, the use of unreduced rationals would be ok.
I found this to be marginally faster than the current version:
Is it acceptable to use two Val{} types as a second parameter, encoding IS_REDUCED or MAY_REDUCE? That is a way to work without a state field, letting calculations with unreduced rationals proceed unless there is overflow. The only other way that respects the type's size, as I read above, appropriates the denominator's sign bit as the state bit ( signbit(den) ? IS_REDUCED : MAY_REDUCE ). To date, Julia base has stayed away from reclaiming an internal bit of a built-in numeric type (I have). |
Nice work, @JeffreySarnoff. It would be great to have a faster rational type based on this approach. I'm not enthused about the type parameter indicating reduction status, but maybe it would be ok? At that point, we could actually just have reduced and unreduced rational types. I.e. this:

abstract Rational{T<:Integer} <: Real
immutable ReducedRational{T<:Integer} <: Rational{T} ... end
immutable UnreducedRational{T<:Integer} <: Rational{T} ... end

Then some operations would produce reduced rationals, while others would produce unreduced ones. Of course, the trouble is that you can't always predict statically when you'll get which, which is why I don't think it really helps. Instead, I think having some sort of reduced flag to avoid repeated reduction would be the way to go. |
👍 to run-time checking of the flag (I'd bet money that using the type system for this would make things worse). |
This is a proof of concept. To keep the type constant, there is no widening. With element types of Int64 or Int32 the speedups are utilitarian. |
Any progress on this? As @oscardssmith said, "A 20x performance increase would be a bad thing to lose to the sands of time." |
Which of these approaches should we use to handle overflow? Rationals tend to grow their significant digits.
|
I was trying to compute the 1000th harmonic number exactly. I have to use BigInt in this case, and it seems to be a bit slow. |
Try this for calculating harmonic numbers:
I get 20x, and 10x for n = 1_000. |
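For reference, here is the straightforward exact computation alongside a divide-and-conquer variant of the kind that tends to give speedups like the one reported (function names are illustrative):

```julia
# H_n = 1 + 1/2 + ... + 1/n, computed exactly with BigInt rationals.
harmonic(n) = sum(big(1)//k for k in 1:n)

# Pairwise (divide-and-conquer) summation keeps intermediate numerators
# and denominators small, so each gcd/normalization step is cheaper.
harmonic_pair(lo, hi) = lo == hi ? big(1)//lo :
    (mid = (lo + hi) >>> 1; harmonic_pair(lo, mid) + harmonic_pair(mid + 1, hi))
```

`harmonic_pair(1, n)` returns the same value as `harmonic(n)`; the difference is purely in the size of the intermediate fractions.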
Indeed much faster. See benchmark here. Thanks. https://newptcai.github.io/harmonic-number-and-zeta-functions-explained-in-julia-part-1.html |
Great! @StefanKarpinski |
Hi! I’m trying to use the FastRationals package for long computations associated with the harmonic numbers. |
Are you following the guidelines? Can you use Q64 staying around the tabulated sweet spot? If so, that is worth doing. What is the result when you run this:
|
I have no idea why your results are unstable.
Repeating that five times, I see these results (showing a slowdown!). Using FastQ64 tells another story (n <= 46, as FastQ64 overflows this calculation at n = 47).
|
|
I placed a large CAUTION about FastQBig at the top of the readme. |
Over in JuliaMath/Interpolations.jl#36 (comment) and JuliaMath/Interpolations.jl#37 it was discovered that doing computations with Rational is slow, because basically every usage calls gcd and div. The advantage of calling gcd and div is that it makes the type much less vulnerable to overflow, and that is a Good Thing. But as we discovered, certain computations may not need that kind of care, so there may be room for a faster variant. Switching to a stripped-down variant provided an approximate 50-fold speed boost.

I suspect certain computations may demand an implementation that is as minimalistic as that Ratio type. There may also be an area of intermediate interest, where a Rational-like object is represented in terms of pre-factorized numbers, perhaps numerator and denominator each being a Dict{Int,Int} representing the base and power of the factors.
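The pre-factorized representation could be prototyped like this (hypothetical names; only multiplication is shown, since addition would force conversion back to a common denominator anyway):

```julia
# A positive rational as prime => exponent; negative exponents are
# denominator factors. Multiplication is just exponent addition.
const FacRat = Dict{Int,Int}

function mul!(x::FacRat, y::FacRat)
    for (p, e) in y
        x[p] = get(x, p, 0) + e
        x[p] == 0 && delete!(x, p)   # cancellation falls out for free
    end
    return x
end

# Convert back to an ordinary rational for display or comparison.
to_rational(x::FacRat) =
    prod(Rational{BigInt}(p)^e for (p, e) in x; init = big(1)//1)
```

Multiplication and division never need gcd in this form; the cost moves to addition, which is why this sits in the "intermediate interest" territory the issue describes.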