FloatRange: new type for accurate floating-point ranges [fix #2333] #5636
Conversation
Very cool! Does this generally give the same values as …
Can we have a few more such algorithms? I'm thinking of functions like …
+1 for …
I think that's in the forthcoming …
@JeffBezanson, re matching …:

```julia
julia> Base.showcompact(io::IO, x::Float64) = show(io,x)
showcompact (generic function with 8 methods)

julia> [0.3:0.1:1.1 linspace(0.3,1.1,9)]
9x2 Array{Float64,2}:
 0.3  0.3
 0.4  0.4
 0.5  0.5
 0.6  0.6000000000000001
 0.7  0.7000000000000001
 0.8  0.8
 0.9  0.9
 1.0  1.0000000000000002
 1.1  1.1
```

As you can see, even though …
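The effect in the table can be reproduced outside Julia. Here is an illustrative Python sketch (the thread's code is Julia; the names and the `/10` lifting here are my own, not the PR's code) contrasting naive per-element arithmetic with integer-lifted computation:

```python
# Naive range values: compute start + k*step directly in floating point.
naive = [0.3 + k * 0.1 for k in range(9)]

# Lifted range values: step through the integers 3..11, divide once by 10.
lifted = [(3 + k) / 10 for k in range(9)]

print(naive[3])   # not exactly 0.6: accumulated rounding error
print(lifted[3])  # the correctly rounded double for 0.6
```

Each lifted element costs a single division, which is also where the indexing overhead discussed in this thread comes from.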
Maybe `linspace` should be implemented using ranges after this.
No, I think …
@timholy …
Man, I'd looked everywhere for that function. Thanks!
Since the …
I haven't looked at performance at all, so I'm not sure. I would imagine it's a bit slower.
… gives, on my computer, about a factor of 7 penalty for the divisions in a simple loop.
Yeah, that seems about right. Is it worth it for more accurate/intuitive results? Keep in mind that floating-point ranges with a unit step don't have to pay the price.
A unit step is probably not the common case for float ranges.
That's quite true. So the question is whether we care more about indexing performance for float ranges or accuracy/intuitiveness. Note that using `linspace` has similar performance issues.
I really like the more accurate ranges, and think the performance penalty is well worth it. Getting the wrong number of iterations in a loop because of roundoff errors is surprising to most beginners, and also to lots of more seasoned programmers. What I have not figured out is why we want to use step_length instead of num_steps for a floating-point range, but integer range similarity and Matlab compatibility is probably a good enough reason. There is a fine balance between where we should try to do what the user intended and where we should give single-instruction performance. The overflow behaviour for integers is an example where we (for now) have opted for the performant "non obvious" choice.
I agree with Ivar in that I like the more accurate ranges. If this became …
```julia
(a, b, c) = (0.0, 0.1, 1.0)
for i in FastButInaccurateRange(a, b, c)
    #use i
end
# or
for ind in 1:int(c / b)
    i = a + b*ind
    #use i
end
```
You don't even need any of that – you can just explicitly construct a regular …
I hadn't noticed that you created a new type...
Yes. Ultimately floating-point ranges and "ordinal ranges" just shouldn't be handled in the same way. This is work towards solving the rangepocalypse umbrella issue.
This really looks beautiful! @StefanKarpinski, why do you think …
I suppose that …
This range implementation is great, but a 7x slowdown does give me pause. It shouldn't be possible to beat the standard library by that much with such a simple-but-annoying rewrite. We don't want to have a large catalog of faster-version-of-X types of things.
I wonder how many situations there really are where floating-point range indexing is a bottleneck. Seems like a very strange thing to have as a bottleneck to me.
Yes, I doubt it would be a bottleneck by itself, but what can happen is you might use 10 pieces of the language/library, collectively using 60% of run time, and each piece is 4-5x slower than it could be. It starts to make a big dent if we do this in several key components. Some more realistic benchmarks would be helpful though.
As much as we don't want a large catalog of "faster-version-of-X types of things", we also don't want a large catalog of "correct-version-of-X types of things". In the case of floating-point ranges, getting unexpected values is a perennial problem, whereas I've just never encountered a situation where I really cared that much about the performance of indexing into or iterating over them. That doesn't mean it's not an issue, but it's just not clear to me that this is a place where we should prioritize performance over ease-of-use.
For what it's worth, I tend to agree with @StefanKarpinski that the tradeoff is worth it here. Otherwise roundoff errors make floating-point ranges so annoying that I end up just not using them.
I agree too. This is just the place where you want magic, even if it's somewhat slow magic.
While this is a great idea, I think we should be very careful when using the terms "correct" and "accurate": these ranges are really "the best rounded, rational approximation to the range that you specified". Arguably, a …

On a related note, our … and yes, it is true that the exact midpoint of …
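To make "best rounded, rational approximation" concrete, here is a hedged Python sketch using the standard library's `fractions.Fraction.limit_denominator` as a stand-in for the PR's customized rationalize function:

```python
from fractions import Fraction

# The double closest to 0.1 is not exactly 1/10, but 1/10 is the simplest
# rational that rounds to that double -- and what a user typing 0.1 meant.
exact = Fraction(0.1)                           # 3602879701896397/36028797018963968
guess = Fraction(0.1).limit_denominator(2**24)  # best rational with a small denominator

print(guess)                # 1/10
print(float(guess) == 0.1)  # True: the guess reproduces the exact double given
```

The check in the last line is the important part: the range only "snaps" to the rational interpretation when that interpretation rounds back to exactly the floating-point values the user supplied.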
Also, it seems that using any …
Yeah, that's the kind of thing that's holding me back from merging it. It still has some issues. I also think that for this to be merged, it needs to have a pretty well-defined behavior, so there's that too. But it's getting there, and it already behaves better than any other float range implementation that I've seen.
Just pushed a patch that ought to fix this. I had it in my workspace but forgot to update …
This failure [1] is completely mystifying to me. If anyone has any bright ideas as to why this patch would make …
Looks like the actual error is being masked:

```julia
julia> convert(Float16, 1.0f0)
float16(Evaluation succeeded, but an error occurred while showing value of type Float16:
ERROR: no method display(Float16)
 in display at multimedia.jl:158
```
This bug is surely unrelated to this change. |
The display error isn't actually the issue. I filed an issue for the actual problem: #5885.
These additional operations are needed to get the random and linalg tests closer to passing for the FloatRange change [#5636]. We really need to make a decision about how first-class we want Float16 to be. I'm starting to suspect that we should just bite the bullet and make all the things that work for other floating-point types work for the Float16 type also. It's just easier than having them be half-broken and trying to explain that they're "just for storage".
There's a slight chance that computing `stop` with integers and then converting to floating point would work, but doing the computation in floating point would not give the correct answer. This tests actual values – with the correct type – that will be used.
Obviously we don't want to leave things like this, but with this workaround, we can merge the FloatRange branch and figure out the root cause of #5885 later.
Ok, this now finally works. I'm gonna merge it.
FloatRange: new type for accurate floating-point ranges [fix #2333]
Do we want the "snapping" behavior here?
The old …
It's hard to argue that 1.9999999999999998 should be considered equal to 2. I don't think we should try to infer what arbitrary floating-point computations were meant to compute. Taken to its logical conclusion, that would mean that any value could have been meant to be any other value. What's the rule that allows this? Always snap to the nearest integer? The current behavior guesses start, step and stop values that are ratios of 24-bit integers and give the exact floating-point values given. That seems like a reasonable and well-defined place to stop guessing.
I can live with that. But the rule is very simple: range lengths are normally rounded down, but if the computed length is within ~3 ulps of the next larger integer it is rounded up instead.
Why 3 and not 4 ulps? What about very short step lengths? Should the rounding be relative to the step (e.g. round up if `round_error < step/10000`)?
The rounding is done against …
I think the point is that all of those things are arbitrary, whereas there is nothing about this …
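The length rule described above (round the ratio down, but snap up if it lands within a few ulps of the next integer) can be sketched as follows. This is a hypothetical Python illustration of the rule as stated; the function name, threshold parameter, and endpoint counting are my assumptions, not the PR's Julia code:

```python
import math

def range_length(start, step, stop, ulps=3):
    # Number of elements in start:step:stop. The computed ratio is
    # normally rounded down, but if it is within a few ulps below the
    # next larger integer, snap up instead (hypothetical sketch).
    n = (stop - start) / step
    if math.ceil(n) - n <= ulps * math.ulp(n):
        return math.ceil(n) + 1   # snapped: count both endpoints
    return math.floor(n) + 1

print(range_length(0.0, 0.1, 0.3))  # 4, even though 0.3/0.1 == 2.9999999999999996
```

Without the snap, `0.0:0.1:0.3` would lose its last element, which is exactly the beginner-surprising behavior discussed earlier in the thread.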
This algorithm for floating-point ranges magically reads your mind.

One can get the plain

```julia
[ r.start + k*r.step for k = 0:r.len ]
```

behavior by setting `len = s` and `divisor = 1`.

… `s`, using a customized rationalize function. If a lifting is found for the `(start,step,stop)` tuple, then we use lifted iteration, stepping by an integer amount and projecting back down to get a perfect floating point range. Otherwise, it falls through to the default behavior.

Once the `FloatRange` parameters are determined, iteration and indexing are quite simple: the values of the range are

```julia
[ (r.start + k*r.step)/r.divisor for k = 0:r.len ]
```
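Putting the description together, here is a hypothetical end-to-end Python sketch of the lifting scheme, with `fractions.Fraction.limit_denominator` again standing in for the customized rationalize function. This is my illustration of the idea, not the PR's actual Julia implementation:

```python
from fractions import Fraction
from math import lcm

def float_range(start, step, stop, maxden=2**24):
    # Try to lift (start, step, stop) to integers over a common divisor.
    a, s, b = (Fraction(x).limit_denominator(maxden) for x in (start, step, stop))
    if (float(a), float(s), float(b)) == (start, step, stop):
        # Lifted iteration: step by an integer amount, then project back
        # down with one (correctly rounded) division per element.
        d = lcm(a.denominator, s.denominator)
        ia, istep = int(a * d), int(s * d)
        n = int((b - a) / s)          # exact rational arithmetic
        return [(ia + k * istep) / d for k in range(n + 1)]
    # Otherwise, fall through to the naive behavior.
    n = int((stop - start) / step)
    return [start + k * step for k in range(n + 1)]

print(float_range(0.3, 0.1, 1.1))
# [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1]
```

Because each element is produced by a single exact integer computation followed by one division, every value is the correctly rounded double, which is why `0.3:0.1:1.1` hits 0.6 and 1.1 on the nose.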