Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

decide whether to support indexing with floats #625

Closed
StefanKarpinski opened this issue Mar 21, 2012 · 14 comments
Closed

decide whether to support indexing with floats #625

StefanKarpinski opened this issue Mar 21, 2012 · 14 comments
Labels
breaking This change will break code needs decision A decision on this change is needed
Milestone

Comments

@StefanKarpinski
Copy link
Member

Currently, we support indexing with floats in most places by rounding the float value to an integer. That's been called into question a number of times. There are three obvious behaviors we could go with here:

  • Keep the current float-rounding approach for non-integer indices
  • Only allow integer indices and throw a no method error when someone indexes with floats
  • Hybrid: allow indexing with floats, but only if they're integer-valued.

We need to decide on this before v1.0; the safest approach is to disallow non-integer indexing, since that gives us the option of changing our minds later without breaking existing code.

@pao
Copy link
Member

pao commented Mar 21, 2012

@ViralBShah
Copy link
Member

I suggest we just do integer indices for the 1.0 release. I certainly don't like the current behaviour, but the hybrid behaviour is not bad to have. Perhaps we can change the current behaviour to hybrid and move to extra, rather than getting rid of it altogether.

@StefanKarpinski
Copy link
Member Author

My argument for the hybrid behavior is that for indexing in general, value is the significant aspect, not type. Consider integer indices: it's not like all integer indices are ok — the ones that are out of the range 1:length(v) are not valid. For hashing, it's even more obvious that value rather than type is the significant issue: when comparing two hash keys, you don't compare their type, you compare their value for equality using isequal. One can argue that a vector is kind of like a hash where the only valid keys are integer value 1 through n.

One wrinkle in this argument is what to do with hashes where the key has a specified type. Do we try doing a convert? In that case, the current rounding behavior may actually be natural, if one considers a vector to be like a hash where the key type is integer and only the values 1 through n are valid keys. If we don't try converting, but just to key equivalence, then checking for integer-value keys in the case of vectors makes sense. It's also quite arguable that convert from float to int shouldn't work unless the value in question is already integer-valued. In other cases one could make the case that explicit use of iround, ifloor, or iceil ought to be required.

@JeffBezanson
Copy link
Member

I kind of like the idea of giving an error for non-integer-valued or out of range values on float-to-int conversion with convert. That forces you to specify whether floor, ceil, or round is required. I would guess that's a pretty common source of off-by-one bugs. It also makes code easier to port, since there is less reliance on implicit behaviors.

I still wonder whether isequal(2,2.0) should really be true or not.

Of course a[iround(n/2)] is a bit verbose. a[(n+1)>>>1] is 2 characters shorter :)

@StefanKarpinski
Copy link
Member Author

If we started doing that, we could also have complex -> float conversions since it would just be a matter of making sure that the imaginary part was zero and then just taking the real part (for non-real-valued complexes, convert would throw an error). I like the consistency. As it is, we don't have complex -> float conversion because it's unclear what to do. But in reality, it's no more clear what to do with an float when converting it to an integer — it just happens that rounding is decent a fair amount of the time, but even that isn't the default in other languages.

@JeffBezanson
Copy link
Member

Allowing integer-valued floats as indexes slightly violates the principle of making behavior type-dependent rather than value-dependent. I say slightly because throwing an error is a relatively safe thing to do. Nevertheless, it means code can work for a while until particular values happen to be used.

@StefanKarpinski
Copy link
Member Author

Whether a function throws an exception or not is often value-dependent. In indexing, even for integers it already depends on whether the index is in range or not. I can't see why this is any different.

@timholy
Copy link
Member

timholy commented May 4, 2012

After closing #795 I vaguely remembered this discussion. I'm currently in the process of removing support even for Integers (rather than Ints) in array indexing, with the exception of linear indexing (where I think Integer can be safely kept as an alternative). This is even more extreme than the float issue.

@JeffBezanson
Copy link
Member

Although it sounds weird at first, it might make sense to allow indexing with floats but not integers other than Int. The reason is that floats arise naturally from certain operations like /, but there is no real reason to compute with, say, Int8.

A less important reason is that throwing around lots of different integer types can be a worst case for the code specializer, where you can hit an O(n^m) situation.

@StefanKarpinski
Copy link
Member Author

@JeffBezanson: although I know you're worried about a specialization explosion (a great term, btw) if we allow all sorts of integer index types, I think this is one of those cases where that could happen, but in practice won't. Why would it? Also, can we just put a single more general conversion method over all ref calls and just be done with it? Something that will catch all cases where all the indices aren't Int and just convert them?

@JeffBezanson
Copy link
Member

We should experiment with that, but it's hard to do without lots of method ambiguities. And implementations like looping over the arguments to convert non-Ints are too slow to be useful. Actually this is another thing that can be done easily if staged methods are part of the compiler.

@johnwcowan
Copy link

I think it should be an error to use floats. People will write loops running from 0.0 to 1.0 by 0.1 (which already is problematic) and multiply the loop variable by 10 to use as an index, sometimes getting the wrong index (particularly since you are rounding rather than truncating). Float-typed indices are most likely a design bug.

@JeffBezanson
Copy link
Member

I'm ok with disallowing floats but many people want them. Our loops of the form for i in 0.0:0.1:1.0 do not have this problem, and even when I tried doing the loop the wrong way rounding seemed to do the right thing:

julia> while i<=1.0
       println(iround(10i))
       i += 0.1
       end
0
1
2
3
4
5
6
7
8
9
10

@StefanKarpinski
Copy link
Member Author

The floating-point errors don't accumulate to give a result that's off by more than 0.5 here, which, using rounding for conversion does the right thing (that's why rounding should be the default way of coercing floats to ints). I think we should favor convenience-with-safety (floats allowed but an error for non-integral floats) over less convenience with a better guarantee of performance. If people write stupid loops, that's their problem, but it can be easily fixed. It should, however, IMO, work as long as it makes sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking This change will break code needs decision A decision on this change is needed
Projects
None yet
Development

No branches or pull requests

6 participants