Reconsider sum/prod/trace upcasting for floating-point dtypes #731
As I mentioned a few times before, I agree with not specifying this. Partly because I think it is just asking too much from NumPy (and apparently others).
Since all libraries appear to do exactly the same as of today, what's the problem with encoding that? Nothing is being asked from anyone at that point, it's basically just documenting the status quo.
I wouldn't have been surprised if someone upcast for float16, but if not, then fine. Mainly, I am not sure I would mind the old proposal if it were coming from scratch, so I don't have an opinion about allowing it (i.e. not caring that the result may have higher precision).
My 2¢ is that it is good to codify the current behaviour for the various reductions. P.S. Indeed, I think this is true even for integers. At least, to me, the following is neither logical nor expected:
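For example (a NumPy sketch of my own; the surprise is that the reduction implicitly upcasts while the elementwise operation does not):

```python
import numpy as np

x = np.array([100, 100], dtype=np.int8)

# Elementwise addition keeps the input dtype and silently wraps around:
print((x + x).dtype, x + x)          # int8 [-56 -56]

# The reduction, however, upcasts to the default integer dtype and does not wrap:
print(np.sum(x).dtype, np.sum(x))    # int64 200
```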
Explicit is better than implicit and all that. And for reductions, it might be quite reasonable to do the operation at higher precision and check for overflow before downcasting.
If you're suggesting a value-based result type, that's even worse. That's the sort of thing we're trying to get away from with the standard.
That PR discussion is huge and you didn't point to a specific comment, so I don't know what was already said. But it makes sense to me to treat floats differently from ints, because floats give `inf` on overflow.
There are several comments on it. The main one is this comment. Then it also got mixed in with the comment on in-place operator behavior in this comment. And in this comment @seberg said "(I explicitly did not give a thumbs-up for the type promotion changes in that meeting)" (type promotion here meaning the upcasting behavior discussed in this issue). I did write it down as one requirement among many (I didn't quite agree with what I wrote myself, but forgot to revisit); it didn't stand out in the text. It's telling that it was flagged quickly by both @seberg and @mhvk as too impactful to change.
Internal upcasting is regularly done, and perfectly fine. I assume the intent was "warn or raise on integer overflow", rather than value-based casting.
No, not a different type, that would be awful indeed! But an over/underflow error/warning, just like we can currently get for floating point operations. For regular ufuncs, that is too much of a performance hit, but for reductions, it should not be. And reductions are a bit special already since it definitely makes sense to do things at higher precision internally, before casting back to the original precision.
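A rough sketch of that idea for integer reductions (hypothetical `checked_sum` helper, not an existing API): accumulate in a wider dtype, warn if the exact result does not fit back into the input dtype, and still return the input dtype.

```python
import warnings
import numpy as np

def checked_sum(x):
    """Hypothetical helper: sum an integer array at higher internal precision,
    warn on overflow, and keep the input dtype for the result."""
    acc = x.sum(dtype=np.int64)          # internal upcast is cheap for a reduction
    info = np.iinfo(x.dtype)
    if acc < info.min or acc > info.max:
        warnings.warn("integer overflow in sum", RuntimeWarning)
    return acc.astype(x.dtype)           # wraps around, but the user was warned

print(checked_sum(np.array([100, 100], dtype=np.int8)))   # warns, prints -56
```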
OTOH
@rgommers What would you like to change the wording to?
The current wording that is problematic is: "If `x` has a real-valued floating-point data type, the returned array must have the default real-valued floating-point data type."
I suggest a change like this:
This loosens the spec, recommends what the current behavior of all known libraries is, and still allows upcasting if an implementation desires to do so for (reasons). |
Thanks, looks good to me. Maybe it would be slightly clearer to replace the … EDIT: Or just replace the "should" with a "must"; to me it seems to apply to the full construct, so "must" is correct and the "(recommended)" already includes the "should" part.
I personally don't see value in hedging with "recommended" or "should" if no one actually does that now and we don't even have a concrete reason for anyone to do so. It feels like our only real rationale is some misunderstanding in the original discussion about int dtypes. Not being precise about dtypes has disadvantages. For instance, it makes it harder to reason about dtypes statically (#728). Everywhere else, the standard uses "must" for output dtypes (correct me if I'm wrong on that). And I also disagree that upcasting is not a big deal. When you're explicitly using a lower-precision float, silent or unexpected upcasting can have a very real performance impact. Here's an example where fixing an unexpected float64 upcast made some code 5x faster: jaymody/picoGPT#12.
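As a small illustration of that performance point (a NumPy example of my own, not from the linked PR): an unintended float64 upcast doubles the bytes that every subsequent operation has to move.

```python
import numpy as np

x = np.ones((1024, 1024), dtype=np.float32)

# A stray float64 operand silently upcasts the result, doubling the memory
# traffic for everything downstream:
y = x * np.ones_like(x, dtype=np.float64)
print(y.dtype, y.nbytes // x.nbytes)    # float64 2

# Staying in the dtype the user chose avoids that cost:
z = x * np.float32(2.0)
print(z.dtype)                          # float32
```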
I am fine with being strict here and saying it must be the same: it is the only version that I see giving any clarity to libraries supporting multiple implementations (which is my main emphasis here, always, compared to thinking about the ideal end-user API). But there must have been some feeling that float16 and float32 lose precision quickly and that users need protecting, so that this ended up written down. And I am happy to accept the opinion that it may be a reasonable choice for end-users.
N.B.: To clarify, for summation, overflows are actually not the main issue! The issue is extreme loss of precision unless you have a high-precision intermediate (at least float64). If you sum many values in float16 or float32 with a same-precision accumulator, the result loses precision long before anything overflows.
Closes gh-731. This is a backwards-incompatible change. It seems justified and necessary because array libraries already all behave as described in this commit, are not planning to change, and the initial rationale for the "upcast float" requirement wasn't strong. See discussion in gh-731 for more details.
Okay, seems like there is support for "must", and I agree that that is nicer. PR for that: gh-744. |
This commit modifies type promotion behavior in `sum`, `prod`, `cumulative_sum`, and `linalg.trace` when the input array has a floating-point data type. Previously, the specification required that conforming implementations upcast to the default floating-point data type when the input array data type was of lower precision. This commit revises that guidance to require that conforming libraries return an array having the same data type as the input array. This revision stems from feedback from implementing libraries, whose current behavior matches the changes in this commit and who have little desire to change. As such, the specification is amended to match this reality.
Closes: #731
PR-URL: #744
Co-authored-by: Athan Reines <kgryte@gmail.com>
Reviewed-by: Athan Reines <kgryte@gmail.com>
This can be solved by using a higher intermediate precision, or by using a smarter summation algorithm. My point is that the only reason you'd need a higher result precision is if there is an overflow.
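A minimal NumPy sketch of that precision point (my own example):

```python
import numpy as np

# float32 has a 24-bit significand, so once a running total reaches 2**24,
# adding 1.0 no longer changes it:
x = np.float32(2**24)
print(x + np.float32(1.0) == x)   # True

# A naive same-precision accumulator summing 20 million float32 ones would
# therefore stall at 16777216.0. A float64 intermediate (or pairwise summation,
# which np.sum uses) gives the expected value, even when cast back to float32:
n = 20_000_000
print(np.ones(n, dtype=np.float32).sum(dtype=np.float64).astype(np.float32))  # 20000000.0
```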
The requirement to upcast `sum(x)` to the default floating-point dtype with the default `dtype=None` currently says (from the `sum` spec):

> If `x` has a real-valued floating-point data type, the returned array must have the default real-valued floating-point data type.

The rationale given is that the "keyword argument is intended to help prevent data type overflows." This came up again in the review of NEP 56 (numpy/numpy#25542), and is basically the only part of the standard that was flagged as problematic and explicitly rejected.
I agree that the standard's choice here is problematic, at least from a practical perspective: no array library does this, and none are planning to implement this. And the rationale is pretty weak; it just does not apply to floating-point dtypes to a similar extent as it does to integer dtypes (and for integers, array libraries do implement the upcasting). Examples:
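For instance, a hedged reconstruction of the kind of example meant here, using NumPy (the behaviour of the other major libraries is analogous, as far as I know):

```python
import numpy as np

# Status quo: for floating-point inputs the result keeps the input dtype.
for dt in (np.float16, np.float32):
    x = np.ones(10, dtype=dt)
    print(np.sum(x).dtype, np.prod(x).dtype)   # float16 float16 / float32 float32

# Integer inputs, by contrast, are upcast to the default integer dtype:
print(np.sum(np.ones(10, dtype=np.int8)).dtype)   # int64 (on most platforms)
```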
The most relevant conversation is #238 (comment). There were some further minor tweaks (without much discussion) in gh-666.
Proposed resolution: align the standard with what all known array libraries implement today.