bpo-37986: Improve performance of PyLong_FromDouble() #15611
@@ -0,0 +1,4 @@
Improve performance of :c:func:`PyLong_FromDouble` for values that fit into
:c:type:`long`. Now :meth:`float.__trunc__` is up to 10% faster, and
:func:`math.floor()` and :func:`math.ceil()` are up to 30% faster when used
with such values.
@@ -433,6 +433,21 @@ PyLong_FromSize_t(size_t ival)
PyObject *
PyLong_FromDouble(double dval)
{
    /* Try to get out cheap if this fits in a long. When a finite value of real
     * floating type is converted to an integer type, the value is truncated
     * toward zero. If the value of the integral part cannot be represented by
     * the integer type, the behavior is undefined. Thus, we must check that
     * value is in range (LONG_MIN - 1, LONG_MAX + 1). If a long has more bits
     * of precision than a double, casting LONG_MIN - 1 to double may yield an
     * approximation, but LONG_MAX + 1 is a power of two and can be represented
     * as double exactly (assuming FLT_RADIX is 2 or 16), so for simplicity
     * check against [-(LONG_MAX + 1), LONG_MAX + 1).
     */
    const double int_max = (unsigned long)LONG_MAX + 1;
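For illustration only (not part of the patch), here is a minimal standalone C sketch of the check described in the comment above. It verifies that (unsigned long)LONG_MAX + 1 survives the conversion to double exactly and shows which sample values the guarded cast accepts; the sample values and the two's-complement/LP64 assumption are mine:

    #include <limits.h>
    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* Same construction as in the patch: LONG_MAX + 1 computed in unsigned
         * arithmetic, then converted to double. Being a power of two, it is
         * exactly representable (assuming FLT_RADIX is 2 or 16). */
        const double int_max = (unsigned long)LONG_MAX + 1;

        /* Round-trip check: the value LONG_MAX + 1 fits in unsigned long, so
         * converting the double back is well defined and must reproduce the
         * original value if the double is exact. */
        printf("int_max is exact: %d\n",
               (unsigned long)int_max == (unsigned long)LONG_MAX + 1);

        double samples[] = { 0.0, -1.5, 1e10, nextafter(int_max, 0.0), -int_max, int_max };
        for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
            double d = samples[i];
            if (-int_max <= d && d < int_max) {
                /* Truncation toward zero yields a value representable as long,
                 * so the cast below is defined. */
                printf("%.17g -> (long) %ld\n", d, (long)d);
            }
            else {
                printf("%.17g -> would need the frexp() slow path\n", d);
            }
        }
        return 0;
    }

On a typical LP64 build this reports int_max as exact and sends int_max itself (2**63) to the slow path, while -int_max (== LONG_MIN) is still accepted.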
Will it be truncated in the right direction (towards zero) to avoid this triggering on values with undefined conversion behavior? The previous code used …

The original comment explains why you should use …

I think I had to add a comment about this: I assumed that LONG_MAX == 2 ** (CHAR_BIT * sizeof(long) - 1) - 1 and LONG_MIN == -2 ** (CHAR_BIT * sizeof(long) - 1), i.e. … (Originally I wrote it like this: …)

Here I'm trying to demonstrate the correctness of this approach:

In [66]: SIZEOF_LONG = 8; CHAR_BITS = 8
In [67]: LONG_MAX = (1 << (SIZEOF_LONG * CHAR_BITS - 1)) - 1; LONG_MIN = -LONG_MAX - 1
In [68]: int_max = float(LONG_MAX + 1)
In [69]: int_max == LONG_MAX + 1
Out[69]: True
In [70]: def cast_to_long(dval):
    ...:     assert isinstance(dval, float)
    ...:     wholepart = math.trunc(dval)
    ...:     if LONG_MIN <= wholepart <= LONG_MAX:
    ...:         return wholepart
    ...:     raise RuntimeError('undefined behavior')
In [71]: def long_from_double(dval):
    ...:     assert isinstance(dval, float)
    ...:     if -int_max <= dval < int_max:
    ...:         return cast_to_long(dval)
    ...:     raise ValueError('float is out of range, use frexp()')
In [72]: long_from_double(int_max)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-72-280887471997> in <module>()
----> 1 long_from_double(int_max)
<ipython-input-71-ccaef6014bf1> in long_from_double(dval)
3 if -int_max <= dval < int_max:
4 return cast_to_long(dval)
----> 5 raise ValueError('float is out of range, use frexp()')
ValueError: float is out of range, use frexp()
In [73]: int_max.hex()
Out[73]: '0x1.0000000000000p+63'
In [74]: long_from_double(float.fromhex('0x1.fffffffffffffp+62'))
Out[74]: 9223372036854774784
In [75]: long_from_double(float.fromhex('-0x1.fffffffffffffp+62'))
Out[75]: -9223372036854774784
In [76]: long_from_double(-int_max)
Out[76]: -9223372036854775808
In [77]: long_from_double(float.fromhex('-0x1.0000000000001p+63'))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-77-de5e9e1eba23> in <module>()
----> 1 long_from_double(float.fromhex('-0x1.0000000000001p+63'))
<ipython-input-71-ccaef6014bf1> in long_from_double(dval)
3 if -int_max <= dval < int_max:
4 return cast_to_long(dval)
----> 5 raise ValueError('float is out of range, use frexp()')
ValueError: float is out of range, use frexp()

I think this is fine, under reasonable assumptions on the platform. It's not safe based purely on the C standard to assume that … Believe it or not, it's also not safe based purely on the C standard to assume that …

Re-reading all this, I had one more worry (which is why I dismissed my own review): what happens if the exact value of … So any …

Then I think we could use …

Shouldn't we formally state that we support only two's complement representation?

Definitely not worth it! The C standard permits …

Yes, we should, though I'm not sure where the best place would be. But I think it's a non-issue in practice.
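To make the assumptions from this thread explicit, a hypothetical compile-time check (a sketch of my own, not something the patch adds or that CPython requires in this form) could document the two's-complement and LONG_MAX == 2**n - 1 expectations:

    #include <limits.h>

    /* Sketch only: record the assumptions discussed above at compile time.
     * Two's complement implies LONG_MIN == -LONG_MAX - 1, and LONG_MAX being one
     * less than a power of two implies that LONG_MAX & (LONG_MAX + 1) == 0 when
     * the addition is carried out in unsigned arithmetic. */
    #if __STDC_VERSION__ >= 201112L
    _Static_assert(LONG_MIN + LONG_MAX == -1,
                   "two's-complement long is assumed");
    _Static_assert(((unsigned long)LONG_MAX & ((unsigned long)LONG_MAX + 1UL)) == 0,
                   "LONG_MAX is assumed to be 2**n - 1");
    #endif

    int main(void) { return 0; }

_Static_assert requires C11, hence the guard; on older compilers the same conditions could be checked with the usual negative-size-array trick.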
    if (-int_max <= dval && dval < int_max) {
        return PyLong_FromLong((long)dval);
    }

    PyLongObject *v;
    double frac;
    int i, ndig, expo, neg;
@@ -452,8 +467,7 @@ PyLong_FromDouble(double dval)
        dval = -dval;
    }
    frac = frexp(dval, &expo); /* dval = frac*2**expo; 0.0 <= frac < 1.0 */
    if (expo <= 0)
This is already on the slow path; it seems safest to keep this check in place even though it should have been handled by the above int range checks. Smart compilers would see that (no idea how many are smart enough to unroll frexp and understand).

I do not think it makes sense to keep this code.

Either seems fine to me. Personally, I'd probably keep the check out of defensiveness (someone could, for whatever reason, move the fast path out at some point in the future; it's nice if the slow path remains valid in that case), but I'm happy for this to be merged as is. Do we at least have unit tests that cover this case?

At least gcc is not smart enough.

A compromise is to turn it into …

It is.
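As a small illustration of the point under discussion (sample values are arbitrary): frexp() returns an exponent <= 0 only for values with |x| < 1, and all of those are already taken by the fast path above, so the slow path cannot reach the removed branch:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* frexp(x, &e) gives x == m * 2**e with 0.5 <= |m| < 1 for nonzero x,
         * so e <= 0 can happen only when |x| < 1 (or x == 0). */
        double xs[] = { 0.25, 0.999999, 1.0, 2.5, 1e308 };
        for (size_t i = 0; i < sizeof xs / sizeof xs[0]; i++) {
            int expo;
            double frac = frexp(xs[i], &expo);
            printf("frexp(%g) -> frac=%g, expo=%d\n", xs[i], frac, expo);
        }
        return 0;
    }

Every finite value that still reaches frexp() in the patched function has magnitude of at least LONG_MAX + 1, so its exponent is well above zero, which is what the assert(expo > 0) below records.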
        return PyLong_FromLong(0L);
    assert(expo > 0);
    ndig = (expo-1) / PyLong_SHIFT + 1; /* Number of 'digits' in result */
    v = _PyLong_New(ndig);
    if (v == NULL)
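For context on the ndig computation above, here is a simplified, self-contained sketch of a frexp()/ldexp() digit-extraction loop of the kind the function uses after this point; PyLong_SHIFT is taken to be 30, the demo value is arbitrary, and a plain array stands in for the _PyLong_New() allocation:

    #include <math.h>
    #include <stdint.h>
    #include <stdio.h>

    #define SHIFT 30                            /* stand-in for PyLong_SHIFT */

    int main(void)
    {
        double dval = 1e30;                     /* demo value: finite, positive, >= 1 */
        int expo;
        double frac = frexp(dval, &expo);       /* dval == frac * 2**expo */
        int ndig = (expo - 1) / SHIFT + 1;      /* number of SHIFT-bit digits */
        uint32_t digits[64] = {0};              /* large enough for any IEEE double */

        /* Scale so the most significant digit lands in the integer part. */
        frac = ldexp(frac, (expo - 1) % SHIFT + 1);
        for (int i = ndig; --i >= 0; ) {
            uint32_t bits = (uint32_t)frac;     /* peel off one digit */
            digits[i] = bits;
            frac -= (double)bits;               /* keep only the remaining fraction */
            frac = ldexp(frac, SHIFT);          /* expose the next digit */
        }

        printf("ndig = %d, top digit = %u, low digit = %u\n",
               ndig, digits[ndig - 1], digits[0]);
        return 0;
    }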
Are these numbers still correct? There are other changes related to trunc/floor/ceil in 3.9; they can reduce or increase the relative effect of this optimization.
I think we could just drop the second sentence and keep the first here.