bpo-46055: Streamline inner loop for right shifts #30243

Merged on Dec 27, 2021 (6 commits)
15 changes: 8 additions & 7 deletions Objects/longobject.c
@@ -4491,7 +4491,7 @@ long_rshift1(PyLongObject *a, Py_ssize_t wordshift, digit remshift)
 {
     PyLongObject *z = NULL;
     Py_ssize_t newsize, hishift, i, j;
-    digit lomask, himask;
+    twodigits accum;

     if (Py_SIZE(a) < 0) {
         /* Right shifting negative numbers is harder */
@@ -4511,16 +4511,17 @@ long_rshift1(PyLongObject *a, Py_ssize_t wordshift, digit remshift)
         if (newsize <= 0)
             return PyLong_FromLong(0);
         hishift = PyLong_SHIFT - remshift;
-        lomask = ((digit)1 << hishift) - 1;
-        himask = PyLong_MASK ^ lomask;
         z = _PyLong_New(newsize);
         if (z == NULL)
             return NULL;
-        for (i = 0, j = wordshift; i < newsize; i++, j++) {
-            z->ob_digit[i] = (a->ob_digit[j] >> remshift) & lomask;
-            if (i+1 < newsize)
-                z->ob_digit[i] |= (a->ob_digit[j+1] << hishift) & himask;
Review comment by @mdickinson (Member, Author), Dec 23, 2021:

The new code also fixes a really subtle portability issue in this code. Here the result of a->ob_digit[j+1] * 2**hishift may not be representable in the target type. Normally that wouldn't matter, because a->ob_digit[j+1] has type digit, which is an unsigned type, so the C standards tell us that any out-of-range value wraps in the normal way. But integer promotions could result in the left-hand operand to the shift actually being of type int (a signed type), and then an out-of-range shift result gives undefined behaviour according to the standard (C99 §6.5.7p4). We don't run into this in practice because under any likely combination of integer type bit widths (e.g., 16-bit digit, 32-bit int), if digit is small enough to be promoted to int, then int is likely big enough to hold the shift result. But the C standard does allow potentially problematic bit widths (e.g., digit could be 16 bits and int 24 bits).

Not a real issue, since it's unlikely we'd ever meet this in practice, but it's nice not to have to worry about it. With the new code, the result of the shift is guaranteed to be representable in the target type (that type being either twodigits, or something larger in the case that there are integer promotions going on).
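
The promotion hazard described in this comment can be illustrated with a minimal standalone sketch. The typedefs and the SHIFT/MASK macros below are hypothetical stand-ins for a 15-bit digit configuration, not CPython's actual definitions, and carry_in is a helper name invented for this example. It shows only the widening step: casting to the two-digit type before shifting keeps the result representable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for a 15-bit digit build (assumption, not
   CPython's real typedefs). */
typedef uint16_t digit;      /* stores one 15-bit digit */
typedef uint32_t twodigits;  /* wide enough to hold two digits */
#define SHIFT 15
#define MASK  ((digit)((1u << SHIFT) - 1))

/* Bits contributed by the next-higher digit, widened before shifting.
   The cast happens first, so the shift is evaluated in twodigits (or in
   a wider type if twodigits is itself promoted). The result is below
   2**30, so it is always representable in that type. */
static twodigits
carry_in(digit next, unsigned hishift)
{
    assert(hishift >= 1 && hishift <= SHIFT);
    assert(next <= MASK);
    return (twodigits)next << hishift;
}
```

Without the cast, `next << hishift` would be evaluated in whatever type `digit` promotes to, which is the situation the comment warns about when that type is a narrow signed `int`.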

+        j = wordshift;
+        accum = a->ob_digit[j++] >> remshift;
+        for (i = 0; j < Py_SIZE(a); i++, j++) {
+            accum |= (twodigits)a->ob_digit[j] << hishift;
+            z->ob_digit[i] = (digit)(accum & PyLong_MASK);
Review comment (Member):

Wouldn't it be better to operate on a single digit? (digit)accum & PyLong_MASK

Reply (Member, Author):

Maybe. I find this code clearer as a statement of intent: we're only changing the value once, not twice (all other things being equal, I like my casts not to change values).

I'll run some timings and look at the generated code. If the cast-first variant is faster, I'll change this.
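
For reference, the two variants under discussion can be compared side by side in a small sketch. The typedefs and macros are again hypothetical stand-ins for a 15-bit digit configuration, and the function names are made up for this example. The sketch checks only that both forms yield the same digit; it says nothing about which one compiles to faster code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for a 15-bit digit build (assumption). */
typedef uint16_t digit;
typedef uint32_t twodigits;
#define SHIFT 15
#define MASK  ((digit)((1u << SHIFT) - 1))

/* Mask in the wide type, then cast: the cast never changes the value. */
static digit
low_digit_mask_first(twodigits accum)
{
    return (digit)(accum & MASK);
}

/* Cast first (which may discard high bits), then mask the narrow value. */
static digit
low_digit_cast_first(twodigits accum)
{
    return (digit)accum & MASK;
}

/* Spot-check a spread of accumulator values below 2**30, the range the
   accumulator can take in a shift loop with 15-bit digits. */
static void
check_variants_agree(void)
{
    for (twodigits a = 0; a < (1u << 30); a += 4099u) {
        assert(low_digit_mask_first(a) == low_digit_cast_first(a));
    }
}
```

Both forms agree because unsigned narrowing keeps the low bits, so the choice between them comes down to readability and generated code, as the reply says.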

+            accum >>= PyLong_SHIFT;
         }
+        z->ob_digit[i] = (digit)accum;
         z = maybe_small_long(long_normalize(z));
     }
     return (PyObject *)z;
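
The accumulator loop added in this hunk can be tried outside CPython with a minimal sketch over a plain digit array. Everything here is an assumption made for illustration: the typedefs and SHIFT/MASK macros stand in for a 15-bit digit configuration, rshift_digits is a hypothetical name, and the caller supplies the output buffer instead of `_PyLong_New`. Only nonnegative inputs are handled, and no normalization is done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a 15-bit digit build (assumption). */
typedef uint16_t digit;
typedef uint32_t twodigits;
#define SHIFT 15
#define MASK  ((digit)((1u << SHIFT) - 1))

/* Shift the nonnegative number a[0..size-1] (least significant digit
   first) right by wordshift*SHIFT + remshift bits. Writes
   size - wordshift digits to z and returns that count, or 0 if every
   digit is shifted out. z must have room for size - wordshift digits. */
static size_t
rshift_digits(const digit *a, size_t size, size_t wordshift,
              unsigned remshift, digit *z)
{
    assert(remshift < SHIFT);
    if (size <= wordshift)
        return 0;

    unsigned hishift = SHIFT - remshift;
    size_t i, j = wordshift;

    /* Start with the surviving high bits of the first source digit. */
    twodigits accum = a[j++] >> remshift;
    for (i = 0; j < size; i++, j++) {
        /* Bring in the next digit above the bits already held. */
        accum |= (twodigits)a[j] << hishift;
        z[i] = (digit)(accum & MASK);
        accum >>= SHIFT;
    }
    /* Whatever remains is the top digit of the result. */
    z[i] = (digit)accum;
    return size - wordshift;
}
```

For example, the digits {0x5678, 0x2468} represent 0x12345678 in base 2**15; shifting right by 4 bits gives {0x4567, 0x246}, which is 0x1234567.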