Restore (or beat) Python 2 performance for arithmetic operations on ints that fit into a single word #101291

Open
markshannon opened this issue Jan 24, 2023 · 7 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@markshannon
Member

markshannon commented Jan 24, 2023

In Python 2 ints and longs were different objects, and the design of each was tailored to the different size and use cases.
In Python 3 we dropped the distinction, but we also dropped the design for ints that fit into a single word.
We have added various fast paths for "medium" integers (e.g. #89109) but the underlying data structure gets in the way.

We should lay out the int/long object so that it supports fast operations for most integers.

See faster-cpython/ideas#548 for a fuller discussion

Linked PRs

markshannon added the performance (Performance or resource usage) label Jan 24, 2023
markshannon added a commit that referenced this issue Jan 30, 2023
mdboom pushed a commit to mdboom/cpython that referenced this issue Jan 31, 2023
markshannon added a commit that referenced this issue Mar 22, 2023
* Eliminate all remaining uses of Py_SIZE and Py_SET_SIZE on PyLongObject, adding asserts.

* Change layout of size/sign bits in longobject to support future addition of immortal ints and tagged medium ints.

* Add functions to hide some internals of long object, and for setting sign and digit count.

* Replace uses of IS_MEDIUM_VALUE macro with _PyLong_IsCompact().
Fidget-Spinner pushed a commit to Fidget-Spinner/cpython that referenced this issue Mar 27, 2023
…2464)

warsaw pushed a commit to warsaw/cpython that referenced this issue Apr 11, 2023
…2464)

markshannon added a commit that referenced this issue May 21, 2023
Co-authored-by: Petr Viktorin <encukou@gmail.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 22, 2023
…pythonGH-104742)

(cherry picked from commit e295d86)

Co-authored-by: Mark Shannon <mark@hotpy.org>
JelleZijlstra pushed a commit that referenced this issue May 23, 2023
GH-104742) (#104759)

(cherry picked from commit e295d86)

Co-authored-by: Mark Shannon <mark@hotpy.org>
@Yhg1s
Member

Yhg1s commented Jul 6, 2023

Has there been any progress in documenting the changes made in Python 3.12? (#101292 (comment))

@gvanrossum
Member

Maybe @markshannon can answer that? AFAICT all the commits linked above are his. I know we had someone who was interested in pursuing this further but she had to bow out.

@Yhg1s
Member

Yhg1s commented Jul 20, 2023

@markshannon Where are we with documentation for this? If it's not documented, do we need to start working to roll this back? I'm not comfortable with this change in rc1 if it's not documented.

@gvanrossum
Member

Mark is at EuroPython. If you are there too you can talk to him. We will get it documented.

@gvanrossum
Member

I did a little research. It looks like there are two key changes. First, struct _longobject (defined in Include/cpython/longintrepr.h but considered an internal implementation detail) has changed. It used to be:

struct _longobject {
    PyObject_VAR_HEAD
    digit ob_digit[1];
};

I.e., there was an array ob_digit whose length was abs(ob_size), where sign(ob_size) gave the sign of the overall value.
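
To make the old convention concrete, here is a minimal sketch of how a value could be read back from a signed size plus a digit array. It is a toy model, not CPython code: the `toy_` names are hypothetical, the digit array is fixed at two entries, and 30-bit digits are assumed (the usual choice on 64-bit builds).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the pre-3.12 layout; NOT the real CPython struct.
   Assumes 30-bit digits stored least-significant digit first. */
#define TOY_SHIFT 30
typedef uint32_t toy_digit;

typedef struct {
    ptrdiff_t ob_size;      /* sign = sign of the value, magnitude = digit count */
    toy_digit ob_digit[2];  /* at most two digits in this sketch */
} toy_old_long;

/* Decode a value of at most two digits (60 bits) into an int64_t. */
static int64_t toy_old_value(const toy_old_long *v)
{
    ptrdiff_t ndigits = v->ob_size < 0 ? -v->ob_size : v->ob_size;
    int64_t magnitude = 0;

    /* Walk from the most significant digit down to digit 0. */
    for (ptrdiff_t i = ndigits; i-- > 0; ) {
        magnitude = (magnitude << TOY_SHIFT) | v->ob_digit[i];
    }
    return v->ob_size < 0 ? -magnitude : magnitude;
}
```

With `ob_size == -2` and digits `{5, 1}`, the function yields `-(2**30 + 5)`; with `ob_size == 0` it yields zero without touching the digits.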

The new (still internal) representation is as follows:

typedef struct _PyLongValue {
    uintptr_t lv_tag; /* Number of digits, sign and flags */
    digit ob_digit[1];
} _PyLongValue;

struct _longobject {
    PyObject_HEAD
    _PyLongValue long_value;
};

and there are new internal macros to determine the number of digits and the sign, and a bunch of internal macros to handle "compact" values (which fit in 1-2 "digits").
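
As a rough illustration of why a single tag word helps, the sketch below packs a sign and a digit count into one `uintptr_t` and reads a one-digit value without branching on the sign. The bit assignment is invented for this example (the real `lv_tag` layout is private and may differ), all `toy_` names are hypothetical, and "compact" here simply means zero or one digit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag encoding, chosen only for illustration:
   bits 0-1 hold the sign (0 = positive, 1 = zero, 2 = negative),
   the remaining bits hold the digit count.
   CPython's real lv_tag layout is private and may differ. */
#define TOY_SIGN_BITS 2
#define TOY_SIGN_MASK ((uintptr_t)3)

typedef uint32_t toy_digit;

typedef struct {
    uintptr_t lv_tag;
    toy_digit ob_digit[1];
} toy_long_value;

/* sign is -1, 0 or +1; it is stored as 2, 1 or 0 respectively. */
static uintptr_t toy_make_tag(int sign, uintptr_t ndigits)
{
    return (ndigits << TOY_SIGN_BITS) | (uintptr_t)(1 - sign);
}

static uintptr_t toy_digit_count(const toy_long_value *v)
{
    return v->lv_tag >> TOY_SIGN_BITS;
}

static int toy_sign(const toy_long_value *v)
{
    return 1 - (int)(v->lv_tag & TOY_SIGN_MASK);
}

/* "Compact" in this toy model: zero or one digit. */
static int toy_is_compact(const toy_long_value *v)
{
    return toy_digit_count(v) < 2;
}

/* Only meaningful when toy_is_compact(v) is true. */
static intptr_t toy_compact_value(const toy_long_value *v)
{
    return (intptr_t)toy_sign(v) * (intptr_t)v->ob_digit[0];
}
```

Storing the sign as `1 - sign` means the compact value is just a multiplication of the decoded sign by the single digit, so zero, positive and negative values all take the same code path.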

There are two new public, unstable APIs to support the concept of "compact" values: PyUnstable_Long_IsCompact and PyUnstable_Long_CompactValue. See https://docs.python.org/3.12/c-api/long.html#c.PyUnstable_Long_IsCompact. (In reality these are implemented as macros, and not intended to be part of any ABI.) Everything else that digs through the internals is defined in Include/internal/pycore_long.h, and requires defining Py_BUILD_CORE.

Details of what the bits in lv_tag mean are intentionally not published -- these are meant to be opaque. Applications that used to dig through ob_digit using ob_size as guidance will break, and have a few options. One is to switch to calling the Python-level APIs int.to_bytes() and int.from_bytes() via PyObject_CallMethod() (see note at https://docs.python.org/3.12/c-api/long.html#c.PyLong_FromString). Another is to go hard-core, defining Py_BUILD_CORE and including pycore_long.h. Or, I guess, an intermediate path is to use the new unstable public APIs for dealing with "compact" values and use the slower arbitrary-precision API for non-compact values.
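
For the to_bytes() route, the extension ends up holding a plain byte buffer that it has to interpret itself. The following is a minimal sketch of that last step only, assuming the bytes were produced the way `int.to_bytes(length, 'little', signed=True)` produces them (little-endian two's complement). The helper name `decode_le_signed` is hypothetical, and the call into Python via PyObject_CallMethod() is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode 1..8 little-endian two's-complement bytes into an int64_t.
   Returns 0 on success, -1 if len is 0 or larger than 8. */
static int decode_le_signed(const unsigned char *buf, size_t len, int64_t *out)
{
    if (len == 0 || len > 8) {
        return -1;
    }

    /* Sign-extend: start from all ones when the top bit of the most
       significant byte is set, otherwise from zero. */
    uint64_t acc = (buf[len - 1] & 0x80) ? UINT64_MAX : 0;

    /* Shift bytes in from most significant to least significant. */
    for (size_t i = len; i-- > 0; ) {
        acc = (acc << 8) | buf[i];
    }

    /* Relies on the two's-complement conversion every mainstream
       compiler provides for out-of-range uint64_t -> int64_t. */
    *out = (int64_t)acc;
    return 0;
}
```

Values that need more than 8 bytes are rejected here; a real extension would keep them as a byte buffer or hand them to its own arbitrary-precision type.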

I think in the What's New in 3.12 we should at least mention the change in the struct (calling out that using ob_size and ob_digit is no longer supported) and the new unstable public APIs (and what they're for). I don't think we need to call out the hard-core option, but maybe a reminder about to_bytes() and from_bytes() would be useful (even though that's been in the docs at least since 3.10).

@Yhg1s @markshannon What do you think of this? I volunteer to make a PR for what's new 3.12 along the lines of what I wrote above.

miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jul 28, 2023
…rnals have changed. (pythonGH-107388)

(cherry picked from commit 1ee605c)

Co-authored-by: Mark Shannon <mark@hotpy.org>
gvanrossum pushed a commit that referenced this issue Jul 31, 2023
…ernals have changed. (GH-107388) (#107392)

(cherry picked from commit 1ee605c)

Co-authored-by: Mark Shannon <mark@hotpy.org>
@gvanrossum
Member

@Yhg1s Assuming the changes Mark made to what's new in 3.12 are what you wanted?

@Yhg1s
Member

Yhg1s commented Jul 31, 2023

Yep, that's adequate.

iritkatriel added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Nov 27, 2023
4 participants