-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
gh-90667: Add specializations of Py_DECREF when types are known #30872
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I tried to only modify hot-ish (not error path) code where DECREF is likely to take the deallocation branch |
Slower (7):
Faster (27):
Benchmark hidden because not significant (24): chaos, django_template, hexiom, logging_format, logging_simple, pickle, pickle_dict, pidigits, pyflate, python_startup_no_site, raytrace, regex_v8, richards, scimark_lu, sqlalchemy_declarative, sqlalchemy_imperative, sympy_expand, sympy_sum, sympy_str, tornado_http, unpickle, unpickle_list, unpickle_pure_python, xml_etree_generate Geometric mean: 1.01x faster |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting idea. I read through most of the math and ceval module changes to verify the types are right.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Thanks. Another nice little speedup. The specialized for of For the other classes, rather than the addition level of macros, I think it would be more readable to use |
Include/internal/pycore_object.h
Outdated
#ifdef Py_DEBUG | ||
if (op->ob_refcnt <= 0) { | ||
// Calls _Py_FatalRefcountError for None, True, and False | ||
_Py_Dealloc(op); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you change this to call _Py_FatalRefcountError
directly?
We might want to use _Py_DECREF_IMMORTAL
for other "immortal" objects like 0
or ()
and there is no guarantee that their de-allocation functions will call _Py_FatalRefcountError
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I moved _Py_FatalRefcountError
from pycore_pyerrors
to pycore_object
as well, rather than having an #include
-dependence between the pycore_files.
🤖 New build scheduled with the buildbot fleet by @sweeneyde for commit ad146ef 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
@markshannon would you mind reviewing this again? Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just benchmarked this and I too see a 1% speedup.
I'm a bit wary of making of making this change, because it feels like we're adding more code for what should be data.
IMO, we should be declaring the shape of objects, not adding more special case code.
But that will have to wait until 3.12.
For 3.11, pragmatism beats purity and we'll take the speedup.
const char *func, | ||
const char *message); | ||
|
||
#define _Py_FatalRefcountError(message) _Py_FatalRefcountErrorFunc(__func__, message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OOI, why move this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to avoid a dependence between the pycore_...
includes. The unicodeobject.c and tupleobject.c implementations both only used pycore_pyerrors
for _Py_FatalRefcountError()
, which doesn't really concern itself with the implementation details of exceptions.
@@ -36,7 +36,15 @@ medium_value(PyLongObject *x) | |||
#define IS_SMALL_INT(ival) (-_PY_NSMALLNEGINTS <= (ival) && (ival) < _PY_NSMALLPOSINTS) | |||
#define IS_SMALL_UINT(ival) ((ival) < _PY_NSMALLPOSINTS) | |||
|
|||
static inline int is_medium_int(stwodigits x) | |||
static inline void | |||
_Py_DECREF_INT(PyLongObject *op) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this ultimately equivalent to
#define _Py_DECREF_INT(op) _Py_DECREF_SPECIALIZED(op, PyObject_Free)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed _PyLongExactDealloc
, but I'd like to keep the assertion around to make sure we're never accidentally calling this on int subclasses.
#90667