bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized global objects. #30928
This change is a prerequisite for generating code for other global objects (like strings in gh-30928). (We borrowed some code from Tools/scripts/deepfreeze.py.) https://bugs.python.org/issue46541
PyObject *
_PyDict_GetItemWithError(PyObject *dp, PyObject *kv)
{
    assert(PyUnicode_CheckExact(kv));
Just a half-baked idea: would it be possible to initialize all hashes of global strings at interpreter startup, then eliminate some branching with a special function like this?
PyObject *
_PyDict_GetItemStrWithHashInitialized(PyDictObject *mp, PyUnicodeObject *key)
{
    assert(PyDict_CheckExact(mp));
    assert(PyUnicode_CheckExact(key));
    // The hash is assumed to have been computed at startup, so no
    // "hash == -1" fallback branch is needed here.
    Py_hash_t hash = ((PyASCIIObject *)key)->hash;
    assert(hash != -1);
    // Inline _PyDict_GetItem_KnownHash --> _Py_dict_lookup
    PyDictKeysObject *dk = mp->ma_keys;
    Py_ssize_t ix = dictkeys_stringlookup(dk, key, hash);
    if (ix == DKIX_EMPTY) {
        return NULL;
    }
    else if (dk->dk_kind == DICT_KEYS_SPLIT) {
        // Split table: values live in the dict, not in the keys object.
        return mp->ma_values->values[ix];
    }
    else {
        return DK_ENTRIES(dk)[ix].me_value;
    }
}
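To make the idea above concrete outside of CPython, here is a minimal standalone sketch. It is not CPython code: toy_key, toy_hash, toy_key_hash, toy_init_hashes and toy_key_hash_initialized are all hypothetical names, and FNV-1a merely stands in for the real string hash. It only illustrates how hashing every global key once at startup lets the hot path drop the "is the hash cached yet?" branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a string object that caches its hash. */
typedef struct {
    long hash;          /* -1 means "not computed yet" */
    const char *data;
} toy_key;

/* FNV-1a, masked to 31 bits so the result can never be -1.
   This is an assumption for the sketch, not CPython's hash. */
static long toy_hash(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return (long)(h & 0x7fffffffu);
}

/* General path: has to branch on whether the hash is cached. */
static long toy_key_hash(toy_key *k)
{
    if (k->hash == -1) {
        k->hash = toy_hash(k->data);
    }
    return k->hash;
}

/* Startup step: hash every global key exactly once. */
static void toy_init_hashes(toy_key *keys, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        keys[i].hash = toy_hash(keys[i].data);
    }
}

/* Fast path: valid only for keys that went through toy_init_hashes(). */
static long toy_key_hash_initialized(const toy_key *k)
{
    assert(k->hash != -1);
    return k->hash;
}
```

The trade-off is the usual one: a small fixed cost at startup in exchange for one less branch on every lookup that uses a global string.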
Maybe add this to https://github.com/faster-cpython/ideas/discussions?
@markshannon approved this offline.
Update for consistency with pythongh-30928 (bpo-46541).
In this PR we're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
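The layout described above can be sketched in a few lines of standalone C. This is only an illustration under stated assumptions, not the actual CPython definitions: toy_string stands in for PyUnicodeObject, toy_runtime for _PyRuntime, and the TOY_ macros are hypothetical analogues of _Py_GET_GLOBAL_IDENTIFIER() and _Py_GET_GLOBAL_STRING(). The field names (append, keys, newline) are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a string object embedded by value. */
typedef struct {
    long hash;          /* -1 means "not computed yet" */
    size_t length;
    const char *data;
} toy_string;

/* Static initializer; LITERAL must be a string literal. */
#define TOY_STRING_INIT(LITERAL) { -1, sizeof(LITERAL) - 1, LITERAL }

/* The strings are struct fields (not pointers) of the runtime state. */
struct toy_runtime_state {
    struct {
        struct {
            toy_string append;
            toy_string keys;
        } identifiers;
        struct {
            toy_string newline;
        } strings;
    } global_objects;
};

/* Statically initialized: no allocation or lazy setup at run time. */
static struct toy_runtime_state toy_runtime = {
    .global_objects = {
        .identifiers = {
            .append = TOY_STRING_INIT("append"),
            .keys = TOY_STRING_INIT("keys"),
        },
        .strings = {
            .newline = TOY_STRING_INIT("\n"),
        },
    },
};

/* Lookup is a compile-time field access that yields a stable address. */
#define TOY_GET_GLOBAL_IDENTIFIER(NAME) \
    (&toy_runtime.global_objects.identifiers.NAME)
#define TOY_GET_GLOBAL_STRING(NAME) \
    (&toy_runtime.global_objects.strings.NAME)
```

Because each lookup expands to a field address, a misspelled name fails at compile time, and no per-call-site static variable or lazy initialization is involved.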
The core of the change is in:
_PyRuntimeState
I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
The remainder of the PR updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *.
I'm planning on addressing the following separately:
_Py_IDENTIFIER() in the stdlib modules
removing _Py_IDENTIFIER(), etc. entirely -- this may not be doable, as at least one package on PyPI is using this (private) API
https://bugs.python.org/issue46541