Make `dict` objects thread-safe in `--disable-gil` builds #112075
Comments
I'd like to work on this one.
Thanks @chgnrdv! I think "making operations use the critical section API" can be started now, but most of the other steps still have some pre-requisites that are not yet implemented. The dict changes are going to be pretty big. It may make sense to convert a few operations at a time to use the critical section API.
For `dict.__len__`, use `_Py_atomic_load_ssize_relaxed` to access the `ma_used` field of `PyDictObject`.

For the following methods, use the critical section API, either in the form of an Argument Clinic (AC) directive or a macro:

* `dict.fromkeys`
* `dict.copy`
* `dict_richcompare`
* `dict.clear`
* `dict.__sizeof__`
* `dict.__or__`
* `dict.__ior__`
* `dict.__reversed__`
* `dict.keys`
* `dict.items`
* `dict.values`
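The split proposed in this comment (a single relaxed atomic load for the length, a lock around multi-step operations) can be sketched in portable C11. This is only an illustration: `toy_dict` and its helpers are hypothetical names, and the spinlock stands in for CPython's critical section API rather than reproducing it.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a dict; this is not CPython's PyDictObject. */
typedef struct {
    atomic_flag lock;          /* toy substitute for the per-object critical section */
    _Atomic ptrdiff_t used;    /* plays the role of ma_used */
} toy_dict;

static void toy_dict_init(toy_dict *d) {
    atomic_flag_clear(&d->lock);
    atomic_init(&d->used, 0);
}

static void toy_dict_lock(toy_dict *d) {
    while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire)) {
        /* spin until the current holder clears the flag */
    }
}

static void toy_dict_unlock(toy_dict *d) {
    atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* Like dict.__len__: one relaxed load, no lock taken. */
static ptrdiff_t toy_dict_len(toy_dict *d) {
    return atomic_load_explicit(&d->used, memory_order_relaxed);
}

/* Bookkeeping for an insertion: a read-modify-write, so it runs under the lock. */
static void toy_dict_note_insert(toy_dict *d) {
    toy_dict_lock(d);
    ptrdiff_t n = atomic_load_explicit(&d->used, memory_order_relaxed);
    atomic_store_explicit(&d->used, n + 1, memory_order_relaxed);
    toy_dict_unlock(d);
}

/* Like dict.clear: a multi-step mutation in a real table, so it also locks. */
static void toy_dict_clear(toy_dict *d) {
    toy_dict_lock(d);
    /* ... a real table would release its entries here ... */
    atomic_store_explicit(&d->used, 0, memory_order_relaxed);
    toy_dict_unlock(d);
}
```

A reader calling `toy_dict_len` may see the count from just before or just after a concurrent update, but never a torn value, which is all a length query needs.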
@colesbury , I made the PR #112247 that covers the most obvious cases. Lots of methods are still to be converted, mostly the ones that access a single element and are expected to optimistically avoid locking. Most of them do not access |
Regarding:
This may be a non-issue in 3.12 and beyond. With the introduction of managed dicts, when the values are all stored in the in-line array the reference count on the shared dict keys is not incremented. If we make a full dict for the object then we will end up initializing it and adding a reference, but as far as the Python test suite is concerned that seems to be a rare event: most processes hit it fewer than 10 times, and a handful hit it around 100.
I think this avoiding refcounting The prevalence of |
Can you explain more what you were thinking about freeing them during cyclic GC then? Given that Python's GC drives down reference counts I'm not 100% sure how that'd work. The only thing I can imagine working is that we maintain a separate list of all shared keys and note which ones weren't visited during a full GC, although maybe you had something better in mind :)
That's the basic idea, but the list only needs to bother with shared keys for types that were deallocated. If the type is still alive, then the shared keys must be alive too. The strategy in
The collection is simpler than general PyObject collection because keys objects are private data structures that can only be referenced by dicts.
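One way to read the scheme discussed in these comments is as a small mark-and-sweep over a list of "orphaned" shared keys, meaning keys whose type has been deallocated. The sketch below is an assumption about how that could look, not CPython code; `toy_keys`, `orphaned`, and the three functions are hypothetical names.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical shared-keys record; this is not CPython's PyDictKeysObject. */
typedef struct toy_keys {
    bool visited;              /* set when a live dict is seen using these keys */
    struct toy_keys *next;     /* link in the list of orphaned keys */
} toy_keys;

/* Shared keys whose owning type has been deallocated. */
static toy_keys *orphaned = NULL;

/* Called when a type dies while dicts may still use its shared keys. */
static void toy_orphan_keys(toy_keys *k) {
    k->visited = false;
    k->next = orphaned;
    orphaned = k;
}

/* Called for each live dict found during a full collection. */
static void toy_visit_dict_keys(toy_keys *k) {
    k->visited = true;
}

/* After traversal: free unvisited orphans and reset the mark on survivors.
   Returns the number of keys objects freed. */
static int toy_sweep_orphans(void) {
    int freed = 0;
    toy_keys **link = &orphaned;
    while (*link != NULL) {
        toy_keys *k = *link;
        if (k->visited) {
            k->visited = false;    /* keep it, but re-check next collection */
            link = &k->next;
        } else {
            *link = k->next;       /* unlink, then release */
            free(k);
            freed++;
        }
    }
    return freed;
}
```

Keys that belong to a live type never enter the list, which matches the observation that they must be alive as long as the type is.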
* Move more dict objects to argument clinic
* Improve doc strings
* More doc string improvements
* Update Objects/dictobject.c

Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
… thread safety (#114512)

* Bring in a subset of biased reference counting: colesbury/nogil@b6b12a9a94e

The NoGIL branch has functions for attempting to do an incref on an object which may or may not be in flight. This just brings those functions over so that they will be usable from the dict implementation to get items w/o holding a lock.

There's a handful of small simple modifications:

* Adding inline to the force inline functions to avoid a warning, and switching from _Py_ALWAYS_INLINE to Py_ALWAYS_INLINE as that's available
* Remove _Py_REF_LOCAL_SHIFT as it doesn't exist yet (and is currently 0 in the 3.12 nogil branch anyway)
* ob_ref_shared is currently Py_ssize_t and not uint32_t, so use that
* _PY_LIKELY doesn't exist, so drop it
* _Py_ThreadLocal becomes _Py_IsOwnedByCurrentThread
* Add '_PyInterpreterState_GET()' to _Py_IncRefTotal calls.

Co-Authored-By: Sam Gross <colesbury@gmail.com>
…ents (#114568) Dictionary global version counter should use atomic increments
…ty (#114629) Refactor dict lookup functions to use force inline helpers
… when reading lock-free (python#115786)
Free objects with qsbr if shared
Make _PyDict_LoadGlobal threadsafe
…fe (#114742) Make instance attributes stored in inline "dict" thread safe on free-threaded builds
Lock shared keys in `Py_dict_lookup` and use thread-safe lookup in `insertdict` Co-authored-by: Sam Gross <colesbury@gmail.com>
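The issue description motivates a lock on shared keys by noting that several threads may insert into the same shared keys object at once. A minimal sketch of that idea follows, assuming a fixed-size table and a spinlock; `toy_shared_keys` and `toy_keys_intern` are hypothetical and do not mirror the real `PyDictKeysObject` layout or the code in this commit.

```c
#include <stdatomic.h>
#include <string.h>

#define TOY_MAX_KEYS 8

/* Hypothetical shared key table; this is not CPython's PyDictKeysObject. */
typedef struct {
    atomic_flag lock;                  /* guards the table while keys are added */
    int nkeys;
    const char *names[TOY_MAX_KEYS];
} toy_shared_keys;

static void toy_keys_init(toy_shared_keys *k) {
    atomic_flag_clear(&k->lock);
    k->nkeys = 0;
}

/* Return the slot index for name, inserting it if needed; -1 if the table is full. */
static int toy_keys_intern(toy_shared_keys *k, const char *name) {
    int index = -1;
    while (atomic_flag_test_and_set_explicit(&k->lock, memory_order_acquire)) {
        /* spin until the current holder clears the flag */
    }
    for (int i = 0; i < k->nkeys; i++) {
        if (strcmp(k->names[i], name) == 0) {
            index = i;                 /* another dict already added this key */
            break;
        }
    }
    if (index < 0 && k->nkeys < TOY_MAX_KEYS) {
        index = k->nkeys;
        k->names[k->nkeys++] = name;
    }
    atomic_flag_clear_explicit(&k->lock, memory_order_release);
    return index;
}
```

Because the search and the append happen under one lock, two instances that add the same attribute name concurrently end up agreeing on a single slot index.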
@DinoV is the dict work finished? Should we close this issue?
I think so, anything else that comes up would just be a bug.
use thread state set of dict versions
Fix dict thread safety issues (cherry picked from commit f0ed186) Co-authored-by: Germán Méndez Bravo <kronuz@fb.com>
Fix dict thread safety issues
We are going to need a variety of techniques to make dictionaries thread-safe in `--disable-gil` builds. I expect this to be implemented across multiple PRs.

For context, here is the change from the `nogil-3.12` fork, but things might be done a bit differently in CPython 3.13: colesbury/nogil-3.12@d896dfc8db

* `PyDictKeysObject` needs special handling (see below)
* `PyDictOrValues` needs special handling (see below)

In general, we want to avoid making changes which hurt the performance of the default build. The "critical sections" are no-ops in the default build, but we will need to take care with other changes. Some may be behind `#ifdef` guards.

PyDictKeysObject

Note that `PyDictKeysObject` is NOT a `PyObject` subclass.

We need a mutex in `PyDictKeysObject` because multiple threads may concurrently insert keys into shared `PyDictKeysObject`. The mutex is only used for modifications to "shared" keys; non-shared keys rely on the locking for the dict object.

In `--disable-gil` builds, we want to avoid refcounting shared `PyDictKeysObject` for performance and thread-safety reasons. Instead, shared keys objects should be freed during cyclic GC. (Non-shared keys don't need reference counting because they are "owned" by a single dict object.) We may want to consider making this change for both the default build and `--disable-gil` builds to make maintenance easier if there is not a negative perf impact.

PyDictOrValues

`PyDictOrValues` is the "managed" dict in some PyObject pre-headers. We need some locking/atomic operations to handle the transition between `PyDictValues*` and storing a `PyDictObject*`.
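As an illustration of the atomic option, the transition can be modelled as a single word that holds either a tagged values pointer or a dict pointer, switched with a compare-and-swap. The low-bit tag, `toy_dict_or_values`, and the helper names are assumptions made for this sketch; the issue does not say which mechanism CPython ends up using.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed tag: the low bit marks a "values" pointer (requires aligned pointers). */
#define TOY_VALUES_TAG ((uintptr_t)1)

/* Hypothetical slot holding either a tagged values pointer or a dict pointer. */
typedef struct {
    _Atomic uintptr_t word;
} toy_dict_or_values;

static void toy_slot_init_values(toy_dict_or_values *s, void *values) {
    atomic_init(&s->word, (uintptr_t)values | TOY_VALUES_TAG);
}

static bool toy_slot_is_values(toy_dict_or_values *s) {
    return (atomic_load_explicit(&s->word, memory_order_acquire) & TOY_VALUES_TAG) != 0;
}

/* Try to replace the inline values with a materialized dict.
   Returns the dict that ends up installed: ours, or the one that won the race. */
static void *toy_slot_install_dict(toy_dict_or_values *s, void *dict) {
    uintptr_t seen = atomic_load_explicit(&s->word, memory_order_acquire);
    while (seen & TOY_VALUES_TAG) {
        if (atomic_compare_exchange_weak_explicit(&s->word, &seen, (uintptr_t)dict,
                                                  memory_order_acq_rel,
                                                  memory_order_acquire)) {
            return dict;
        }
        /* A failed exchange reloads seen; the loop re-checks the tag. */
    }
    return (void *)seen;
}
```

If two threads materialize a dict for the same object at the same time, only one exchange succeeds, and the loser is handed the winner's dict and can discard its own.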
Optimistically Avoid Locking
See https://peps.python.org/pep-0703/#optimistically-avoiding-locking.
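The linked PEP section describes a fast path that loads the item, increments its reference count only if it is non-zero, and then re-checks that the item is still in place, falling back to the lock otherwise. A rough C11 sketch of that sequence is below. `toy_obj`, `toy_slot`, and the functions are hypothetical; the sketch uses a plain atomic counter instead of biased reference counting, and it assumes an object's memory stays readable even after its count reaches zero (nothing is freed here).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object; this is not CPython's PyObject. */
typedef struct {
    _Atomic long refcnt;
} toy_obj;

/* Hypothetical one-slot container guarded by a tiny spinlock. */
typedef struct {
    atomic_flag lock;
    toy_obj *_Atomic item;
} toy_slot;

/* Increment the refcount only if it is still non-zero. */
static bool toy_try_incref(toy_obj *o) {
    long n = atomic_load_explicit(&o->refcnt, memory_order_relaxed);
    while (n > 0) {
        if (atomic_compare_exchange_weak_explicit(&o->refcnt, &n, n + 1,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed)) {
            return true;
        }
        /* n now holds the latest count; retry while it is positive. */
    }
    return false;
}

static void toy_decref(toy_obj *o) {
    atomic_fetch_sub_explicit(&o->refcnt, 1, memory_order_acq_rel);
}

/* Try the lock-free path first; fall back to the lock if the item is dying or moved. */
static toy_obj *toy_slot_get(toy_slot *s) {
    toy_obj *o = atomic_load_explicit(&s->item, memory_order_acquire);
    if (o != NULL && toy_try_incref(o)) {
        if (atomic_load_explicit(&s->item, memory_order_acquire) == o) {
            return o;                  /* still in place: no lock was needed */
        }
        toy_decref(o);                 /* the slot changed underneath us */
    }
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire)) {
        /* spin until the current holder clears the flag */
    }
    o = atomic_load_explicit(&s->item, memory_order_relaxed);
    if (o != NULL) {
        /* Writers hold the lock, so the slot's own reference keeps o alive here. */
        atomic_fetch_add_explicit(&o->refcnt, 1, memory_order_relaxed);
    }
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
    return o;
}
```

The conditional increment is the part that makes the fast path safe: a plain increment could resurrect an object whose count had already dropped to zero.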
Linked PRs
`dict` operations thread-safe without GIL #112247