-
-
Notifications
You must be signed in to change notification settings - Fork 30.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-115103: Implement delayed memory reclamation (QSBR) #115180
Conversation
This adds a safe memory reclamation scheme based on FreeBSD's "GUS" and quiescent state based reclamation (QSBR). The API provides a mechanism for callers to detect when it is safe to free memory that may be concurrently accessed by readers.
Notes for reviewers:
|
c0034b7
to
b03c5ca
Compare
if (QSBR_LT(rd_seq, min_seq)) { | ||
// It's okay if the compare-exchange failed: another thread updated it | ||
(void)_Py_atomic_compare_exchange_uint64(&shared->rd_seq, &rd_seq, min_seq); | ||
rd_seq = min_seq; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the compare/exchange fails is it worth returning the greater of rd_seq
and min_seq
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't matter for correctness and I'm not sure it's worth any extra complexity for performance.
Python/qsbr.c
Outdated
// Starting size of the array of qsbr thread states | ||
#define MIN_ARRAY_SIZE 8 | ||
|
||
// The shared write sequence is always odd and incremented by two. Detached |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we document why the shared write sequence is always odd? I'm assuming that's just to avoid issues with it wrapping around and colliding with QSBR_OFFLINE
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that was the original motivation, but as I wrote in another comment, with 64-bit counters we don't really have to worry about wrap around.
I'm still tempted to keep this numbering scheme and the QBSR_LT
wrap-around safe comparisons, but let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine to keep it this way, it's just worth a comment on why it's that way :)
@@ -169,6 +169,10 @@ extern PyTypeObject _PyExc_MemoryError; | |||
{ .threshold = 10, }, \ | |||
}, \ | |||
}, \ | |||
.qsbr = { \ | |||
.wr_seq = 1, \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these be inited to QSBR_INITIAL
(it seems like including pycore_qsbr.h
shouldn't be an issue?)
// If there are no free entries, we pause all threads, grow the array, | ||
// and update the pointers in PyThreadState to entries in the new array. | ||
if (qsbr == NULL) { | ||
_PyEval_StopTheWorld(interp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this isn't safe to do with the mutex locked and not-detached? If another thread is trying to reserve at the same time it won't be detached, but it also won't be responding to the request for the eval breaker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right -- I mixed up the locking discipline. The outer lock should not use _Py_LOCK_DONT_DETACH
. I think the locking discipline with stop-the-world should be:
- A lock can be acquired before starting a stop-the-world pause or after starting it, but not both (consistent lock ordering)
- If acquired before the stop-the-world, the acquisition should use
_Py_LOCK_DETACH
(the default) - If acquired within the stop-the-world, the acquisition should use
_Py_LOCK_DONT_DETACH
HEAD_LOCK(runtime)
is an example of case 3. This is case 2.
The reasoning for case (3) is a bit subtle: locks can be directly handed off to a detached thread, so with a normal PyMutex_Lock(&m)
a thread can both hold the mutex m
and have paused for another thread's stop-the-world.
Python/qsbr.c
Outdated
|
||
// Initialize (or reintialize) the freelist of QSBR thread states | ||
static void | ||
initialize_freelist(struct _qsbr_shared *shared) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe initialize_new_array
or something like that? It's just that it's not only initing the free lists, it's also initializing all of the thread state pointers into the array (and the new part covers the re-initialization)
uint64_t | ||
_Py_qsbr_advance(struct _qsbr_shared *shared) | ||
{ | ||
return _Py_atomic_add_uint64(&shared->wr_seq, QSBR_INCR) + QSBR_INCR; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The FreeBSD version has logic to avoid making sure that wr_seq
doesn't get too far ahead from rd_seq
and if so blocks until it catches up to avoid extreme wrap around issues... Is there a reason we don't need that? :). Or should we at least add a TODO and/or an assertion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are using 64-bit sequence counters to avoid worrying about wrap around. FreeBSD uses 32-bit counters. (2^62 ns = ~146 years; 2^30 ns = ~1 sec)
We could consider using 32-bit counters and handling wrap around in the future. It wouldn't matter much for x86-64 and aarch64, but would probably be more efficient on some other platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, that makes sense, FreeBSD's is buried in a typedef so I hadn't noticed it was 32-bit.
return true; | ||
} | ||
|
||
rd_seq = qsbr_poll_scan(qsbr->shared); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably just related to future scaling, but I wonder if we could pass in the goal and bail on the full-scan if any thread hasn't reached it yet. It might prevent us from updating the shared for someone else who would benefit from it though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me if that case will happen often enough to be worth handling. Unfortunately, we can't really measure or test for that until we disable the GIL. With the GIL, only one thread is ever attached and active.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@sethmlarson, does this need to be included in the SBOM? This includes code derived from a FreeBSD component, but there isn't really any shared code. In other words, it's derivative enough that I think the copyright notice should be preserved, but different enough that if there were any security vulnerabilities in the FreeBSD component, they would not be relevant here. |
@colesbury Thanks for the ping! Great question, the SBOM for CPython actually doesn't concern itself with copyright or licensing right now, it's primary use-case is for vulnerability management and tracking. Given this I don't think we need to update the SBOM for this contribution. |
…115180) This adds a safe memory reclamation scheme based on FreeBSD's "GUS" and quiescent state based reclamation (QSBR). The API provides a mechanism for callers to detect when it is safe to free memory that may be concurrently accessed by readers.
…115180) This adds a safe memory reclamation scheme based on FreeBSD's "GUS" and quiescent state based reclamation (QSBR). The API provides a mechanism for callers to detect when it is safe to free memory that may be concurrently accessed by readers.
This adds a safe memory reclamation scheme for the free-threaded build based on FreeBSD's "GUS" and quiescent state based reclamation (QSBR). The API provides a mechanism for callers to detect when it is safe to free memory that may be concurrently accessed by readers.
Related PRs that build on this:
📚 Documentation preview 📚: https://cpython-previews--115180.org.readthedocs.build/