gh-115103: Implement delayed memory reclamation (QSBR) #115180

Merged · 9 commits from gh-115103-qsbr into python:main · Feb 16, 2024

Conversation

colesbury (Contributor) commented on Feb 8, 2024:

This adds a safe memory reclamation scheme for the free-threaded build based on FreeBSD's "GUS" and quiescent state based reclamation (QSBR). The API provides a mechanism for callers to detect when it is safe to free memory that may be concurrently accessed by readers.

Related PRs that build on this:


📚 Documentation preview 📚: https://cpython-previews--115180.org.readthedocs.build/

colesbury (Contributor Author):

Notes for reviewers:

  • This PR does not contain any uses of the API; I wanted to keep this PR a manageable size. I'll put up a PR that makes use of these APIs soon (a usage sketch follows this list).
  • The per-thread states (struct _qsbr_thread_state) are stored in a contiguous array to make scanning quicker, but this makes the thread initialization code a bit more complicated.
  • I think this is one of the cases where memory orderings are particularly tricky and hard to reason about intuitively. I found it helpful to verify the orderings and fences with CDSChecker. This is the model I used: https://github.com/colesbury/c11-model-checker/blob/cpython-models/test/qsbr.c.
  • _Py_qsbr_poll() will have scaling issues with large numbers of threads. That's something we'll want to deal with eventually, but possibly not in the 3.13 timeframe.
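
For illustration, a minimal sketch of how a caller might drive this API. Only _Py_qsbr_advance() and _Py_qsbr_poll() (with the signatures visible in this PR) are real; the retire list and both helpers are hypothetical:

// Sketch (not from this PR): retire pointers tagged with the current
// write sequence, and free them once every thread has passed that point.
struct to_free {
    void *ptr;
    uint64_t seq;                 // goal sequence recorded at retirement
    struct to_free *next;
};

static void
retire(struct _qsbr_thread_state *qsbr, struct to_free **list, void *ptr)
{
    struct to_free *node = PyMem_RawMalloc(sizeof(*node));
    if (node == NULL) {
        return;                   // error handling elided in this sketch
    }
    node->ptr = ptr;
    node->seq = _Py_qsbr_advance(qsbr->shared);   // bump the global sequence
    node->next = *list;
    *list = node;
}

static void
process_frees(struct _qsbr_thread_state *qsbr, struct to_free **list)
{
    struct to_free **p = list;
    while (*p != NULL) {
        struct to_free *node = *p;
        if (_Py_qsbr_poll(qsbr, node->seq)) {     // all readers caught up
            *p = node->next;
            PyMem_RawFree(node->ptr);
            PyMem_RawFree(node);
        }
        else {
            p = &node->next;
        }
    }
}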

if (QSBR_LT(rd_seq, min_seq)) {
    // It's okay if the compare-exchange failed: another thread updated it
    (void)_Py_atomic_compare_exchange_uint64(&shared->rd_seq, &rd_seq, min_seq);
    rd_seq = min_seq;
}
DinoV (Contributor): If the compare/exchange fails, is it worth returning the greater of rd_seq and min_seq?

colesbury (Contributor Author):

It doesn't matter for correctness and I'm not sure it's worth any extra complexity for performance.
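
For concreteness, one hypothetical spelling of the reviewer's suggestion (the PR keeps the simpler unconditional assignment; this assumes, as in C11, that a failed compare-exchange reloads rd_seq with the current shared value):

if (!_Py_atomic_compare_exchange_uint64(&shared->rd_seq, &rd_seq, min_seq)) {
    // CAS failed: rd_seq now holds the current shared value, which
    // another thread may have advanced past min_seq. Keep the newer one.
    if (QSBR_LT(rd_seq, min_seq)) {
        rd_seq = min_seq;
    }
}
else {
    rd_seq = min_seq;
}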

Python/qsbr.c Outdated
// Starting size of the array of qsbr thread states
#define MIN_ARRAY_SIZE 8

// The shared write sequence is always odd and incremented by two. Detached
DinoV (Contributor):
Can we document why the shared write sequence is always odd? I'm assuming that's just to avoid issues with it wrapping around and colliding with QSBR_OFFLINE?

colesbury (Contributor Author):
Yeah, that was the original motivation, but as I wrote in another comment, with 64-bit counters we don't really have to worry about wrap around.

I'm still tempted to keep this numbering scheme and the QSBR_LT wrap-around-safe comparisons, but let me know what you think.

DinoV (Contributor): I think it's fine to keep it this way; it's just worth a comment on why it's that way :)
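
For reference, a wrap-around-safe comparison of the kind discussed here can be spelled as follows (a sketch; the macro in the PR may differ in detail):

// Treat the difference as signed: ordering is preserved across 64-bit
// wrap-around as long as the two sequence numbers are within 2^63.
#define QSBR_LT(a, b) ((int64_t)((a) - (b)) < 0)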

@@ -169,6 +169,10 @@ extern PyTypeObject _PyExc_MemoryError;
{ .threshold = 10, }, \
}, \
}, \
.qsbr = { \
.wr_seq = 1, \
DinoV (Contributor):
Can these be inited to QSBR_INITIAL (it seems like including pycore_qsbr.h shouldn't be an issue?)
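
Presumably that constant would look something like this (a hypothetical spelling, assuming the write sequence starts at the first odd value):

// pycore_qsbr.h (sketch): initial write sequence, the first odd value.
#define QSBR_INITIAL 1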

// If there are no free entries, we pause all threads, grow the array,
// and update the pointers in PyThreadState to entries in the new array.
if (qsbr == NULL) {
_PyEval_StopTheWorld(interp);
DinoV (Contributor):
It seems like this isn't safe to do with the mutex locked and not-detached? If another thread is trying to reserve at the same time it won't be detached, but it also won't be responding to the request for the eval breaker.

colesbury (Contributor Author):

You are right -- I mixed up the locking discipline. The outer lock should not use _Py_LOCK_DONT_DETACH. I think the locking discipline with stop-the-world should be:

  1. A lock can be acquired before starting a stop-the-world pause or after starting it, but not both (consistent lock ordering)
  2. If acquired before the stop-the-world, the acquisition should use _Py_LOCK_DETACH (the default)
  3. If acquired within the stop-the-world, the acquisition should use _Py_LOCK_DONT_DETACH

HEAD_LOCK(runtime) is an example of case 3. This is case 2.

The reasoning for case (3) is a bit subtle: locks can be directly handed off to a detached thread, so with a normal PyMutex_Lock(&m) a thread can both hold the mutex m and have paused for another thread's stop-the-world.
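
Put as code, cases (2) and (3) might look roughly like this (a sketch using the lock and stop-the-world helpers named above; exact mutexes and call sites differ):

// Case 2: lock taken before the stop-the-world pause. Use the default
// (detaching) acquire so this thread can still be paused while waiting.
PyMutex_Lock(&shared->mutex);
_PyEval_StopTheWorld(interp);
// ... grow and swap the array ...
_PyEval_StartTheWorld(interp);
PyMutex_Unlock(&shared->mutex);

// Case 3: lock taken inside an existing stop-the-world pause. Do not
// detach, so the lock cannot be handed to us while we are paused.
// (Roughly what HEAD_LOCK(runtime) does.)
PyMutex_LockFlags(&runtime->interpreters.mutex, _Py_LOCK_DONT_DETACH);
// ... critical section ...
PyMutex_Unlock(&runtime->interpreters.mutex);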

Python/qsbr.c Outdated

// Initialize (or reinitialize) the freelist of QSBR thread states
static void
initialize_freelist(struct _qsbr_shared *shared)
DinoV (Contributor):
Maybe initialize_new_array or something like that? It's just that it's not only initing the free lists, it's also initializing all of the thread state pointers into the array (and the new part covers the re-initialization)

uint64_t
_Py_qsbr_advance(struct _qsbr_shared *shared)
{
    return _Py_atomic_add_uint64(&shared->wr_seq, QSBR_INCR) + QSBR_INCR;
}
DinoV (Contributor):
The FreeBSD version has logic to make sure that wr_seq doesn't get too far ahead of rd_seq, blocking until rd_seq catches up, to avoid extreme wrap-around issues... Is there a reason we don't need that? :) Or should we at least add a TODO and/or an assertion?

colesbury (Contributor Author):

We are using 64-bit sequence counters to avoid worrying about wrap around. FreeBSD uses 32-bit counters. (2^62 ns = ~146 years; 2^30 ns = ~1 sec)

We could consider using 32-bit counters and handling wrap around in the future. It wouldn't matter much for x86-64 and aarch64, but would probably be more efficient on some other platforms.
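
A quick check of those numbers (a standalone back-of-envelope program, not part of the PR, assuming the counter advances once per nanosecond):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const double ns_per_year = 365.25 * 24 * 3600 * 1e9;
    printf("2^62 ns = ~%.0f years\n",
           (double)(UINT64_C(1) << 62) / ns_per_year);   // ~146 years
    printf("2^30 ns = ~%.2f seconds\n",
           (double)(UINT32_C(1) << 30) / 1e9);           // ~1.07 seconds
    return 0;
}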

DinoV (Contributor):
Okay, that makes sense, FreeBSD's is buried in a typedef so I hadn't noticed it was 32-bit.

return true;
}

rd_seq = qsbr_poll_scan(qsbr->shared);
DinoV (Contributor):
This is probably just related to future scaling, but I wonder if we could pass in the goal and bail on the full-scan if any thread hasn't reached it yet. It might prevent us from updating the shared for someone else who would benefit from it though...

colesbury (Contributor Author):

It's not clear to me if that case will happen often enough to be worth handling. Unfortunately, we can't really measure or test for that until we disable the GIL. With the GIL, only one thread is ever attached and active.
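
For the record, the suggested early-exit scan might look roughly like this (hypothetical; the field names are guesses at this PR's internals, and the PR deliberately does not do this):

// Sketch of an early-exit scan: bail as soon as any attached thread is
// still behind the goal. Cheaper for this caller, but it never advances
// the shared rd_seq for the benefit of other pollers.
static bool
qsbr_poll_scan_until(struct _qsbr_shared *shared, uint64_t goal)
{
    for (Py_ssize_t i = 0; i < shared->size; i++) {
        uint64_t seq = _Py_atomic_load_uint64(&shared->array[i].qsbr.seq);
        if (seq != QSBR_OFFLINE && QSBR_LT(seq, goal)) {
            return false;    // this thread has not passed the goal yet
        }
    }
    return true;
}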

DinoV (Contributor) left a review:

LGTM!

colesbury (Contributor Author):

@sethmlarson, does this need to be included in the SBOM? This includes code derived from a FreeBSD component, but there isn't really any shared code. In other words, it's derivative enough that I think the copyright notice should be preserved, but different enough that if there were any security vulnerabilities in the FreeBSD component, they would not be relevant here.

sethmlarson (Contributor):

@colesbury Thanks for the ping! Great question: the SBOM for CPython actually doesn't concern itself with copyright or licensing right now; its primary use case is vulnerability management and tracking. Given this, I don't think we need to update the SBOM for this contribution.

colesbury merged commit 5903190 into python:main on Feb 16, 2024 (35 checks passed), then deleted the gh-115103-qsbr branch on February 16, 2024 at 20:25.
woodruffw pushed a commit to woodruffw-forks/cpython that referenced this pull request on Mar 4, 2024.
diegorusso pushed a commit to diegorusso/cpython that referenced this pull request on Apr 17, 2024.