GH-126491: Lower heap size limit with faster marking #127519
!buildbot iOS
🤖 New build scheduled with the buildbot fleet by @markshannon for commit 8893cf5 🤖 The command will test the builders whose names match following regular expression: The builders matched are:
!buildbot Android
🤖 New build scheduled with the buildbot fleet by @markshannon for commit 8893cf5 🤖 The command will test the builders whose names match following regular expression: The builders matched are:
!buildbot Android|iOS
🤖 New build scheduled with the buildbot fleet by @markshannon for commit 8262bf0 🤖 The command will test the builders whose names match following regular expression: The builders matched are:
Performance is a wash overall, but I think that is an artifact of our benchmarks. I would expect this to perform better on larger heaps and consume less memory, although the benchmarks show no overall change in memory consumption. Note that the "create gc cycles" benchmark shows a 10% speedup and "gc traversal" an 8% speedup, but there is an equivalent slowdown on the "xml etree" benchmarks.
InternalDocs/garbage_collector.md
Outdated
To work out how much work we need to do, consider a heap with `L` live objects
and `G0` garbage objects at the start of a full scavenge and `G1` garbage objects
at the end of the scavenge. We don't want amount of garbage to grow, `G1 ≤ G0`, and
Suggested change:
- at the end of the scavenge. We don't want amount of garbage to grow, `G1 ≤ G0`, and
+ at the end of the scavenge. We don't want the amount of garbage to grow, `G1 ≤ G0`, and
InternalDocs/garbage_collector.md
Outdated
The number of new objects created `N` must be at least the new garbage created, `N ≥ G1`,
assuming that the number of live objects remains roughly constant.
If we set `T == 4*N` we get `T > 4*G1` and `T = L + G0 + G1` => `L + G0 > 3G1`
For a steady state heap `G0 == G1` we get `L > 2G` and the desired garbage ratio.
Suggested change:
- For a steady state heap `G0 == G1` we get `L > 2G` and the desired garbage ratio.
+ For a steady state heap `G0 == G1` we get `L > 2*G0` and the desired garbage ratio.
InternalDocs/garbage_collector.md
Outdated
The number of new objects created `N` must be at least the new garbage created, `N ≥ G1`,
assuming that the number of live objects remains roughly constant.
If we set `T == 4*N` we get `T > 4*G1` and `T = L + G0 + G1` => `L + G0 > 3G1`
For a steady state heap `G0 == G1` we get `L > 2G` and the desired garbage ratio.
Suggested change:
- For a steady state heap `G0 == G1` we get `L > 2G` and the desired garbage ratio.
+ For a steady state heap (`G0 == G1`) we get `L > 2G` and the desired garbage ratio.
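As a sanity check on the arithmetic in the passage under review, here is a minimal C sketch of the steady-state case. The function names are hypothetical and not part of `Python/gc.c`; the sketch assumes `G0 == G1 == G` and treats the doc's strict inequalities as non-strict bounds.

```c
#include <stdint.h>

/* Objects visited in one full scavenge: T = L + G0 + G1
 * (symbols as in the doc text quoted above). */
int64_t scavenge_total(int64_t live, int64_t g0, int64_t g1)
{
    return live + g0 + g1;
}

/* Hypothetical helper. With T = 4*N and N >= G1 we have T >= 4*G1,
 * so L + G0 >= 3*G1. In the steady state (G0 == G1 == G) this
 * reduces to G <= L/2. */
int64_t max_steady_state_garbage(int64_t live)
{
    return live / 2;
}
```

For `L = 1000` this gives `G ≤ 500`, so garbage is at most 500 of 1500 objects, which is the (roughly) 1/3 of the heap mentioned in the PR description.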
InternalDocs/garbage_collector.md
Outdated
If we choose the amount of work done such that `2*M + I == 6N` then we can do
less work in most cases, but are still guaranteed to keep up.
Since `I ≥ G0 + G1` (not strictly true, but close enough)
Suggested change:
- Since `I ≥ G0 + G1` (not strictly true, but close enough)
+ Since `I ≈ G0 + G1` (not strictly true, but close enough)
The increments (`I`) can include some of the live heap, depending on how much is kept alive by C extensions. So `≥` is more correct, although `≳` is even more correct.
InternalDocs/garbage_collector.md
Outdated
If we choose the amount of work done such that `2*M + I == 6N` then we can do
less work in most cases, but are still guaranteed to keep up.
Since `I ≥ G0 + G1` (not strictly true, but close enough)
`T == M + I == (6N + I)/2` and `(6N + I)/2 ≥ 4G`, so we can keep up.
Suggested change:
- `T == M + I == (6N + I)/2` and `(6N + I)/2 ≥ 4G`, so we can keep up.
+ `T == M + I == (6N + I)/2` and `(6N + I)/2 ≳ 4G`, so we can keep up.
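The work budget in the quoted lines can be checked with a small C sketch. The helper names are hypothetical, chosen only for illustration; the sketch assumes `6*N - I` is even and non-negative so the integer division is exact.

```c
#include <stdint.h>

/* Marking work implied by 2*M + I == 6*N, i.e. M = (6*N - I)/2.
 * n is the number of new objects (N), i the increment work (I). */
int64_t marking_work(int64_t n, int64_t i)
{
    return (6 * n - i) / 2;
}

/* Total work T = M + I, which simplifies to (6*N + I)/2. */
int64_t total_work(int64_t n, int64_t i)
{
    return marking_work(n, i) + i;
}
```

Taking the steady state `G0 == G1 == G` with `N == G == 100` and `I == G0 + G1 == 200`, `total_work(100, 200)` is 400, exactly `4G`; any larger `N` or `I` only increases `T`, which is the "keep up" bound in the text.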
InternalDocs/garbage_collector.md
Outdated
`T == M + I == (6N + I)/2` and `(6N + I)/2 ≥ 4G`, so we can keep up.
The reason that this improves performance is that `M` is usually much larger
than `I` Suppose `M == 10I`, then `T ≅ 3N`.
Suggested change:
- than `I` Suppose `M == 10I`, then `T ≅ 3N`.
+ than `I`. If `M == 10I`, then `T ≅ 3N`.
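To see where `T ≅ 3N` comes from, substitute `M == r*I` into `2*M + I == 6N`. The following C sketch does that with integer arithmetic; the function name and the ratio parameter `r` are assumptions made for this example.

```c
#include <stdint.h>

/* If M == r*I, then 2*M + I == 6*N gives I = 6*N/(2*r + 1),
 * and T = M + I = (r + 1)*I = 6*N*(r + 1)/(2*r + 1).
 * Integer division truncates, so pick n divisible by (2*r + 1)
 * for an exact result. */
int64_t total_work_for_ratio(int64_t n, int64_t r)
{
    return 6 * n * (r + 1) / (2 * r + 1);
}
```

With `N == 700` and `r == 10` this returns 2200, about `3.14*N`. As `r` grows the result approaches `3*N`, and with no marking at all (`r == 0`) it is `6*N`.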
@@ -1558,24 +1547,26 @@ mark_at_start(PyThreadState *tstate)
 static intptr_t
 assess_work_to_do(GCState *gcstate)
 {
-    /* The amount of work we want to do depends on three things.
+    /* The amount of work we want to do depends on two things.
Is it worth linking to the doc from here?
Python/gc.c
Outdated
      */
     intptr_t scale_factor = gcstate->old[0].threshold;
     if (scale_factor < 2) {
         scale_factor = 2;
     }
     intptr_t new_objects = gcstate->young.count;
-    intptr_t max_heap_fraction = new_objects*3/2;
+    intptr_t max_heap_fraction = new_objects*5;
Why is this called `fraction`?
Not for any good reason. I'll rename it.
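For readers following the hunk, the two values it computes can be isolated in a standalone C sketch. The struct and function names below are hypothetical stand-ins (the real code reads these fields from `GCState`), and how the cap is used in the rest of `assess_work_to_do` is not shown in this thread, so the sketch stops at the lines quoted above.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two GCState fields read in the hunk. */
struct gc_inputs {
    intptr_t old0_threshold;  /* gcstate->old[0].threshold */
    intptr_t young_count;     /* gcstate->young.count */
};

/* Scale factor as in the hunk: the threshold, but never below 2. */
intptr_t clamped_scale_factor(const struct gc_inputs *in)
{
    intptr_t scale_factor = in->old0_threshold;
    if (scale_factor < 2) {
        scale_factor = 2;
    }
    return scale_factor;
}

/* Upper bound from the revised line: five times the new objects.
 * The name is a placeholder; the thread only says the variable
 * will be renamed, not what it will be called. */
intptr_t heap_work_cap(const struct gc_inputs *in)
{
    return in->young_count * 5;
}
```

With a threshold of 0 and 2000 new objects, the scale factor is clamped to 2 and the cap is 10000.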
With marking added to the cyclic GC (#127110) we spend a lot of the time in the GC forming transitive closures, both for marking and for the increments of the incremental GC.
Unfortunately, the current algorithm has a couple of mistakes in it: one harmful, one beneficial.
These more or less cancel out.
This PR deliberately counts marking as twice as effective as normal collection, but limits the amount of work done.
To do so, we need to increase the typical amount of work done a bit.
This has the advantage of limiting the amount of garbage to (roughly) 1/3 of the heap.
This PR does two things: