Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The GC often doesn't act generational #40644

Open
oxinabox opened this issue Apr 28, 2021 · 7 comments
Open

The GC often doesn't act generational #40644

oxinabox opened this issue Apr 28, 2021 · 7 comments
Assignees
Labels
GC Garbage collector performance Must go faster

Comments

@oxinabox
Copy link
Contributor

oxinabox commented Apr 28, 2021

Julia has a generational mark and sweep GC.
One of the properties of a mark and sweep GC is that it needs to go through and mark every reference object in memory to decide if it should be swept whenever the GC runs.
This is very expensive if you have a lot of objects in memory, and the cost is incurred even if those objects are not being freed (or even read).
Which is where the generational aspect comes in.
It applies a heuristic to break the objects in memory into generations.
Things that have been referenced for a long time (in some sense) are likely to continue to be referenced, and so are moved into the older generations.
The generational GC should mark and sweep the young generations very often, and the older generations more rarely.
This should mean if you have some large object with lots of references that are e.g. allocated at the start of your program and kept until the end of the program, it will make the GC slow at first since it is so large and takes so long to mark, but after a little while it will move into the older generations and very rarely be marked, and so cease to have an effect on the time it takes to do normal GCs.

The following examples use
ZonedDateTimes from TimeZones@1.5.3 which are for now not isbits though that is being worked on (JuliaTime/TimeZones.jl#271).
You can create a similar example that uses any non-isbits objects such as mutable objects, things with abstract typed fields, or things with non-isbits fields (such as String or any Vector).
This one I had handy.

You can see in this first example the generational GC working.
At first, after the zdts are allocated there is a sharp jump in the time spent on GC, up to around 90%.
But then after a few rounds, the zdts have moved into the older generation,
and the time shifts spent on GC goes back down nicely to what it was at the start: around 20%

julia> # A function that creates a decent amount of garbage,
       little_burn() = sum(hash, ["trashfire"^min(1000, i) for i in 1:10_000])
little_burn (generic function with 1 method)
julia> @time little_burn();
  0.091142 seconds (10.00 k allocations: 82.467 MiB, 20.53% gc time)

julia> using TimeZones
julia> zdts = [now(tz"UTC") for _ in  1:100_000_000];
julia> for i in 1:10
           @time little_burn()
       end
  0.566361 seconds (10.00 k allocations: 82.467 MiB, 89.26% gc time)
  0.620691 seconds (10.00 k allocations: 82.467 MiB, 88.32% gc time)
  0.051410 seconds (10.00 k allocations: 82.467 MiB)
  0.603839 seconds (10.00 k allocations: 82.467 MiB, 90.91% gc time)
  0.060557 seconds (10.00 k allocations: 82.467 MiB, 17.61% gc time)
  0.066966 seconds (10.00 k allocations: 82.467 MiB, 18.17% gc time)
  0.058870 seconds (10.00 k allocations: 82.467 MiB, 19.43% gc time)
  0.060740 seconds (10.00 k allocations: 82.467 MiB, 15.33% gc time)
  0.064070 seconds (10.00 k allocations: 82.467 MiB, 19.75% gc time)
  0.060747 seconds (10.00 k allocations: 82.467 MiB, 21.03% gc time)
  0.061915 seconds (10.00 k allocations: 82.467 MiB, 18.41% gc time)

However, it's very easy to run a different function that does not display the generational behavior
burn() is identical to little_burn but runs for 10x longer and allocates 10x as much memory.
Like little_burn it never actually touches the zdts.
with burn however the GC time always stays at 60-80%.
This is due to it triggering a full sweep of both generations of our GC.
Thus it keeps needing to mark all of the big and complex zdts which is slow.

julia> # A function that creates a decent amount of garbage,
       burn() = sum(hash, ["trashfire"^min(1000, i) for i in 1:100_000])
burn (generic function with 1 method)
julia> @time burn();
  0.913665 seconds (100.00 k allocations: 863.183 MiB, 19.98% gc time)

julia> using TimeZones
julia> zdts = [now(tz"UTC") for _ in  1:100_000_000];
julia> for i in 1:10
           @time burn()
       end
  3.127583 seconds (100.00 k allocations: 863.183 MiB, 78.35% gc time)
  2.073914 seconds (100.00 k allocations: 863.183 MiB, 64.82% gc time)
  2.355717 seconds (100.00 k allocations: 863.183 MiB, 77.97% gc time)
  1.842793 seconds (100.00 k allocations: 863.183 MiB, 66.04% gc time)
  2.335667 seconds (100.00 k allocations: 863.183 MiB, 74.24% gc time)
  1.804693 seconds (100.00 k allocations: 863.183 MiB, 66.61% gc time)
  2.344848 seconds (100.00 k allocations: 863.183 MiB, 73.77% gc time)
  2.396741 seconds (100.00 k allocations: 863.183 MiB, 74.44% gc time)
  1.883768 seconds (100.00 k allocations: 863.183 MiB, 64.91% gc time)
  2.262383 seconds (100.00 k allocations: 863.183 MiB, 77.07% gc time)
  1.835220 seconds (100.00 k allocations: 863.183 MiB, 64.84% gc time)

I have seen real code that takes tens of seconds to do a full mark and sweep.
We really don't want to be running that very often. (if we are running that kind of mark and sweep too often it might even be better to OOM error)

It seems we need to do something to the GC to have it do a full mark and sweep less often.
This might be tweaking the heuristics.
It might be adding another generation.
it might be both

@JeffBezanson JeffBezanson added GC Garbage collector performance Must go faster labels Apr 28, 2021
@oxinabox oxinabox changed the title The GC often does't act generational The GC often doesn't act generational Apr 28, 2021
chflood pushed a commit to chflood/julia that referenced this issue Feb 16, 2022
This helps with JuliaLang#40644 and other programs which suffer from non-productive full collections.
oscardssmith pushed a commit that referenced this issue Mar 9, 2022
* Updated GC Heuristics to avoid a full GC unless absolutely necessary.
This helps with #40644 and other programs which suffer from non-productive full collections.
@JeffBezanson
Copy link
Member

I'm going to say this is fixed by #44215. Of course, please post more examples if there are still bad cases.

@oscardssmith
Copy link
Member

Specifically, if you have examples, https://github.com/JuliaCI/GCBenchmarks is collecting them.

@vtjnash
Copy link
Member

vtjnash commented Jul 26, 2023

I was trying this again today, and my machine apparently just decided to run out of memory instead of ever running GC (over 30GB heap)

@oscardssmith
Copy link
Member

What julia version?

@vtjnash
Copy link
Member

vtjnash commented Jul 26, 2023

For a related example, I paused the task for a bit inside GC to print the stats, and when I resumed it, it decided that it must now wait for 10GB of new allocations before being eligible for the next GC. I am a little confused how it thinks only 67MB are mapped right now also?

(lldb) p gc_heap_stats
(gc_heapstatus_t) $1 = (bytes_mapped = 67_125_248, bytes_resident = 67_125_248, heap_size = 7_192_808_358, heap_target = 17_053_423_782)

I didn't grab stats for the first run, but I noticed on later runs that after the first 5 GB are allocated, usually the next GC is scheduled for after the heap reaches 10-20 GB.

@vtjnash
Copy link
Member

vtjnash commented Jul 26, 2023

julia> versioninfo()
Julia Version 1.11.0-DEV.165
Commit 76a9772c13* (2023-07-24 15:47 UTC)

@gbaraldi
Copy link
Member

Ooops. I'll take a look

oscardssmith pushed a commit that referenced this issue Aug 5, 2023
If something odd happens during GC (the PC goes to sleep) or a very big
transient the heuristics might make a bad decision. What this PR
implements is if we try to make our target more than double the one we
had before we fallback to a more conservative method. This fixes the new
issue @vtjnash found in #40644
for me.
KristofferC pushed a commit that referenced this issue Aug 10, 2023
If something odd happens during GC (the PC goes to sleep) or a very big
transient the heuristics might make a bad decision. What this PR
implements is if we try to make our target more than double the one we
had before we fallback to a more conservative method. This fixes the new
issue @vtjnash found in #40644
for me.

(cherry picked from commit ab94fad)
d-netto pushed a commit to RelationalAI/julia that referenced this issue Oct 4, 2023
If something odd happens during GC (the PC goes to sleep) or a very big
transient the heuristics might make a bad decision. What this PR
implements is if we try to make our target more than double the one we
had before we fallback to a more conservative method. This fixes the new
issue @vtjnash found in JuliaLang#40644
for me.
d-netto pushed a commit that referenced this issue Mar 15, 2024
If something odd happens during GC (the PC goes to sleep) or a very big
transient the heuristics might make a bad decision. What this PR
implements is if we try to make our target more than double the one we
had before we fallback to a more conservative method. This fixes the new
issue @vtjnash found in #40644
for me.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GC Garbage collector performance Must go faster
Projects
None yet
Development

No branches or pull requests

5 participants