[WIP] precise GC as presented at DConf 2013 #1022
Conversation
The tests are currently not passing due to https://issues.dlang.org/show_bug.cgi?id=13738 (dlang/dmd#4147) and https://issues.dlang.org/show_bug.cgi?id=13736 (dlang/dmd#4143).
I've tried to reduce the additional compilation time by replacing some CTFE code with cacheable template instances. For the dmd test suite this brought the gap down from 20 seconds to about 15 seconds. Considering that the test suite spends most of its time not compiling, I benchmarked building Visual D: the debug version takes about 16% longer (22 seconds instead of 19), and the release version is even worse at 23% (43 seconds instead of 35). Please note that Visual D imports most of the Windows SDK and the full Visual Studio SDK, so it contains a lot of struct definitions and interfaces. I guess for this to be acceptable we must make the pointer bitmap a compiler trait. The binary size overhead of the release version was about 0.5% (2491 kB instead of 2480 kB).
I have implemented the trait here: dlang/dmd#4192
This is now available here: #1057
Use avgtime and take the minimum of 10 or 100 runs; that works because noise in benchmarks is mostly additive.
I use it sometimes, but I'm unsure about a sole minimum being very representative (on a mobile processor, where a lot of magic goes on). If there are false pointers, they sometimes don't go away by just rerunning the same program. When comparing precise to imprecise scanning, you might even want to see the effects of false pointers ;-) At the moment, I don't see these shaky results, though.
Updated to recent changes of the GC. With the getPointerBitmap trait, there is no longer a compile time penalty, but it's still 0-20% slower in the benchmarks, mostly during allocation. The benchmarks don't suffer much from false pointers, though.
Twice as big and can't be returned in registers, that looks expensive.
Might be better to use a second stack for the pools in parallel.
I was hoping for any function calls dealing with ScanRange to be inlined, though I didn't check yet.
Or you might try to pack the struct and change passing and return to Range + Pool*, using ref return for pop.
I checked the disassembly: it's all inlined AFAICT. So the struct needs 50% more memory (there is no padding), but there is probably not a lot we can do here. The non-precise version might use the Range struct though, so we could templatize ToScanStack on that type.
Indexing might be a little more expensive than with a struct size of 8/16, so we could also use a pointer to walk the stack.
I'll have a look at the benchmarks; we definitely need to get this fast enough.
src/gc/gc.d
Outdated
You should move this test to before p1 is dereferenced. If something isn't a pointer we should perform as little work as possible. This loop is key to make this thing fast.
The previous versions had some optimization in this respect, but I wanted to make minimal changes to begin with. The benchmarks don't show a large increase in scanning time; the performance decrease is caused by setting the pointer bitmaps.
But it might be possible to make up for this by improving the marking time.
I just tried it: without further optimizations, moving the code above p = *p1 slows down marking by up to 10%. (Win32 on i7 mobile)
src/rt/aaA.d
Outdated
This change is unrelated, but the current version causes a range error when building druntime without -release.
The return value of _d_newarrayU is confusing, as it actually returns the length for T[], not void[]. It should rather just return void*.
Delay until we have an idea how to get this fast, e.g. 2.068?
Yes. I'm also waiting for either dlang/dmd#2480 or dlang/dmd#3958 to make some progress. This GC implementation still works with the incomplete RTInfo by making conservative assumptions, though.
Rebased.
What is the status of this? Will this be merged?
Rebased in preparation for the PR's first anniversary.
@rainers what's the status of this PR? Are there any serious showstoppers before this can be merged, or is it only the performance that you want to improve?
AFAICT this is good enough to be included. @MartinNowak has some reservations regarding performance, but I don't see a faster solution in the near future. I tried some microtuning in the past, but that is rather frustrating when done against the dmd backend. Before actually merging I'd make it opt-in rather than opt-out. Precise scanning is enabled in this PR so that the auto-tester actually runs the new code. What's missing is some documentation (some notes on what has to be done to get your manually managed memory scanned precisely), but I'd rather postpone writing it until this PR has a good chance of being merged.
Rebased. Needs dlang/dmd#5566 to build phobos unittests, though.
fix copyRangeRepeated optimization
I would love to see this. Next step is incremental concurrent collection.
It is unlikely that we'll add write barriers, so we can't use classical techniques (e.g. generational) for incremental collection. |
Closing in favor of #1603
This is the precise GC as presented at DConf 2013. It generates pointer bitmaps for each type T using RTInfo!T and stores a bitmap alongside pool memory to use for following only pointer references.
RTInfo generation isn't working properly at the moment (needs dlang/dmd#3958 or dlang/dmd#2480), but using the compiler default will just scan some types conservatively instead of precisely.
Some notes:
I'm opening this as a pull request to verify building on other platforms as well as to make it more publicly visible and open for review/discussion.
I plan to create another version that stores type info per allocation block and uses it during scanning. This can avoid the trouble with arrays, but will not allow reassigning new types to parts of memory blocks (e.g. when using std.emplace on fields of a class).