Support obj_free #21

wks · 2022-08-15T08:06:16Z

We add all types that has non-trivial clean-up code in obj_free as candidates for finalization. MMTk-core will put them into a queue when they die.

This PR mainly focuses on recycling off-heap buffers of large arrays and strings. If those off-heap objects are not recycled in time, it may thrash the whole system before being killed by the operating system. So currently the invocation of finalizers is invoked by ruby_xmalloc. When allocating too much off-heap objects using ruby_xmalloc, it will trigger GC, and execute obj_free on ready-to-finalize objects. This gives us a chance to free the off-heap xmalloc-allocated buffers for dead objects.

Known issues:

We still need to give the Ruby VM a chance to call obj_free when GC is triggered by regular heap allocations, and we want to avoid calling finalizers on the fast path of allocation.
The GC cannot scan objects that already have obj_free called on them, because it frees the underlying data structures (usually off-heap C objects of T_DATA) which dmark attempts to read. Currently, we change the object type to T_NONE after calling obj_free, and skip T_NONE before calling gc_mark_children. When the feature in mmtk-core is ready (see Add an API to clear the valid-object bit (a.k.a. alloc-bit) mmtk-core#648), we will invalidate the objects by clearing its "valid-object bit" so that mmtk-core will not attempt to call Scanning::scan_object on invalid objects.

We register common types that contain underlying xmalloc-ed buffers, such as strings, arrays and hashes, as finalization candidates. When xmalloc allocates enough memory, it will trigger GC using MMTk's API. The GC will identify objects ready for finalization. We will call obj_free on them to free their underlying buffers.

This allows the stack scanner to skip such invalid objects during stack scanning, and prevents scanning such invalid objects during tracing.

chrisseaton · 2022-08-24T12:40:57Z

Is there anything lighter-weight than full finalisers? Like if we want to delete the C array backing an array object, can we not hook into the GC to do that as soon as the object is found to be dead rather than pushing the work onto a queue?

wks · 2022-08-24T14:44:49Z

Is there anything lighter-weight than full finalisers? Like if we want to delete the C array backing an array object, can we not hook into the GC to do that as soon as the object is found to be dead rather than pushing the work onto a queue?

Currently, there is no such a mechanism in mmtk-core.

Yes. It is possible to add a hook to the GC (or more precisely, hook into the FinalizerProcessor) to call obj_free immediately when objects die. However, I don't think it helps much.

Finalizable objects still need to be registered into the FinalizerProcessor. This is because most GC algorithms, other than MarkSweep, do not have a "sweep" phase in which the GC goes through all allocated objects to sweep them (and call obj_free when visiting dead objects). They just discard unused chunks as a whole. SemiSpace copies live objects out, and discards the entire from-space without looking at dead objects. Immix can recycle dead blocks or lines at a time, ignoring objects inside. Generational collectors, like SemiSpace, also discards the nursery without looking at dead objects inside. Therefore, to call a particular function (obj_free) on an object when it dies, the VM must tell mmtk-core which object needs such special care. In most languages, very few objects have finalizers. Ruby is a very big exception in that most Ruby objects are subject to finalization (obj_free).

So a hook can be added at the resurrection stage, i.e. the time when a heap traversal has finished, and the FinalizerProcessor scans registered objects to see if they are still alive. Instead of resurrecting an object and adding it into the ready-to-finalize queue, the FinalizerProcessor can call a hook registered by the VM. In the case of Ruby, it is obj_free. If we do this, those objects do not need to be resurrected at all. GC can just treat them as dead.

But this approach has two problems.

It doesn't reduce the amount of work to be done. obj_free has to be called at some time (whether during GC or after GC) by some thread (GC thread or mutator thread) on the same number of objects subject to finalization. Doing it during GC doesn't make the program call less obj_free. It just moves the job from the mutator to the GC.

If obj_free is called during GC, it increases the GC latency. With dfree provided by the user, the time it takes is unbounded. But GC must work as fast as possible to reduce the pause time. The new LXR algorithm does marking concurrently, and minimizes stop-the-world time, to reduce latency. If we call obj_free during GC (and currently it needs to be called on most Ruby objects), then obj_free may become the biggest source of latency.

Therefore, to solve the root problem, we should reduce the number of finalizable objects. In other words, we should make it unnecessary to call obj_free on as many objects as possible. The next thing I will do will be changing the C buffers in objects so that they are allocated by the GC, instead. And I'll start with T_OBJECT, T_STRING and T_ARRAY.

By the way, Section 12.1 of the GC Handbook mentioned the possibility to run finalizers during GC, and listed some down sides, such as (1) it cannot allocate objects during GC, (2) risk of deadlock, and (3) may impose constraints on GC algorithms. Not all of them may apply to Ruby, but I think the idea of adding such a hook to the MMTk API is at least questionable.

Anyway, I'll raise this in our next group meeting.

chrisseaton · 2022-08-24T18:06:50Z

Ok let's go ahead with this option.

Ruby has so much off-heap memory that is easily freed that it might make sense to think about new ideas for this. The issue I have with pushing work to a queue is that it goes out of cache just to come back into cache. Surely we can do something better for the most common cases. Something to think about.

[Bug #20921] When we create a cache entry for a constant, the following sequence of events could happen: - vm_track_constant_cache is called to insert a constant cache. - In vm_track_constant_cache, we first look up the ST table for the ID of the constant. Assume the ST table exists because another iseq also holds a cache entry for this ID. - We then insert into this ST table with the iseq_inline_constant_cache. - However, while inserting into this ST table, it allocates memory, which could trigger a GC. Assume that it does trigger a GC. - The GC frees the one and only other iseq that holds a cache entry for this ID. - In remove_from_constant_cache, it will appear that the ST table is now empty because there are no more iseq with cache entries for this ID, so we free the ST table. - We complete GC and continue our st_insert. However, this ST table has been freed so we now have a use-after-free. This issue is very hard to reproduce, because it requires that the GC runs at a very specific time. However, we can make it show up by applying this patch which runs GC right before the st_insert to mimic the st_insert triggering a GC: diff --git a/vm_insnhelper.c b/vm_insnhelper.c index 3cb23f0..a93998136a 100644 --- a/vm_insnhelper.c +++ b/vm_insnhelper.c @@ -6338,6 +6338,10 @@ vm_track_constant_cache(ID id, void *ic) rb_id_table_insert(const_cache, id, (VALUE)ics); } + if (id == rb_intern("MyConstant")) rb_gc(); + st_insert(ics, (st_data_t) ic, (st_data_t) Qtrue); } And if we run this script: Object.const_set("MyConstant", "Hello!") my_proc = eval("-> { MyConstant }") my_proc.call my_proc = eval("-> { MyConstant }") my_proc.call We can see that ASAN outputs a use-after-free error: ==36540==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000049528 at pc 0x000102f3ceac bp 0x00016d607a70 sp 0x00016d607a68 READ of size 8 at 0x606000049528 thread T0 #0 0x102f3cea8 in do_hash st.c:321 #1 0x102f3ddd0 in rb_st_insert st.c:1132 #2 0x103140700 in vm_track_constant_cache vm_insnhelper.c:6345 #3 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356 #4 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424 #5 0x1030bc1e0 in vm_exec_core insns.def:263 #6 0x1030b55fc in rb_vm_exec vm.c:2585 #7 0x1030fe0ac in rb_iseq_eval_main vm.c:2851 #8 0x102a82588 in rb_ec_exec_node eval.c:281 #9 0x102a81fe0 in ruby_run_node eval.c:319 #10 0x1027f3db4 in rb_main main.c:43 #11 0x1027f3bd4 in main main.c:68 #12 0x183900270 (<unknown module>) 0x606000049528 is located 8 bytes inside of 56-byte region [0x606000049520,0x606000049558) freed by thread T0 here: #0 0x104174d40 in free+0x98 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54d40) #1 0x102ada89c in rb_gc_impl_free default.c:8183 #2 0x102ada7dc in ruby_sized_xfree gc.c:4507 #3 0x102ac4d34 in ruby_xfree gc.c:4518 #4 0x102f3cb34 in rb_st_free_table st.c:663 #5 0x102bd52d8 in remove_from_constant_cache iseq.c:119 #6 0x102bbe2cc in iseq_clear_ic_references iseq.c:153 #7 0x102bbd2a0 in rb_iseq_free iseq.c:166 #8 0x102b32ed0 in rb_imemo_free imemo.c:564 #9 0x102ac4b44 in rb_gc_obj_free gc.c:1407 #10 0x102af4290 in gc_sweep_plane default.c:3546 #11 0x102af3bdc in gc_sweep_page default.c:3634 #12 0x102aeb140 in gc_sweep_step default.c:3906 #13 0x102aeadf0 in gc_sweep_rest default.c:3978 #14 0x102ae4714 in gc_sweep default.c:4155 #15 0x102af8474 in gc_start default.c:6484 #16 0x102afbe30 in garbage_collect default.c:6363 #17 0x102ad37f0 in rb_gc_impl_start default.c:6816 #18 0x102ad3634 in rb_gc gc.c:3624 #19 0x1031406ec in vm_track_constant_cache vm_insnhelper.c:6342 #20 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356 #21 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424 #22 0x1030bc1e0 in vm_exec_core insns.def:263 #23 0x1030b55fc in rb_vm_exec vm.c:2585 #24 0x1030fe0ac in rb_iseq_eval_main vm.c:2851 #25 0x102a82588 in rb_ec_exec_node eval.c:281 #26 0x102a81fe0 in ruby_run_node eval.c:319 #27 0x1027f3db4 in rb_main main.c:43 #28 0x1027f3bd4 in main main.c:68 #29 0x183900270 (<unknown module>) previously allocated by thread T0 here: #0 0x104174c04 in malloc+0x94 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54c04) #1 0x102ada0ec in rb_gc_impl_malloc default.c:8198 #2 0x102acee44 in ruby_xmalloc gc.c:4438 #3 0x102f3c85c in rb_st_init_table_with_size st.c:571 #4 0x102f3c900 in rb_st_init_table st.c:600 #5 0x102f3c920 in rb_st_init_numtable st.c:608 #6 0x103140698 in vm_track_constant_cache vm_insnhelper.c:6337 #7 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356 #8 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424 #9 0x1030bc1e0 in vm_exec_core insns.def:263 #10 0x1030b55fc in rb_vm_exec vm.c:2585 #11 0x1030fe0ac in rb_iseq_eval_main vm.c:2851 #12 0x102a82588 in rb_ec_exec_node eval.c:281 #13 0x102a81fe0 in ruby_run_node eval.c:319 #14 0x1027f3db4 in rb_main main.c:43 #15 0x1027f3bd4 in main main.c:68 #16 0x183900270 (<unknown module>) This commit fixes this bug by adding a inserting_constant_cache_id field to the VM, which stores the ID that is currently being inserted and, in remove_from_constant_cache, we don't free the ST table for ID equal to this one. Co-Authored-By: Alan Wu <alanwu@ruby-lang.org>

wks added 3 commits August 12, 2022 15:14

Use MMTk's official API to handle finalizers

55f4175

We don't need to expose obj_free to Rust.

d1549d4

wks changed the title ~~Count xmalloc4~~ Support obj_free Aug 15, 2022

wks added 3 commits August 18, 2022 17:59

Add more types as finalizable

a21845d

Don't skip special objects unless before exit

552a7c1

Clear object header after calling obj_free

c86dc56

This allows the stack scanner to skip such invalid objects during stack scanning, and prevents scanning such invalid objects during tracing.

wks marked this pull request as ready for review August 24, 2022 12:02

wks mentioned this pull request Aug 24, 2022

Support obj_free mmtk/mmtk-ruby#8

Merged

wks requested a review from chrisseaton August 24, 2022 12:05

chrisseaton approved these changes Aug 24, 2022

View reviewed changes

wks merged commit b45c3c8 into mmtk:mmtk Aug 25, 2022

wks mentioned this pull request Aug 29, 2022

Take xmalloc into account mmtk/mmtk-ruby#7

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Support obj_free #21

Support obj_free #21

Uh oh!

wks commented Aug 15, 2022 •

edited

Loading

Uh oh!

chrisseaton commented Aug 24, 2022

Uh oh!

wks commented Aug 24, 2022

Uh oh!

chrisseaton commented Aug 24, 2022

Uh oh!

Uh oh!

Support obj_free #21

Support obj_free #21

Uh oh!

Conversation

wks commented Aug 15, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

chrisseaton commented Aug 24, 2022

Uh oh!

wks commented Aug 24, 2022

Uh oh!

chrisseaton commented Aug 24, 2022

Uh oh!

Uh oh!

wks commented Aug 15, 2022 •

edited

Loading