Expose `trace_object` to the VM binding #710
Comments
PR #700 has been merged and we now have a general language-independent weak reference processing mechanism. It gives the binding a

The problem with global roots scanning remains unsolved. We cannot call

One solution is that we allow the VM binding to create work packets that can use
The simplest solution is just adding a method

Example:

impl Scanning for Ruby {
    fn scan_vm_specific_roots(factory: impl RootsWorkFactory) {
        let context = factory.get_object_tracer_context(); // Don't use it now. Objects are not pinned yet.
        let packet = ProcessRubyVMRoots { context }; // Add this "context" to the work packet
        add_work_packet(packet, WorkBucketStage::Closure);
    }
}

struct ProcessRubyVMRoots { context: Box<dyn ObjectTracerContext> }

impl GCWork for ProcessRubyVMRoots {
    fn do_work(&mut self, worker: &GCWorker, mmtk: &MMTK) {
        self.context.with_tracer(worker, |tracer| { // so that we can use `trace_object` when executing this packet.
            let ruby_vm = ...;
            // NATIVE CODE: The following are implemented in equivalent C code which calls back to Rust for accessing `trace_object`
            *ruby_vm.field1 = tracer.trace_object(ruby_vm.field1); // The compiler cannot inline methods of Box<dyn...>
            *ruby_vm.field2 = tracer.trace_object(ruby_vm.field2); // So we may switch to `impl ...` if possible.
            *ruby_vm.field3 = tracer.trace_object(ruby_vm.field3);
            // ...
            // END of NATIVE CODE
        });
    }
}
I tried to expose The real issue is that
Theoretically we already have all of this in the roots scanning work packet, but they're not exposed to the VM. Perhaps an idea could be to have a global

fn trace_object<VM: VMBinding>(worker: &mut GCWorker<VM>, object: ObjectReference) -> ObjectReference {
    //
}

Or even as a function of a

impl<VM: VMBinding> GCWorker<VM> {
    [...]
    fn trace_object(&mut self, object: ObjectReference) -> ObjectReference {
        //
    }
    [...]
}

so that a VM can arbitrarily call
This may be a problem because we can't call
Exactly. I have a solution in mind, and it is described in detail in #1137. The basic idea is, it does not expose
TL;DR: We need to expose the `trace_object` function to the VM binding for global roots scanning (Ruby needs it) and weak reference processing (VM-side ref processing needs it). However, `trace_object` depends on a queue (to enqueue objects) and a `GCWorker` instance (to copy objects). Care needs to be taken so that VM bindings can share it with multiple threads (or work packets). We also need to make sure the proposed new interface is general enough to support different kinds of GCs, including concurrent GC and reference counting.

Why does the VM binding need `trace_object`?

Ruby, copying GC and global roots
We discussed before that Ruby scans objects by providing C functions that enumerate and update fields. See: #581
And it is similar for global roots. Ruby has two functions: `gc_mark_roots` and `gc_update_references`.

`gc_mark_roots` marks global roots. It calls `rb_gc_root(var)` on each root variable.

`gc_update_references` updates the fields.

Currently, we hijack `rb_gc_mark` to record the values of root variables, and present them to MMTk core with `RootsWorkFactory::create_process_node_roots_work`. MMTk core receives a list of objects so that it can pin them, and it is never necessary to update root variables because they are pinned.

However, to support copying GC in Ruby, roots need to be updated, too (unless we are willing to pin all global roots for the ease of implementation). Because of the `var = rb_gc_location(var)` idiom in Ruby, the easiest way to support updating roots is replacing `rb_gc_location` with `trace_object`. This is impossible with the current `RootsWorkFactory` API because it only has two methods:

`create_process_edge_roots_work` gives MMTk core a list of `Edge`, and `Edge` is usually a pointer to a root variable.

`create_process_node_roots_work` gives MMTk core a list of `ObjectReference`, and it inevitably pins all the objects because it cannot update the roots.

VM-side weak reference processing
Different VMs implement weak references, weak tables, finalisers and ephemerons differently. They have different layout and semantics. The most general way to support different VMs is to provide some kind of primitive, and let the VM binding scan and update weak references.
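As a rough illustration of what such a primitive could let a binding do, here is a minimal, self-contained sketch. Everything in it is a stand-in invented for the example (`ObjRef`, `ToyTracer`, `process_weak_table`, `process_finalizable`); none of it is mmtk-core's API. The toy tracer models a copying GC with a forwarding table, and the VM-side functions decide what "weak" means for their own data structures.

```rust
use std::collections::HashMap;

/// Stand-in for an object reference (hypothetical, not mmtk-core's `ObjectReference`).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ObjRef(usize);

/// Toy copying tracer: `forwarded` maps old addresses of reached objects to new ones.
struct ToyTracer {
    forwarded: HashMap<ObjRef, ObjRef>,
    next_free: usize,
}

impl ToyTracer {
    /// Has the object already been reached by the transitive closure?
    fn is_reachable(&self, object: ObjRef) -> bool {
        self.forwarded.contains_key(&object)
    }

    /// Keep the object alive (copying it if needed) and return its new address.
    fn trace_object(&mut self, object: ObjRef) -> ObjRef {
        if let Some(new) = self.forwarded.get(&object) {
            return *new;
        }
        let new = ObjRef(self.next_free);
        self.next_free += 1;
        self.forwarded.insert(object, new);
        new
    }
}

/// VM-specific weak table: entries with dead keys are dropped, live keys are updated.
fn process_weak_table(tracer: &mut ToyTracer, table: &mut Vec<(ObjRef, String)>) {
    table.retain(|(key, _)| tracer.is_reachable(*key));
    for (key, _) in table.iter_mut() {
        *key = tracer.trace_object(*key);
    }
}

/// VM-specific finalisation: unreachable candidates are resurrected via `trace_object`
/// and returned as "ready to finalise"; reachable ones stay in the candidate list.
fn process_finalizable(tracer: &mut ToyTracer, candidates: &mut Vec<ObjRef>) -> Vec<ObjRef> {
    let mut ready = Vec::new();
    candidates.retain_mut(|obj| {
        let was_reachable = tracer.is_reachable(*obj);
        *obj = tracer.trace_object(*obj);
        if !was_reachable {
            ready.push(*obj);
        }
        was_reachable
    });
    ready
}
```

The point of the sketch is that the collector only supplies the reachability query and `trace_object`, while table layout and semantics stay entirely on the VM side.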
In #700, I designed a new API that gives the VM binding temporary access to the `trace_object` function.

The access is "temporary" because MMTk core prepares a `ProcessEdgesWork`, and wraps it into a `ProcessWeakRefsTracer` that borrows the `ProcessEdgesWork`.
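The borrowing relationship described here can be sketched with toy types. This is only an illustration under assumed names (`ToyEdgesWork`, `BorrowingTracer` and `with_temporary_tracer` are hypothetical and are not the types from #700): the tracer holds a mutable borrow of the work object, so the VM callback can use it only while the call is in progress.

```rust
/// Stand-in for an object reference (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ObjRef(usize);

/// Stand-in for `ProcessEdgesWork`: it owns the queue of newly traced objects.
struct ToyEdgesWork {
    enqueued: Vec<ObjRef>,
}

impl ToyEdgesWork {
    fn trace_object(&mut self, object: ObjRef) -> ObjRef {
        self.enqueued.push(object);
        object // non-moving toy: the address does not change
    }
}

/// Stand-in for `ProcessWeakRefsTracer`: it only borrows the work object.
struct BorrowingTracer<'a> {
    work: &'a mut ToyEdgesWork,
}

impl<'a> BorrowingTracer<'a> {
    fn trace_object(&mut self, object: ObjRef) -> ObjRef {
        self.work.trace_object(object)
    }
}

/// "Core" side: create the work object, lend a tracer to the VM callback,
/// then take the queue back once the callback has returned.
fn with_temporary_tracer<F>(vm_callback: F) -> Vec<ObjRef>
where
    F: FnOnce(&mut BorrowingTracer<'_>),
{
    let mut work = ToyEdgesWork { enqueued: Vec::new() };
    {
        let mut tracer = BorrowingTracer { work: &mut work };
        vm_callback(&mut tracer);
    } // the borrow ends here; the tracer cannot outlive this scope
    work.enqueued
}
```

Because `BorrowingTracer<'a>` carries a non-`'static` lifetime, the callback cannot stash it in a boxed work packet or move it to another thread; the compiler rejects any attempt to let it escape the callback.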
From my experiment, this API is able to implement JikesRVM-style reference processing as implemented in mmtk-core, and it is able to support Ruby by updating weak tables of `obj_free` candidates, finalisable objects, as well as hash tables that map object addresses to GenIVTbl, ID and other things. This means even temporary access to `trace_object` is enough for VM bindings to handle weak references.

Why is temporary access to `trace_object` not enough?

If `ProcessWeakRefsTracer` borrows a `ProcessEdgesWork`, then the `Collection::process_weak_refs(tls, context, tracer)` function will only be able to use the `tracer` instance in its function scope. It forbids, for example, creating more work packets and calling `ProcessWeakRefsTracer::trace_object` from other work packets because it will violate the borrowing rules.

The same is true for root scanning. If `scan_vm_specific_roots` has temporary access to `trace_object`, it will not be able to spawn more work packets and scan roots in parallel. That is why `RootsWorkFactory` requires the `Clone` trait: VM bindings can `clone()` the `RootsWorkFactory` and use it in multiple work packets.

The approach of the lxr branch
In the `lxr` branch, the `Collection` trait has the following methods:

Note that those functions expose the `E: ProcessEdgesWork` type to the VM binding.

The VM binding is able to spawn multiple work packets to process "discovered lists" in parallel.
Note that the `process_lists` method also has the `<E: ProcessEdgesWork>` type parameter. As a result, the `ProcessDiscoveredList<E: ProcessEdgesWork<VM = OpenJDK>>` is specialised to the `E` type, too. Then it can gain access to `trace_object` by instantiating `E`:

While exposing `E` to the VM binding works, I think it is inelegant. As discussed in #604, the weak reference processor is not really using the whole `ProcessEdgesWork`. The `vec![]` above is assigned to the edges list, and it remains empty because it is never used as "processing edges". The weak reference processor is actually using the `trace_object` part, and its ability to create `ScanObjects` work packets (using `trace.flush()`) so that it can expand the transitive closure.

Proposed API
I am currently thinking about designing a trait that encapsulates just that.
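For illustration only, and explicitly not the trait definition from the original post: a minimal sketch of what a trait limited to `trace_object` plus `flush` might look like, with toy types standing in for the collector. `SketchTracer`, `ToyTracer`, `ObjRef` and `trace_roots_in_parallel` are hypothetical names, threads stand in for work packets, and "copying" is faked by adding 1000 to the address.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Stand-in for an object reference (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ObjRef(usize);

/// Hypothetical trait: only the two abilities the weak reference processor uses.
trait SketchTracer {
    /// Trace an object and return its (possibly new) address.
    fn trace_object(&mut self, object: ObjRef) -> ObjRef;
    /// Hand locally buffered objects over so the transitive closure can expand.
    fn flush(&mut self);
}

/// Toy implementation: a local buffer plus a handle to a shared queue
/// that stands in for the scan-objects work packets.
struct ToyTracer {
    local: Vec<ObjRef>,
    shared_queue: Arc<Mutex<Vec<ObjRef>>>,
}

impl ToyTracer {
    fn new(shared_queue: Arc<Mutex<Vec<ObjRef>>>) -> Self {
        ToyTracer { local: Vec::new(), shared_queue }
    }
}

impl SketchTracer for ToyTracer {
    fn trace_object(&mut self, object: ObjRef) -> ObjRef {
        let new = ObjRef(object.0 + 1000); // pretend the object was copied
        self.local.push(new);
        new
    }

    fn flush(&mut self) {
        self.shared_queue.lock().unwrap().append(&mut self.local);
    }
}

/// Each "work packet" (a thread here) owns its own tracer, so nothing is
/// borrowed across packets and the chunks can be traced in parallel.
fn trace_roots_in_parallel(root_chunks: Vec<Vec<ObjRef>>) -> Vec<ObjRef> {
    let queue: Arc<Mutex<Vec<ObjRef>>> = Arc::new(Mutex::new(Vec::new()));
    let handles: Vec<_> = root_chunks
        .into_iter()
        .map(|chunk| {
            let mut tracer = ToyTracer::new(Arc::clone(&queue));
            thread::spawn(move || {
                for root in chunk {
                    tracer.trace_object(root);
                }
                tracer.flush();
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let mut all: Vec<ObjRef> = queue.lock().unwrap().clone();
    all.sort_by_key(|o| o.0); // thread completion order is not deterministic
    all
}
```

The design choice the sketch tries to show is ownership: because each tracer instance is created per packet and owns its buffer, it can be moved into another thread, which a tracer that merely borrows a work object cannot.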
One design goal is to make it compatible with `ProcessEdgesWork` so we can implement it now without much refactoring to mmtk-core. It can be implemented by wrapping a `ProcessEdgesWork` inside. Like `ProcessEdgesWork`, it has a `new` method to create new instances.

In #700, I mentioned that exposing `set_worker` and `flush` to the VM binding may be inelegant, as it complicates the API. But on second thought, I think we can't avoid associating a `GCWorker` with the `Tracer` object because `trace_object` will access `GCWorker` (more precisely, the CopyContext local to the worker).

Example

`Collection::process_weak_refs` will provide a type parameter instead of an `impl ProcessWeakRefsTracer`.

Refactoring `ProcessEdgesWork`
A more ambitious goal is to refactor `ProcessEdgesWork` itself and split it into two parts:

`tracer` as shown above.

That way, a `ProcessEdgesWork` can be implemented as iterating through the edge list and feeding edges into the tracer. Then we don't need to pass `ProcessEdgesWork` everywhere, and can use it only internally.

What about reference counting?
The `Tracer` trait shown above is provided to `scan_vm_specific_roots` and `process_weak_refs`, which should be part of tracing GCs instead of reference counting GCs. But deferred reference counting, when scanning stacks, may only apply DECs and INCs. We need to be careful about what the expectation is (i.e. what the VM binding's obligation is) when mmtk-core calls `Collection::scan_stack_roots`. We need to discuss this.