Decouple reference processor from `ProcessEdgesWork` #604

wks · 2022-05-31T09:52:48Z

TL;DR: The reference processor is tightly coupled with ProcessEdgesWork work packets, making it impossible to support runtimes that can only do object-enqueuing. We can make it general.

Problem

Currently, reference-processing work packets {Soft,Weak,Phantom}RefProcessing take ProcessEdgesWork as a parameter. For example:

pub struct SoftRefProcessing<E: ProcessEdgesWork>(PhantomData<E>);
impl<E: ProcessEdgesWork> GCWork<E::VM> for SoftRefProcessing<E> {
    fn do_work(&mut self, worker: &mut GCWorker<E::VM>, mmtk: &'static MMTK<E::VM>) {
        let mut w = E::new(vec![], false, mmtk);
        w.set_worker(worker);
        mmtk.reference_processors.scan_soft_refs(&mut w, mmtk);
        w.flush();
    }
}

Judging from the usage pattern, it instantiates E, passes it to scan_soft_refs, and calls .flush() on it. It never calls w.do_work, which indicates that ProcessEdgesWork is not a sub-task of SoftRefProcessing. Instead, SoftRefProcessing is just borrowing some functionality from ProcessEdgesWork!

Analysis

It uses the E: ProcessEdgesWork type for three purposes:

1. Use E::trace_object to trace objects. Traced soft/weak/phantom references will become softly/weakly/phantomly reachable.
2. Use E as an object queue. E::trace_object takes a queue parameter (currently in the form of a TransitiveClosure, but will be refactored in Remove TransitiveClosure param from Space::trace_object and Scanning::scan_object #559).
3. Use the object queue to create an object-scanning work packet. w.flush() does this: it creates a ScanObjects work packet to scan the queued objects.

From the analysis, the SoftRefProcessing work packet only has two dependencies:

1. A delegate for calling trace_object in the appropriate space, and
2. The type of the object-scanning work packet to create. (More concretely, the post-scan hook which Immix needs.)

The queue is just a Vec<ObjectReference> (or whatever that wraps it) and can be created locally.

Solution

To move away from ProcessEdgesWork, we just need to parameterise {Soft,Weak,Phantom}RefProcessing with a trait that provides the above two operations, namely trace_object and post_scan_object.

I have a draft for this trait. I call it TracingDelegate.

pub trait TracingDelegate<VM: VMBinding>: 'static + Copy + Send {
    fn trace_object<T: TransitiveClosure>(
        &self,
        trace: &mut T,
        object: ObjectReference,
        worker: &mut GCWorker<VM>,
    ) -> ObjectReference;

    fn may_move_objects() -> bool;

    fn post_scan_object(&self, object: ObjectReference);
}

These are just the three methods provided by PlanTraceObject, minus the KIND parameter, which concrete implementations can be parameterised on instead.

There should be two implementations, one for SFT, and the other for PlanTraceObject, just like there are SFTProcessEdges and PlanProcessEdges.

Then SoftRefProcessing can just call trace_object from that trait. The ScanObjects trait can also be refactored to use that trait.
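A minimal sketch of how this could look, under simplifying assumptions: `ObjectReference` is reduced to a wrapper around `usize`, the queue is a plain `Vec` instead of a `TransitiveClosure`, the delegate takes `&mut self` and no `GCWorker`, and both delegate implementations (`SftTracingDelegate`, `PlanTracingDelegate`) are mocks with hypothetical names that only mark objects. It is not the mmtk-core code; it only illustrates that the packet depends on nothing but the delegate.

```rust
use std::collections::HashSet;

/// Simplified stand-in for MMTk's `ObjectReference` (assumption for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ObjectReference(pub usize);

/// Cut-down delegate: a `Vec` replaces `TransitiveClosure`, and `&mut self`
/// replaces the `&self` + `GCWorker` arguments of the draft trait.
pub trait TracingDelegate {
    fn trace_object(
        &mut self,
        queue: &mut Vec<ObjectReference>,
        object: ObjectReference,
    ) -> ObjectReference;

    fn post_scan_object(&mut self, object: ObjectReference);
}

/// Mock of an SFT-backed delegate: marks objects and has no post-scan work.
#[derive(Default)]
pub struct SftTracingDelegate {
    marked: HashSet<ObjectReference>,
}

impl TracingDelegate for SftTracingDelegate {
    fn trace_object(
        &mut self,
        queue: &mut Vec<ObjectReference>,
        object: ObjectReference,
    ) -> ObjectReference {
        // Enqueue the object only the first time it is reached.
        if self.marked.insert(object) {
            queue.push(object);
        }
        object
    }

    fn post_scan_object(&mut self, _object: ObjectReference) {}
}

/// Mock of a plan-backed delegate whose post-scan hook records each object.
#[derive(Default)]
pub struct PlanTracingDelegate {
    marked: HashSet<ObjectReference>,
    pub post_scanned: Vec<ObjectReference>,
}

impl TracingDelegate for PlanTracingDelegate {
    fn trace_object(
        &mut self,
        queue: &mut Vec<ObjectReference>,
        object: ObjectReference,
    ) -> ObjectReference {
        if self.marked.insert(object) {
            queue.push(object);
        }
        object
    }

    fn post_scan_object(&mut self, object: ObjectReference) {
        self.post_scanned.push(object);
    }
}

/// Reference-processing packet parameterised on the delegate
/// rather than on `ProcessEdgesWork`.
pub struct SoftRefProcessing<D: TracingDelegate> {
    pub delegate: D,
}

impl<D: TracingDelegate> SoftRefProcessing<D> {
    /// Traces each referent and returns the objects that still need scanning.
    pub fn do_work(&mut self, referents: &[ObjectReference]) -> Vec<ObjectReference> {
        let mut queue = Vec::new(); // the queue is created locally
        for &referent in referents {
            self.delegate.trace_object(&mut queue, referent);
        }
        queue
    }
}
```

The same `SoftRefProcessing<D>` works with either delegate; only the post-scan behaviour differs between them.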


qinsoon · 2022-05-31T22:35:22Z

I don't think we need a special type for reference processor. Reference processor could use any type that we use for tracing (currently ProcessEdgesWork, or any type that will supersede ProcessEdgesWork).

TracingDelegate looks quite similar to PlanTraceObject. We could just rename PlanTraceObject to TracingDelegate. Reusing code is also an important part of software engineering.

wks · 2022-06-01T02:38:30Z

> TracingDelegate looks quite similar to PlanTraceObject. We could just rename PlanTraceObject to TracingDelegate. Reusing code is also an important part of software engineering.

TracingDelegate can also be implemented for SFT. For example, SFTTracingDelegate::trace_object could call SFT::sft_trace_object, like SFTProcessEdges does.

TracingDelegate intends to extract "good parts" from SFTProcessEdges and PlanProcessEdges into SFTTracingDelegate and PlanTracingDelegate, and make them reusable. My plan is to reuse TracingDelegate in other work packets as well, such as:

- struct TracingProcessEdges<D: TracingDelegate>: Replacing trait ProcessEdgesWork.
- struct TracingProcessEdges<D: TracingDelegate>: Replacing struct ScanObjects and struct PlanScanObjects. It scans objects like ScanObjects does, but can optionally process their fields (edges), too, if the VM (Ruby) does not support edge enqueuing for some objects.

Neither of them can have subclasses, but plans can customise them by providing different TracingDelegate instances.

However, if we plan to commit to PlanTraceObject and dismiss SFT, we won't need TracingDelegate. Instead, we just embed a plan: &'static P where P: Plan<VM = VM> + PlanTraceObject<VM>, and those work packets can call into the plan directly.
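The plan-embedding alternative can be sketched as follows. This is an illustration under assumptions, not mmtk-core code: `PlanTraceObject` is cut down to two methods, `ToyPlan` is a hypothetical plan that keeps one mark bit per object id below 64, and `leak_plan` is a hypothetical helper that fakes the `'static` lifetime the real plan would have.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Simplified stand-in for MMTk's `ObjectReference` (assumption for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjectReference(pub usize);

/// Cut-down stand-in for `PlanTraceObject`; the real trait has more parameters.
pub trait PlanTraceObject {
    fn trace_object(
        &self,
        queue: &mut Vec<ObjectReference>,
        object: ObjectReference,
    ) -> ObjectReference;

    fn post_scan_object(&self, object: ObjectReference);
}

/// Toy plan with one mark bit per object; only object ids 0..64 are supported.
#[derive(Default)]
pub struct ToyPlan {
    mark_bits: AtomicU64,
    post_scans: AtomicU64,
}

impl ToyPlan {
    /// Number of times the post-scan hook has run.
    pub fn post_scan_count(&self) -> u64 {
        self.post_scans.load(Ordering::Relaxed)
    }
}

impl PlanTraceObject for ToyPlan {
    fn trace_object(
        &self,
        queue: &mut Vec<ObjectReference>,
        object: ObjectReference,
    ) -> ObjectReference {
        let bit = 1u64 << object.0;
        // Set the mark bit; enqueue only if it was previously clear.
        let old = self.mark_bits.fetch_or(bit, Ordering::Relaxed);
        if old & bit == 0 {
            queue.push(object);
        }
        object
    }

    fn post_scan_object(&self, _object: ObjectReference) {
        self.post_scans.fetch_add(1, Ordering::Relaxed);
    }
}

/// Work packet that holds the plan directly; no delegate type is needed.
pub struct ScanObjects<P: PlanTraceObject + 'static> {
    pub plan: &'static P,
    pub objects: Vec<ObjectReference>,
}

impl<P: PlanTraceObject + 'static> ScanObjects<P> {
    /// Runs the post-scan hook for every object and returns how many were processed.
    pub fn do_work(&self) -> usize {
        for &object in &self.objects {
            self.plan.post_scan_object(object);
        }
        self.objects.len()
    }
}

/// Hypothetical helper: leaks a plan so that it has a `'static` lifetime.
pub fn leak_plan() -> &'static ToyPlan {
    Box::leak(Box::new(ToyPlan::default()))
}
```

The trade-off is that every such packet becomes generic over the plan type, which is only acceptable if SFT-based dispatch is no longer needed.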

> I don't think we need a special type for reference processor. Reference processor could use any type that we use for tracing (currently ProcessEdgesWork, or any type that will supersede ProcessEdgesWork).

If the "special type" means SoftRefProcessing (and its weak/phantom counterparts), then we do need it. The logic of reference processing is still a bit different from ordinary edge processing. Edge processing does the following:

1. Make an object queue q
2. Load objref from the edge
3. Call trace_object(&mut q, objref)
4. Store the new objref back to the edge (if moved).
5. Repeat 2-4 until all edges are processed
6. If q is not empty, create an object-scanning work packet with all elements in q
7. Execute the object-scanning work packet, or submit the work packet to the scheduler.
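The edge-processing steps above can be sketched as below. The sketch assumes a toy copying tracer (`CopyingTracer`, a hypothetical name) that "moves" every object by adding 1000 to its address, and models an edge as a mutable slot holding an `ObjectReference`; none of this is the real mmtk-core API.

```rust
use std::collections::HashMap;

/// Simplified stand-in for MMTk's `ObjectReference` (assumption for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ObjectReference(pub usize);

/// Toy copying tracer: maps old addresses to forwarded addresses.
#[derive(Default)]
pub struct CopyingTracer {
    forwarded: HashMap<ObjectReference, ObjectReference>,
}

impl CopyingTracer {
    /// Forwards `object` on first visit (address + 1000) and enqueues the new copy.
    pub fn trace_object(
        &mut self,
        queue: &mut Vec<ObjectReference>,
        object: ObjectReference,
    ) -> ObjectReference {
        if let Some(&new) = self.forwarded.get(&object) {
            return new;
        }
        let new = ObjectReference(object.0 + 1000);
        self.forwarded.insert(object, new);
        queue.push(new);
        new
    }
}

/// Object-scanning packet produced in step 6.
#[derive(Debug, PartialEq)]
pub struct ScanObjects(pub Vec<ObjectReference>);

/// Steps 1-6. The caller performs step 7 by running or scheduling the packet.
pub fn process_edges(
    tracer: &mut CopyingTracer,
    edges: &mut [ObjectReference],
) -> Option<ScanObjects> {
    let mut q = Vec::new(); // step 1: make an object queue
    for slot in edges.iter_mut() {
        // step 5: repeat for every edge
        let objref = *slot; // step 2: load from the edge
        let new = tracer.trace_object(&mut q, objref); // step 3: trace
        if new != objref {
            *slot = new; // step 4: store back if the object moved
        }
    }
    // step 6: only create a packet when there is something to scan
    if q.is_empty() {
        None
    } else {
        Some(ScanObjects(q))
    }
}
```

Processing the same edge twice yields no second packet, because the tracer finds the forwarding entry and enqueues nothing.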

SoftRefProcessing is different in the following aspects:

- Steps 2 and 4 are different, because it accesses SoftReference objects via a special VM API.
- If it is weak reference processing, or if it decides not to retain soft references, it will only do steps 2-4 for reachable Soft/WeakReference objects.

TracingDelegate supports step 3, so the reference processor can still call it.

TracingDelegate indirectly supports step 7. It provides post_scan_object which is currently the only difference between different XxxxxScanObjects work packets. And the code for step 7 (creating a work packet) is trivial (just a few lines of code). If code repetition is a problem, we can still abstract it out in a function.
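To illustrate how little code steps 6 and 7 need once `post_scan_object` comes from the delegate, here is a sketch under assumptions: the heap is a toy `HashMap` from each object to its outgoing references, `MarkingDelegate` is a hypothetical non-moving delegate, and "submitting to the scheduler" is reduced to returning the next packet to the caller.

```rust
use std::collections::{HashMap, HashSet};

/// Simplified stand-in for MMTk's `ObjectReference` (assumption for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ObjectReference(pub usize);

/// Cut-down delegate with a `Vec` queue instead of `TransitiveClosure`.
pub trait TracingDelegate {
    fn trace_object(
        &mut self,
        queue: &mut Vec<ObjectReference>,
        object: ObjectReference,
    ) -> ObjectReference;

    fn post_scan_object(&mut self, object: ObjectReference);
}

/// Hypothetical non-moving delegate that logs post-scan hook calls.
#[derive(Default)]
pub struct MarkingDelegate {
    marked: HashSet<ObjectReference>,
    pub post_scanned: Vec<ObjectReference>,
}

impl TracingDelegate for MarkingDelegate {
    fn trace_object(
        &mut self,
        queue: &mut Vec<ObjectReference>,
        object: ObjectReference,
    ) -> ObjectReference {
        if self.marked.insert(object) {
            queue.push(object);
        }
        object
    }

    fn post_scan_object(&mut self, object: ObjectReference) {
        self.post_scanned.push(object);
    }
}

/// Toy heap: each object maps to its outgoing references.
pub type Heap = HashMap<ObjectReference, Vec<ObjectReference>>;

pub struct ScanObjects {
    pub objects: Vec<ObjectReference>,
}

/// Step 6: create a packet only when the queue is non-empty.
pub fn make_scan_packet(q: Vec<ObjectReference>) -> Option<ScanObjects> {
    if q.is_empty() {
        None
    } else {
        Some(ScanObjects { objects: q })
    }
}

impl ScanObjects {
    /// Step 7: scan each object, trace its children, then call the post-scan hook.
    /// Returns the follow-up packet, if any children were newly reached.
    pub fn do_work<D: TracingDelegate>(self, heap: &Heap, delegate: &mut D) -> Option<ScanObjects> {
        let mut next = Vec::new();
        for object in self.objects {
            for &child in heap.get(&object).into_iter().flatten() {
                delegate.trace_object(&mut next, child);
            }
            delegate.post_scan_object(object);
        }
        make_scan_packet(next)
    }
}
```

Because the packet is generic over the delegate, a single `ScanObjects` type could serve plans whose only difference is the post-scan hook.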

Here is an in-progress implementation of TracingProcessEdges::gc_work, which does what ProcessEdgesWork::gc_work does, using TracingDelegate: https://github.com/wks/mmtk-core/blob/8217c09480451e4ed7b43a8cea3b2aead8e1913e/src/scheduler/gc_work.rs#L872

wks · 2023-03-08T08:39:41Z

We implemented object enqueuing by wrapping ProcessEdgesWork. #628

Ruby now uses the new VM-specific weak reference processing API (#700), which makes changes to the built-in reference processors unnecessary. In fact, we should replace the built-in reference and finalization processors with OpenJDK-specific and JikesRVM-specific implementations.

I am closing this issue. Further discussion about migrating to the new weak reference processing API is happening in #694.

wks mentioned this issue Jun 22, 2022

No ProcessEdgesWork in API functions #611

Merged

wks mentioned this issue Dec 12, 2022

Expose trace_object to the VM binding #710

Open

wks closed this as completed Mar 8, 2023
