-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement mem balancer #724
Conversation
the heap size cannot grow.
descriptor. Deal with multiply overflow.
@MarisaKirisame This is the PR that implements mem balancer. I was trying to implement based on my understanding of the paper. Let me know if there is any issue. Any comment is welcome. The changes that may be interesting to you is in |
Is it possible to implement the fix for nursery GCs we discussed in #723 in this PR? Or will we have a separate PR for that. |
Yeah, I think that is a different issue. |
src/util/heap/gc_trigger.rs
Outdated
@@ -139,27 +147,203 @@ pub struct MemBalancerTrigger { | |||
max_heap_pages: usize, | |||
/// The current heap size | |||
current_heap_pages: AtomicUsize, | |||
/// Statistics | |||
stats: Atomic<MemBalancerStats>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Atomic type is used on a complex struct. The underlying implementation uses a global hash map to assign a spin lock for each hash value of address. (See: https://docs.rs/atomic/latest/src/atomic/fallback.rs.html#101 ) This is not really lock-free, and is not efficient, either.
Given that this struct is only used at GC start/release/end when no two threads access that struct concurrently, I suggest using AtomicRefCell and it’s borrow_mut method (you can implement access_stats with them). It panics if two threads attempt to borrow it mutably at the same time, but given the use pattern, this shouldn’t happen unless there is a bug.
GC work packet stats are implemented using AtomicRefCell, too.
Alternatively you can always use Mutex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I changed that to AtomicRefCell
.
it look good. there are tricks here and there (e.g. you could try to add a constant 1MB to all heap size), but no doubt you guys will find them when doing some performance turning. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I observed an phenomenon that some methods of Plan
are only relevant to generational plans. It includes:
- The old
generational()
that returns&Gen<Self::VM>
withoutOption
is_current_gc_nursery
: Only called by theprepare
method of generational plans and the newly addedGCTrigger
.get_mature_physical_pages_available
: Only called byGen
- Newly added
get_mature_reserved_pages
: Only called byGCTrigger
ingenerational_mem_stats_on_gc_release
.
In Plan
, they are implemented as stubs. is_current_gc_nursery
returns false
; other methods panic if the plan is not generational.
With this PR, the new generational()
method returns an Option<&Gen>
. I think we can refactor the code and move those generational-specific methods to Gen
. Existing call sites should already have access to &Gen
instance. If not, the Plan::generational()
method can return an Option
for the caller to match against. (See my comments on the code below.)
However, I am not sure how to implement get_mature_physical_pages_available
because it needs to know which space is the "mature space", and that is per-plan knowledge. Intuitively, there should be another trait, such as
trait GenPlan: Plan {
fn is_current_gc_nursery(&self) -> bool;
fn get_mature_physical_pages_available(&self) -> usize;
fn get_mature_reserved_pages(&self) -> usize;
}
Concrete plans, such as GenCopy
and GenImmix
shall implement GenPlan
in addition to Plan
.
I am trying to refactor locally, but haven't managed to get the downcast
operation working.
What do you think? If it is too complicated, we can defer this refactoring to another PR.
@@ -108,6 +108,8 @@ impl<C: GCWorkContext> Release<C> { | |||
impl<C: GCWorkContext + 'static> GCWork<C::VM> for Release<C> { | |||
fn do_work(&mut self, worker: &mut GCWorker<C::VM>, mmtk: &'static MMTK<C::VM>) { | |||
trace!("Release Global"); | |||
self.plan.base().gc_trigger.policy.on_gc_release(mmtk); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a problem for now. This is added before the VMCollection::vm_release()
below, but plan_mut.release
happens after VMCollection::vm_release()
. It's a bit weird this way, but it doesn't matter if the VM doesn't implement vm_release
.
VMCollection::vm_release()
was created for VMs to do what our current RefEnqueue
does to handle weak references. Existing bindings (except mmtk-openjdk in the lxr branch) do not use it, but with my new weak-ref API, the openjdk binding should use it instead of relying on mmtk-core's RefEnqueue
work packet. I am considering moving this <C::VM as VMBinding>::VMCollection::vm_release();
statement to a dedicated work packets, because ref-enqueuing work does not depend on the Plan
instance or the release of the Plan
, and can therefore be parallelised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can run some code when we open the Release
bucket, that would be a good timing to run this on_gc_release()
method.
@wks I have done some refactoring about methods that are in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work.
Two problems remain.
One is the naming of the Gen
struct and the gen()
method. It is a bit confusing to have both trait GenerationalPlan
and struct Gen
. Similar is true for generational()
and gen()
. I think we can rename Gen
to CommonGenPlan
to indicate that it plays a similar role as CommonPlan
, i.e. CommonGenPlan
is the common parts of most generational plans, whereas CommonPlan
is the common parts of most plans.
The other is the is_nursery_gc
function. This function works for both generational plans and non-generational plans. It is used in two places.
- Finalization processor. It still uses
is_nursery_gc
as a hint so that it only needs to look at newly registered finalizable objects since the last GC. It is still helpful. - The
ProcessModBuf
andProcessRegionModBuf
work packets. Those work packets are only used by generational plans, so they can be sure that the current plan is generational. I think they can skip themap_or
part, and panic if the plan is not generational.
Generational work packets directly use is_current_gc_nursery() from CommonGenPlan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good to me now.
This PR implements mem balancer (https://dl.acm.org/doi/pdf/10.1145/3563323). Changes: * Introduce a trait `GenerationalPlan: Plan`, and move methods that are specific to generational plans to the new trait from the old `trait Plan`. * Change the return type of `Plan::generational()` to return an option of `&dyn GenerationalPlan`. We can use this to know whether a plan is generational or not. If it is, we can further invoke methods about the generational plan. * Add `GCTriggerPolicy::on_gc_release()` which is called in the `Release` work packet. A `GCTriggerPolicy` can use this to get stats before we reclaim memory (e.g. to get promoted size). * Add an implementation of mem balancer to replace the current simple resize policy. * Add some logging to help debug GC triggering.
This PR implements mem balancer (https://dl.acm.org/doi/pdf/10.1145/3563323).
Changes:
GenerationalPlan: Plan
, and move methods that are specific to generational plans to the new trait from the oldtrait Plan
.Plan::generational()
to return an option of&dyn GenerationalPlan
. We can use this to know whether a plan is generational or not. If it is, we can further invoke methods about the generational plan.GCTriggerPolicy::on_gc_release()
which is called in theRelease
work packet. AGCTriggerPolicy
can use this to get stats before we reclaim memory (e.g. to get promoted size).