Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement mem balancer #724

Merged
merged 29 commits into from
Jan 13, 2023
Merged

Implement mem balancer #724

merged 29 commits into from
Jan 13, 2023

Conversation

qinsoon
Copy link
Member

@qinsoon qinsoon commented Jan 3, 2023

This PR implements mem balancer (https://dl.acm.org/doi/pdf/10.1145/3563323).

Changes:

  • Introduce a trait GenerationalPlan: Plan, and move methods that are specific to generational plans to the new trait from the old trait Plan.
  • Change the return type of Plan::generational() to return an option of &dyn GenerationalPlan. We can use this to know whether a plan is generational or not. If it is, we can further invoke methods about the generational plan.
  • Add GCTriggerPolicy::on_gc_release() which is called in the Release work packet. A GCTriggerPolicy can use this to get stats before we reclaim memory (e.g. to get promoted size).
  • Add an implementation of mem balancer to replace the current simple resize policy.
  • Add some logging to help debug GC triggering.

@qinsoon qinsoon added the PR-testing Run binding tests for the pull request (deprecated: use PR-extended-testing instead) label Jan 4, 2023
@qinsoon qinsoon marked this pull request as ready for review January 4, 2023 23:21
@qinsoon qinsoon requested a review from wks January 4, 2023 23:21
@qinsoon
Copy link
Member Author

qinsoon commented Jan 4, 2023

@MarisaKirisame This is the PR that implements mem balancer. I was trying to implement based on my understanding of the paper. Let me know if there is any issue. Any comment is welcome. The changes that may be interesting to you is in src/util/heap/gc_trigger.rs.

@k-sareen
Copy link
Collaborator

k-sareen commented Jan 5, 2023

Is it possible to implement the fix for nursery GCs we discussed in #723 in this PR? Or will we have a separate PR for that.

@qinsoon
Copy link
Member Author

qinsoon commented Jan 5, 2023

Is it possible to implement the fix for nursery GCs we discussed in #723 in this PR? Or will we have a separate PR for that.

Yeah, I think that is a different issue.

@@ -139,27 +147,203 @@ pub struct MemBalancerTrigger {
max_heap_pages: usize,
/// The current heap size
current_heap_pages: AtomicUsize,
/// Statistics
stats: Atomic<MemBalancerStats>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Atomic type is used on a complex struct. The underlying implementation uses a global hash map to assign a spin lock for each hash value of address. (See: https://docs.rs/atomic/latest/src/atomic/fallback.rs.html#101 ) This is not really lock-free, and is not efficient, either.

Given that this struct is only used at GC start/release/end when no two threads access that struct concurrently, I suggest using AtomicRefCell and it’s borrow_mut method (you can implement access_stats with them). It panics if two threads attempt to borrow it mutably at the same time, but given the use pattern, this shouldn’t happen unless there is a bug.

GC work packet stats are implemented using AtomicRefCell, too.

Alternatively you can always use Mutex.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I changed that to AtomicRefCell.

@MarisaKirisame
Copy link

it look good. there are tricks here and there (e.g. you could try to add a constant 1MB to all heap size), but no doubt you guys will find them when doing some performance turning.

src/util/options.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@wks wks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I observed an phenomenon that some methods of Plan are only relevant to generational plans. It includes:

  • The old generational() that returns &Gen<Self::VM> without Option
  • is_current_gc_nursery: Only called by the prepare method of generational plans and the newly added GCTrigger.
  • get_mature_physical_pages_available: Only called by Gen
  • Newly added get_mature_reserved_pages: Only called by GCTrigger in generational_mem_stats_on_gc_release.

In Plan, they are implemented as stubs. is_current_gc_nursery returns false; other methods panic if the plan is not generational.

With this PR, the new generational() method returns an Option<&Gen>. I think we can refactor the code and move those generational-specific methods to Gen. Existing call sites should already have access to &Gen instance. If not, the Plan::generational() method can return an Option for the caller to match against. (See my comments on the code below.)

However, I am not sure how to implement get_mature_physical_pages_available because it needs to know which space is the "mature space", and that is per-plan knowledge. Intuitively, there should be another trait, such as

trait GenPlan: Plan {
    fn is_current_gc_nursery(&self) -> bool;
    fn get_mature_physical_pages_available(&self) -> usize;
    fn get_mature_reserved_pages(&self) -> usize;
}

Concrete plans, such as GenCopy and GenImmix shall implement GenPlan in addition to Plan.

I am trying to refactor locally, but haven't managed to get the downcast operation working.

What do you think? If it is too complicated, we can defer this refactoring to another PR.

src/plan/global.rs Outdated Show resolved Hide resolved
src/plan/global.rs Outdated Show resolved Hide resolved
src/util/heap/gc_trigger.rs Outdated Show resolved Hide resolved
src/util/heap/gc_trigger.rs Outdated Show resolved Hide resolved
@@ -108,6 +108,8 @@ impl<C: GCWorkContext> Release<C> {
impl<C: GCWorkContext + 'static> GCWork<C::VM> for Release<C> {
fn do_work(&mut self, worker: &mut GCWorker<C::VM>, mmtk: &'static MMTK<C::VM>) {
trace!("Release Global");
self.plan.base().gc_trigger.policy.on_gc_release(mmtk);
Copy link
Collaborator

@wks wks Jan 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a problem for now. This is added before the VMCollection::vm_release() below, but plan_mut.release happens after VMCollection::vm_release(). It's a bit weird this way, but it doesn't matter if the VM doesn't implement vm_release.

VMCollection::vm_release() was created for VMs to do what our current RefEnqueue does to handle weak references. Existing bindings (except mmtk-openjdk in the lxr branch) do not use it, but with my new weak-ref API, the openjdk binding should use it instead of relying on mmtk-core's RefEnqueue work packet. I am considering moving this <C::VM as VMBinding>::VMCollection::vm_release(); statement to a dedicated work packets, because ref-enqueuing work does not depend on the Plan instance or the release of the Plan, and can therefore be parallelised.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can run some code when we open the Release bucket, that would be a good timing to run this on_gc_release() method.

src/util/heap/gc_trigger.rs Outdated Show resolved Hide resolved
@qinsoon
Copy link
Member Author

qinsoon commented Jan 12, 2023

@wks I have done some refactoring about methods that are in Plan but should be specific to generational plans. See 5aa7023 for the changes. In short, Plan::generational() now returns Option<&dyn GenerationalPlan> so we do not need to do downcast. Let me know if there could be any issue on this change.

Copy link
Collaborator

@wks wks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work.

Two problems remain.

One is the naming of the Gen struct and the gen() method. It is a bit confusing to have both trait GenerationalPlan and struct Gen. Similar is true for generational() and gen(). I think we can rename Gen to CommonGenPlan to indicate that it plays a similar role as CommonPlan, i.e. CommonGenPlan is the common parts of most generational plans, whereas CommonPlan is the common parts of most plans.

The other is the is_nursery_gc function. This function works for both generational plans and non-generational plans. It is used in two places.

  1. Finalization processor. It still uses is_nursery_gc as a hint so that it only needs to look at newly registered finalizable objects since the last GC. It is still helpful.
  2. The ProcessModBuf and ProcessRegionModBuf work packets. Those work packets are only used by generational plans, so they can be sure that the current plan is generational. I think they can skip the map_or part, and panic if the plan is not generational.

src/plan/generational/gc_work.rs Outdated Show resolved Hide resolved
src/plan/generational/global.rs Outdated Show resolved Hide resolved
src/plan/generational/gc_work.rs Outdated Show resolved Hide resolved
src/plan/generational/gc_work.rs Outdated Show resolved Hide resolved
Generational work packets directly use is_current_gc_nursery() from
CommonGenPlan.
Copy link
Collaborator

@wks wks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me now.

@qinsoon qinsoon merged commit 541020b into mmtk:master Jan 13, 2023
wenyuzhao pushed a commit to wenyuzhao/mmtk-core that referenced this pull request Mar 20, 2023
This PR implements mem balancer (https://dl.acm.org/doi/pdf/10.1145/3563323).

Changes:
* Introduce a trait `GenerationalPlan: Plan`, and move methods that are specific to generational plans to the new trait from the old `trait Plan`.
* Change the return type of `Plan::generational()` to return an option of `&dyn GenerationalPlan`. We can use this to know whether a plan is generational or not. If it is, we can further invoke methods about the generational plan.
* Add `GCTriggerPolicy::on_gc_release()` which is called in the `Release` work packet. A `GCTriggerPolicy` can use this to get stats before we reclaim memory (e.g. to get promoted size).
* Add an implementation of mem balancer to replace the current simple resize policy.
* Add some logging to help debug GC triggering.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR-testing Run binding tests for the pull request (deprecated: use PR-extended-testing instead)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants