-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow different GC triggers #712
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR tidied up the creation of plans and spaces. This is a good thing, but it also exposed some potential structural problem in mmtk-core, which should be addressed later.
There are some minor problems in the parsing of options that should be fixed now.
{ | ||
// We haven't finished creating the plan. No one is using the GC trigger. We cast the arc into a mutable reference. | ||
// TODO: use Arc::get_mut_unchecked() when it is availble. | ||
let gc_trigger: &mut GCTrigger<VM> = unsafe { &mut *(Arc::as_ptr(&gc_trigger) as *mut _) }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of unsafe
is unfortunate, but I can't find a better solution, either, without changing too many places.
I think the root cause of this cyclic reference is that MMTk was designed in an object-oriented way, where objects hold references to things they depends on. In this case, Space
needs to trigger GC in acquire
, so Space
needs to point to a GCTrigger
which points to Plan
.
It is more idiomatic in Rust to put actors at the centre. In this case, the actor is Mutator. Mutator requests for more memory via alloc
, and alloc
eventually reaches Space::acquire
, and it may trigger GC. Because Mutator has a pointer to an MMTk
instance, it has access to everything. So the plan can be obtained from the mutator instance (or the tls
argument of Space::acquire
).
In the case where it is the GC worker that allocates, it cannot trigger GC anyway, so we don't need those context in that case.
vm_map: self.global_args.vm_map, | ||
mmapper: self.global_args.mmapper, | ||
heap: &mut self.global_args.heap, | ||
constraints: self.constraints, | ||
gc_trigger: self.global_args.gc_trigger.clone(), | ||
scheduler: self.global_args.scheduler.clone(), | ||
options: &self.global_args.options, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that those globally shared resources are referred by deeply embedded sub-structs indicates that they are just "execution contexts" that are required by everything, and should be held by some higher-level structs such as MMTk or Mutator.
But let's tidy this up later. This PR does a good job by making them less messy.
src/util/options.rs
Outdated
if let Some(stripped) = s.strip_prefix("FixedHeapSize") { | ||
// FixedHeapSize:size | ||
if let Some(size_str) = stripped.strip_prefix(':') { | ||
return Self::parse_size(size_str).map(Self::FixedHeapSize); | ||
} | ||
} else if s.starts_with("DynamicHeapSize") { | ||
// DynamicHeapSize:min,max | ||
if let (Some(colon), Some(comma)) = (s.find(':'), s.find(',')) { | ||
let min = Self::parse_size(&s[(colon + 1)..comma])?; | ||
let max = Self::parse_size(&s[(comma + 1)..])?; | ||
if min > max { | ||
return Err(format!( | ||
"Min heap size {:?} is larger than the given max heap size {:?}", | ||
min, max | ||
)); | ||
} | ||
return Ok(Self::DynamicHeapSize(min, max)); | ||
} | ||
} else if s.starts_with("Delegated") { | ||
return Ok(Self::Delegated); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regular expression should make this easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Now I am using regex to deal with the parsing. I was concerned whether it is worth to add the regex crate dependency. Maybe it is worth it.
src/util/options.rs
Outdated
Ok(num * Self::K) | ||
} else if s.ends_with('m') { | ||
Ok(num * Self::M) | ||
} else if s.ends_with('g') { | ||
Ok(num * Self::G) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if num
is within the range of usize
, this may still overflow. For example, "1073741824G" is equal to 2 to the power of 64, and that overflows usize. For 32-bit machines, "4G" already overflows usize
. Use usize::checked_mul
instead.
We can also use u64
or u128
when computing, and convert back to usize
afterwards. This can simplify the parsing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I am parse the number and do multiplication in u64
, and convert to usize at the end.
the heap size cannot grow.
descriptor. Deal with multiply overflow.
binding-refs |
src/util/options.rs
Outdated
let size = if s.ends_with('k') { | ||
num * Self::K | ||
} else if s.ends_with('m') { | ||
num * Self::M | ||
} else if s.ends_with('g') { | ||
num * Self::G | ||
} else if s.ends_with('t') { | ||
num * Self::T |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think we should still use checked_mul
here. Because it is a user-facing interface, it is possible that the user will supply an unreasonably large number and cause an overflow.
The current Rust Reference says overflow will panic in debug mode, but doesn't specify whether it has undefined behaviour or wrapping behaviour. But it is better to be safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
subscribing myself to this PR. LMK if you guys have any question. |
Thanks a lot for offering help. This PR is mostly about the structural changes to MMTk to support different GC triggers. I will have another PR that implements mem balancer. I will tag you in that PR, and it will be great if you can help check and make sure that it is implemented properly. |
This PR reverts a change about where to call `pr.clear_request()` in `Space.acquire()` in #712. The change caused issues like #734. This PR reverts the change. We now clear the reserved pages before GC (the same as before, and the same as Java MMTk), and additionally we inform the GC trigger about pending allocation pages. This closes #734.
This PR reverts a change about where to call `pr.clear_request()` in `Space.acquire()` in mmtk#712. The change caused issues like mmtk#734. This PR reverts the change. We now clear the reserved pages before GC (the same as before, and the same as Java MMTk), and additionally we inform the GC trigger about pending allocation pages. This closes mmtk#734.
This PR adds `GCTrigger` and forwards calls related with GC triggering to `GCTrigger`. `GCTrigger` would allow different implementations, such as fixed heap size, dynamic resizing, and runtime delegation. This PR only implements `FixedHeapSizeTrigger` and a simplified version of `MemBalancerTrigger`. This PR closes mmtk#561, and makes it easier to implement mmtk#641. Changes: * Adds `util::heap::gc_trigger::GCTrigger`. `Plan::poll()` is moved to `GCTrigger`. Any call to `Plan::poll()` is replaced with `GCTrigger::poll()`. * Adds `util::options::GCTriggerSelector` and `gc_trigger` in `Options`. `gc_trigger` defaults to a fixed heap size of half of the physical memory. * Refactoring on arguments for creating plans and spaces. * I needed to pass `GCTrigger` to plans and spaces so I think it would be better that I just take this chance to do the refactoring. * We now have structs to organize the arguments and any plan and space will receive the same set of the arguments for their constructors. This greatly simplifies our code.
This PR reverts a change about where to call `pr.clear_request()` in `Space.acquire()` in mmtk#712. The change caused issues like mmtk#734. This PR reverts the change. We now clear the reserved pages before GC (the same as before, and the same as Java MMTk), and additionally we inform the GC trigger about pending allocation pages. This closes mmtk#734.
This PR adds
GCTrigger
and forwards calls related with GC triggering toGCTrigger
.GCTrigger
would allow different implementations, such as fixed heap size, dynamic resizing, and runtime delegation. This PR only implementsFixedHeapSizeTrigger
and a simplified version ofMemBalancerTrigger
. This PR closes #561, and makes it easier to implement #641.Changes:
util::heap::gc_trigger::GCTrigger
.Plan::poll()
is moved toGCTrigger
. Any call toPlan::poll()
is replaced withGCTrigger::poll()
.util::options::GCTriggerSelector
andgc_trigger
inOptions
.gc_trigger
defaults to a fixed heap size of half of the physical memory.GCTrigger
to plans and spaces so I think it would be better that I just take this chance to do the refactoring.Performance:
There is very little difference for the refactoring in this PR. The following compares this PR with master, both using the same fixed heap size. It shows the refactoring and the changes have no measurable overhead.
Plotty link (not public visible)
https://squirrel.anu.edu.au/plotty/yilin/mmtk/#0|skunk-2022-12-14-Wed-054336&benchmark^build^invocation^iteration&bmtime^time^time.other^time.stw&|10&iteration^1^2|20&1^invocation|30&1&benchmark&build;jdk-mmtk-master|41&Histogram%20(with%20CI)^build^benchmark&