-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8236926: Concurrently uncommit memory in G1 #1141
Conversation
👋 Welcome back sjohanss! A progress list of the required criteria for merging this PR into |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR description provides a very good summary. Furthermore, it would be nice to have an overview of the algorithm: how the additional concurrency managed, what resources are protected by the lock, etc. Ideally, this should be in the comments, making references to corresponding classes/variables.
I've also done a performance run and as expected there are not significant changes.
How about pauses? One of the motivations of this PR is to remove pauses introduced by synchronous uncommit, right?
@@ -2659,6 +2663,12 @@ void G1CollectedHeap::gc_epilogue(bool full) { | |||
_collection_pause_end = Ticks::now(); | |||
} | |||
|
|||
void G1CollectedHeap::uncommit_heap_if_necessary() { | |||
if (hrm()->has_inactive_regions()) { | |||
G1UncommitRegionTask::activate(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since activate
is already used for regions, I would suggest another word; simple run
is enough. Additionally, I don't think using an enum is necessary, a plain bool is_running
should do.
enum class TaskState { active, inactive };
TaskState _state;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, for a short while I had the state running
as well, but now just having a simple bool is enough now. I still call the state _active
since I think that is more accurate. Also change the name to run()
.
@@ -182,13 +174,30 @@ HeapRegion* HeapRegionManager::new_heap_region(uint hrm_index) { | |||
return g1h->new_heap_region(hrm_index, mr); | |||
} | |||
|
|||
void HeapRegionManager::expand(uint start, uint num_regions, WorkGang* pretouch_gang) { | |||
guarantee(num_regions > 0, "No point in calling this for zero regions"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same guarantee
is already in commit_regions
; I think an assert is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed it, as you say we check it the first thing we do in commit_regions
.
_allocated_heapregions_length = MAX2(_allocated_heapregions_length, i + 1); | ||
} | ||
|
||
if (G1CollectedHeap::heap()->hr_printer()->is_active()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This if
is not needed, right? commit(hr)
does such check already. There are a few other cases of the same kind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit unfortunate, but this is_active()
is checking if the G1HRPrinter
is active and should print stuff. So it is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Misunderstood Alberts comment here. He is correct that the check inside commit(hr)
is enough, updated this and a few other similar cases.
guarantee(num_regions >= 1, "Need to specify at least one region to uncommit, tried to uncommit zero regions at %u", start); | ||
guarantee(length() >= num_regions, "pre-condition"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not really sure why guarantee
here but assert
in reactivate_regions
. I don't see a strong reason to use guarantee
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to assert, the reason for them being different is that the code in deactivate_regions
is old and just moved into this function. Also changed condition to > 0 like we have most other places.
void HeapRegionManager::reactivate_regions(uint start, uint num_regions) { | ||
assert(num_regions > 0, "No point in calling this for zero regions"); | ||
|
||
clear_auxiliary_data_structures(start, num_regions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, this name is too general. Even after reading its body and the associated comments, I don't get what "data structures" are cleared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me the comments and implementation is pretty descriptive, but I did update the comment for signal_mapping_changed
a bit to make it more explicit. Also added a few lines to explain what will be cleared by each mapper.
I agree that this is not a perfect name but I think the naming is in line with how we refer to these structures elsewhere in the code.
@@ -107,6 +90,9 @@ class HeapRegionManager: public CHeapObj<mtGC> { | |||
// Pass down commit calls to the VirtualSpace. | |||
void commit_regions(uint index, size_t num_regions = 1, WorkGang* pretouch_gang = NULL); | |||
|
|||
// Initialize the HeapRegions in the range and put them on the free list. | |||
void initialize_regions(uint start, uint num_regions); | |||
|
|||
// Notify other data structures about change in the heap layout. | |||
void update_committed_space(HeapWord* old_end, HeapWord* new_end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update_committed_space
is not implemented, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, filed JDK-8256323 for this.
|
||
// The number of regions committed in the heap. | ||
uint _num_committed; | ||
// Map to keep track of which regions are in use. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name of the variable suggests it's tracking what regions are committed and that's all. After reading G1CommittedRegionMap
, I believe the class name and the var name are quite misleading; it tracks the state of mem backing up all regions, committed+mapped
, committed
, to_be_uncommitted
, committed
. I don't have any good alternatives, but the comments could surely be expanded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming is hard and I agree this isn't perfect but the union of the two bitmaps in G1CommittedRegionMap
do track what is committed, so it's not completely wrong.
I did update the comment here a bit.
void activate_regions(uint index, uint num_regions = 1); | ||
void deactivate_regions(uint start, size_t num_regions); | ||
void reactivate_regions(uint start, uint num_regions); | ||
void uncommit_regions(uint start, uint num_regions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why uint
vs size_t
? Why = 1
for some but not others? If such inconsistency is intentional, it should be documented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for size_t
for deactivate_regions
is that we have a size_t in shrink_at
which calls it. But for consistency here I will change it to uint
and add a cast in shrink_at
.
For the default values, on expand it is historic but can now be removed. Good catch, and for inactive I don't have any good excuse =)
assert(_state == TaskState::active, "Must be active"); | ||
|
||
// Each execution is limited to uncommit at most 256M worth of regions. | ||
static const uint region_limit = (uint) (256 * M / G1HeapRegionSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 256M? Better include some motivation in the comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
256M is just a "reasonable" limit that I picked to get short enough invocations. I updated the comment a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One suggestion I have is to make this a static constant in the class declaration not dig it up somewhere in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the 256 * M out to be a class constant, leaving the region limit here since it is not known compile time.
void active_set_range(uint start, uint end); | ||
void active_clear_range(uint start, uint end); | ||
void inactive_set_range(uint start, uint end); | ||
void inactive_clear_range(uint start, uint end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reading their implementation, I see that Uncommit_lock
must be held on call them. I think it's best to mention this precondition in the comments, and explain why this lock is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the comments a bit and referred to guarantee_mt_safty_* for more details.
@kstefanj this pull request can not be integrated into git checkout 8236926-ccu
git fetch https://git.openjdk.java.net/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push |
Thanks @albertnetymk for your comments, this update does not include any updates for those. This instead fixes a race in the low level committing code for G1 where it was assumed only one thread can commit and uncommit regions at a time. This is no longer the case after this change. It is true for a single region, but adjacent regions are allowed to be committed/uncommitted in parallel. This can for example happen if a humongous allocation happens during concurrent uncommit. The fix is to use parallel versions of the bitmap operations and for the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments Albert.
@@ -2659,6 +2663,12 @@ void G1CollectedHeap::gc_epilogue(bool full) { | |||
_collection_pause_end = Ticks::now(); | |||
} | |||
|
|||
void G1CollectedHeap::uncommit_heap_if_necessary() { | |||
if (hrm()->has_inactive_regions()) { | |||
G1UncommitRegionTask::activate(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, for a short while I had the state running
as well, but now just having a simple bool is enough now. I still call the state _active
since I think that is more accurate. Also change the name to run()
.
void active_set_range(uint start, uint end); | ||
void active_clear_range(uint start, uint end); | ||
void inactive_set_range(uint start, uint end); | ||
void inactive_clear_range(uint start, uint end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the comments a bit and referred to guarantee_mt_safty_* for more details.
assert(_state == TaskState::active, "Must be active"); | ||
|
||
// Each execution is limited to uncommit at most 256M worth of regions. | ||
static const uint region_limit = (uint) (256 * M / G1HeapRegionSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
256M is just a "reasonable" limit that I picked to get short enough invocations. I updated the comment a bit.
@@ -182,13 +174,30 @@ HeapRegion* HeapRegionManager::new_heap_region(uint hrm_index) { | |||
return g1h->new_heap_region(hrm_index, mr); | |||
} | |||
|
|||
void HeapRegionManager::expand(uint start, uint num_regions, WorkGang* pretouch_gang) { | |||
guarantee(num_regions > 0, "No point in calling this for zero regions"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed it, as you say we check it the first thing we do in commit_regions
.
_allocated_heapregions_length = MAX2(_allocated_heapregions_length, i + 1); | ||
} | ||
|
||
if (G1CollectedHeap::heap()->hr_printer()->is_active()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit unfortunate, but this is_active()
is checking if the G1HRPrinter
is active and should print stuff. So it is needed.
guarantee(num_regions >= 1, "Need to specify at least one region to uncommit, tried to uncommit zero regions at %u", start); | ||
guarantee(length() >= num_regions, "pre-condition"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to assert, the reason for them being different is that the code in deactivate_regions
is old and just moved into this function. Also changed condition to > 0 like we have most other places.
HeapRegionRange range = _committed_map.next_inactive_range(offset); | ||
// No more regions available for uncommit | ||
if (range.length() == 0) { | ||
return uncommitted; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uncommitted
can be 0 here so we can't add that assert. There is a chance that between we add the uncommit-task (which calls this function) and that we grab the lock, someone else might have used the inactive
regions to expand the heap again. Update the comment a bit, I hope it makes it easier to follow.
|
||
// The number of regions committed in the heap. | ||
uint _num_committed; | ||
// Map to keep track of which regions are in use. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming is hard and I agree this isn't perfect but the union of the two bitmaps in G1CommittedRegionMap
do track what is committed, so it's not completely wrong.
I did update the comment here a bit.
@@ -107,6 +90,9 @@ class HeapRegionManager: public CHeapObj<mtGC> { | |||
// Pass down commit calls to the VirtualSpace. | |||
void commit_regions(uint index, size_t num_regions = 1, WorkGang* pretouch_gang = NULL); | |||
|
|||
// Initialize the HeapRegions in the range and put them on the free list. | |||
void initialize_regions(uint start, uint num_regions); | |||
|
|||
// Notify other data structures about change in the heap layout. | |||
void update_committed_space(HeapWord* old_end, HeapWord* new_end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, filed JDK-8256323 for this.
void activate_regions(uint index, uint num_regions = 1); | ||
void deactivate_regions(uint start, size_t num_regions); | ||
void reactivate_regions(uint start, uint num_regions); | ||
void uncommit_regions(uint start, uint num_regions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for size_t
for deactivate_regions
is that we have a size_t in shrink_at
which calls it. But for consistency here I will change it to uint
and add a cast in shrink_at
.
For the default values, on expand it is historic but can now be removed. Good catch, and for inactive I don't have any good excuse =)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the revision.
HeapRegionRange next_inactive_range(uint offset) const; | ||
// Finds the next range of committable regions starting at offset. | ||
// This function must only be called when no inactive regions are | ||
// present and can be used to active more regions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/active/activate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@@ -563,6 +563,11 @@ class G1CollectedHeap : public CollectedHeap { | |||
|
|||
void resize_heap_if_necessary(); | |||
|
|||
// Check if there is memory to uncommit and if so schedule a task to do it. | |||
void uncommit_heap_if_necessary(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer if the method were called uncommit_regions_if_necessary()
as this method does not uncommit the heap but just uncomittable regions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, I agree that is more accurate.
@@ -563,6 +563,11 @@ class G1CollectedHeap : public CollectedHeap { | |||
|
|||
void resize_heap_if_necessary(); | |||
|
|||
// Check if there is memory to uncommit and if so schedule a task to do it. | |||
void uncommit_heap_if_necessary(); | |||
uint uncommit_regions(uint region_limit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment like "// Immediately uncommits uncommittable regions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
log_debug(gc, ergo, heap)("Attempt heap shrinking (archive regions). Total size: " SIZE_FORMAT "B", | ||
HeapRegion::GrainWords * HeapWordSize * shrink_count); | ||
// Explicit uncommit. | ||
_hrm.uncommit_inactive_regions((uint) shrink_count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please let the code use the G1CollectedHeap::uncommit_regions()
helper here to limit the references to _hrm
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, fixed.
@@ -743,7 +744,7 @@ void G1CollectedHeap::dealloc_archive_regions(MemRegion* ranges, size_t count) { | |||
HeapWord* prev_last_addr = NULL; | |||
HeapRegion* prev_last_region = NULL; | |||
size_t size_used = 0; | |||
size_t uncommitted_regions = 0; | |||
size_t shrink_count = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code may as well define shrink_count
as uint
as the only use seems to cast it to uint
anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I left it size_t
since everything else in this function uses size_t
, but uint
is a better fit.
virtual void commit_regions(uint start_idx, size_t num_regions, WorkGang* pretouch_gang) { | ||
guarantee(uncommitted_range(start_idx, num_regions), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this (and the one in uncommit_regions
) should be guarantees.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went with guarantee
since this is what's used in G1PageBasedVirtualSpace
for similar checks. Errors like this might be more likely in release builds.
// those resources are in sync: | ||
// - G1RegionToSpaceMapper::_region_commit_map; | ||
// - G1PageBasedVirtualSpace::_committed (_storage.commit()) | ||
Mutex _lock; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good comment! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
@@ -133,6 +133,9 @@ class G1ServiceThread: public ConcurrentGCThread { | |||
// Register a task with the service thread and schedule it. If | |||
// no delay is specified the task is scheduled to run directly. | |||
void register_task(G1ServiceTask* task, jlong delay = 0); | |||
// Notify a change to the service thread. Used to stop either | |||
// stop the service or to force check for new tasks. | |||
void notify(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first "stop" is too much
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch.
uint length() const { return _end - _start; } | ||
}; | ||
|
||
class G1CommittedRegionMap : public CHeapObj<mtGC> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have a diagram of the region states and their transitions here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
} | ||
G1CollectedHeap::heap()->hr_printer()->commit(hr); | ||
} | ||
activate_regions(start, num_regions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this place there will be two messages from the HRPrinter for every region:
- a COMMIT message and
- an ACTIVATE message
This is a bit confusing as in my understanding (that's why I asked for a region state diagram in the G1CommittedRegionMap
which may as well be put in HeapRegionManager
) the (typical) flow of states are Uncommitted->Committed/Active->Committed/Inactive->Uncommitted.
As mentioned, I'm not sure if it is a good idea to send two separate messages here; better rename the "Active" message to "Commit-Active" (and "Inactive" to "Commit-Inactive") instead imho, even if it's quite long (and drop HRPrinter::commit()
completely)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's true that it will generate two messages when committing a previously uncommitted region, but I still think it is valuable to separate them since we can also have the state change Active->Inactive->Active
. In this case the transition from Inactive to Active will not include a commit, but rather making inactive regions active again. Just seeing a "Commit-Active" message in this case would not be as clear as seeing "Active" that is not immediately preceded by "Commit".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I looked at the G1HRPrinter
again, and indeed it prints the action, not the region state, so this is a fair point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review Thomas. Addressed most of your concerns, but some things I left as is.
@@ -743,7 +744,7 @@ void G1CollectedHeap::dealloc_archive_regions(MemRegion* ranges, size_t count) { | |||
HeapWord* prev_last_addr = NULL; | |||
HeapRegion* prev_last_region = NULL; | |||
size_t size_used = 0; | |||
size_t uncommitted_regions = 0; | |||
size_t shrink_count = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I left it size_t
since everything else in this function uses size_t
, but uint
is a better fit.
log_debug(gc, ergo, heap)("Attempt heap shrinking (archive regions). Total size: " SIZE_FORMAT "B", | ||
HeapRegion::GrainWords * HeapWordSize * shrink_count); | ||
// Explicit uncommit. | ||
_hrm.uncommit_inactive_regions((uint) shrink_count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, fixed.
@@ -1305,6 +1309,7 @@ void G1CollectedHeap::shrink_helper(size_t shrink_bytes) { | |||
log_debug(gc, ergo, heap)("Shrink the heap. requested shrinking amount: " SIZE_FORMAT "B aligned shrinking amount: " SIZE_FORMAT "B attempted shrinking amount: " SIZE_FORMAT "B", | |||
shrink_bytes, aligned_shrink_bytes, shrunk_bytes); | |||
if (num_regions_removed > 0) { | |||
log_debug(gc, heap)("Regions ready for uncommit: %u", num_regions_removed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sound good.
@@ -563,6 +563,11 @@ class G1CollectedHeap : public CollectedHeap { | |||
|
|||
void resize_heap_if_necessary(); | |||
|
|||
// Check if there is memory to uncommit and if so schedule a task to do it. | |||
void uncommit_heap_if_necessary(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, I agree that is more accurate.
@@ -563,6 +563,11 @@ class G1CollectedHeap : public CollectedHeap { | |||
|
|||
void resize_heap_if_necessary(); | |||
|
|||
// Check if there is memory to uncommit and if so schedule a task to do it. | |||
void uncommit_heap_if_necessary(); | |||
uint uncommit_regions(uint region_limit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
if (g1h->has_uncommittable_regions()) { | ||
// No delay, reason to reschedule rather then to loop is to allow | ||
// other tasks to run without waiting for a full uncommit cycle. | ||
schedule(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because here we know that the service thread is running and if we schedule with a 0 delay, there is no risk of it going to sleep before we run again. Another task might be up next, but this task will eventually run before the service thread can go to sleep.
There is room for improvement on how we schedule tasks from another thread. I changed enqueue
to call a new public function on the G1ServiceThread
called schedule_task
, which calls schedule(task, delay)
and then notify()
. The function schedule(task, delay)
is what previously was named schedule_task
. This is what will be called when someone does task->schedule(delay)
as well, so it is a bit more unified.
// set to active only from a safepoint and it is set to false | ||
// while running on the service thread joined with the suspendible | ||
// thread set. | ||
bool _active; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point, since I moved the 256M constant into the class, the chunking info is now in a comment for this constant. For the state I added the info around usage and moved the implementation detail into set_active()
.
void clear_summary(); | ||
|
||
public: | ||
static void run(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like enqueue
, changed.
@@ -133,6 +133,9 @@ class G1ServiceThread: public ConcurrentGCThread { | |||
// Register a task with the service thread and schedule it. If | |||
// no delay is specified the task is scheduled to run directly. | |||
void register_task(G1ServiceTask* task, jlong delay = 0); | |||
// Notify a change to the service thread. Used to stop either | |||
// stop the service or to force check for new tasks. | |||
void notify(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch.
} | ||
G1CollectedHeap::heap()->hr_printer()->commit(hr); | ||
} | ||
activate_regions(start, num_regions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's true that it will generate two messages when committing a previously uncommitted region, but I still think it is valuable to separate them since we can also have the state change Active->Inactive->Active
. In this case the transition from Inactive to Active will not include a commit, but rather making inactive regions active again. Just seeing a "Commit-Active" message in this case would not be as clear as seeing "Active" that is not immediately preceded by "Commit".
Testing on the latest changes looks good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good, thanks, except for some remaining minor nit and some question.
@@ -210,11 +210,10 @@ void G1ServiceThread::register_task(G1ServiceTask* task, jlong delay) { | |||
|
|||
// Notify the service thread that there is a new task, thread might | |||
// be waiting and the newly added task might be first in the list. | |||
MonitorLocker ml(&_monitor, Mutex::_no_safepoint_check_flag); | |||
ml.notify(); | |||
notify(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe call schedule_task()
here because the two calls are just that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, yes since we are already calling schedule_task()
this call to notify
is not needed.
if (g1h->has_uncommittable_regions()) { | ||
// No delay, reason to reschedule rather then to loop is to allow | ||
// other tasks to run without waiting for a full uncommit cycle. | ||
schedule(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question is mainly, why not notify the task to wake up when scheduling a new task in all cases. I understand the reason for the zero delay.
Not seeing the problem of doing so:
-
tasks that need to run (or are overdue) are automatically run before this task as they have a time-to-run < current time, and so this task is scheduled after
-
The extra notification for
schedule_task()
does not seem to hurt, at most it wakes up the service thread to do the next scheduled task (which afaiu other tasks if they are already due).
I.e. the "optimization" here to not notify the service thread seems to be superfluous. Or maybe the notification could be suppressed automatically if schedule_task()
is called in execute
(G1ServiceThread
can check fairly easily if it is currently running a task)
I am concerned about users of the API to needlessly have to decide whether they should call schedule()
or schedule_task()
as they have different effect.
Maybe schedule()
could just call schedule_task()
.
(That might be a pre-existing issue of using schedule
vs. schedule_task()
, so feel free to say it's out of scope. :) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Thomas for your review.
if (g1h->has_uncommittable_regions()) { | ||
// No delay, reason to reschedule rather then to loop is to allow | ||
// other tasks to run without waiting for a full uncommit cycle. | ||
schedule(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just pushed the assert and also update the comments a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm. Ship it.
@kstefanj This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 21 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Thanks for the reviews @albertnetymk and @tschatzl! |
@kstefanj Since your change was applied there have been 21 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit b8244b6. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Please review this change that implements concurrent uncommit for G1.
Summary
G1 currently check if the heap can be shrunk at the end of the Remark pause and at the end of a Full GC. The uncommit work (handing back the memory to the OS) is quite expensive and this change moves it out of the pause. The actual uncommitting is now handled by the G1 service thread and the new task
G1 Uncommit Region Task
. The new task will uncommit memory in chunks of regions to avoid starving out other tasks.The calculations of how much to shrink the heap and when is not changed, but during the pause only quick preparation work is done. Splitting the uncommit work into two parts comes with some additional meta-data cost. Previously we had a single bitmap to mark if a region was committed or not, now we need two bitmaps. One bitmap to keep track of the regions available for use (active) and one bitmap for the regions ready to be uncommitted (inactive). The union of those two bitmaps are the regions currently committed. When expanding the heap we prefer to re-activate regions from the inactive bitmap if there are any, instead of committing new regions, since this is cheaper (avoiding calls to the OS).
Splitting the work also comes with some additional synchronization. Both the uncommit task and a mutator thread doing a humongous allocation might want to alter the inactive map at the same time. To prevent this a new lock
Uncommit_lock
is added.One thing to note is that there is still one case left where we do the uncommit directly and this is during CDS initialization.
Logging
To track the concurrent uncommit in logs a few additional messages have been added. There are no new
info
messages, but forgc+heap
there are two newdebug
messages and onetrace
:The
trace
is printed for each invocation of the task while thedebug
message is only printed when there is no more uncommit work available. As you can see in the above example, it's not certain that all regions made ready for uncommit are actually uncommitted. The reason for this is that the heap had to grow again during the concurrent uncommit, and regions were re-activated.On
gc+heap+region
there are new logs to see how ranges of regions transition between different states:Testing
Two new tests have been added, one gtest and one jtreg test. These are intended to test the basic functionality, but most testing is gained by just running applications that resize the heap. This is quite common in our testing, so the code will be exercised a lot.
I've run multiple runs of mach5 testing tier 1-5 as well as local testing. I've also done a performance run and as expected there are not significant changes.
Progress
Testing
Issue
Reviewers
Download
$ git fetch https://git.openjdk.java.net/jdk pull/1141/head:pull/1141
$ git checkout pull/1141