8236926: Concurrently uncommit memory in G1 #1141

kstefanj · 2020-11-10T10:43:20Z

Please review this change that implements concurrent uncommit for G1.

Summary
G1 currently checks whether the heap can be shrunk at the end of the Remark pause and at the end of a Full GC. The uncommit work (handing memory back to the OS) is quite expensive, and this change moves it out of the pause. The actual uncommitting is now handled by the G1 service thread and the new task G1 Uncommit Region Task. The new task uncommits memory in chunks of regions to avoid starving other tasks.

The calculations of how much to shrink the heap, and when, are not changed, but during the pause only quick preparation work is done. Splitting the uncommit work into two parts comes with some additional metadata cost. Previously we had a single bitmap to mark whether a region was committed; now we need two bitmaps: one to keep track of the regions available for use (active) and one for the regions ready to be uncommitted (inactive). The union of those two bitmaps is the set of regions currently committed. When expanding the heap we prefer to re-activate regions from the inactive bitmap if there are any, instead of committing new regions, since this is cheaper (avoiding calls to the OS).
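The two-bitmap bookkeeping described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the class `CommittedMapSketch`, the `ExpandResult` enum, and all method names are hypothetical, and `std::vector<bool>` stands in for HotSpot's bitmap type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type for the sketch below.
enum class ExpandResult { Reactivated, Committed, Full };

// Hypothetical, simplified stand-in for the two-bitmap idea; not the HotSpot class.
class CommittedMapSketch {
  std::vector<bool> _active;    // regions available for use
  std::vector<bool> _inactive;  // regions still committed but waiting for uncommit
public:
  explicit CommittedMapSketch(std::size_t max_regions)
    : _active(max_regions, false), _inactive(max_regions, false) {}

  // A region is committed if it is set in either bitmap (the union).
  bool is_committed(std::size_t i) const { return _active[i] || _inactive[i]; }
  bool is_active(std::size_t i) const { return _active[i]; }
  bool is_inactive(std::size_t i) const { return _inactive[i]; }

  // Commit a previously uncommitted region and make it usable.
  void activate(std::size_t i) {
    assert(!is_committed(i));
    _active[i] = true;
  }
  // Pause-time preparation for shrinking: only cheap bit flips.
  void deactivate(std::size_t i) {
    assert(_active[i]);
    _active[i] = false;
    _inactive[i] = true;
  }
  // Expansion fast path: reuse a region whose memory is still committed.
  void reactivate(std::size_t i) {
    assert(_inactive[i]);
    _inactive[i] = false;
    _active[i] = true;
  }
  // Concurrent phase: this is where memory would be handed back to the OS.
  void uncommit(std::size_t i) {
    assert(_inactive[i]);
    _inactive[i] = false;
  }

  // Expand by one region, preferring inactive regions over fresh commits.
  ExpandResult expand_one() {
    for (std::size_t i = 0; i < _inactive.size(); i++) {
      if (_inactive[i]) { reactivate(i); return ExpandResult::Reactivated; }
    }
    for (std::size_t i = 0; i < _active.size(); i++) {
      if (!is_committed(i)) { activate(i); return ExpandResult::Committed; }
    }
    return ExpandResult::Full;  // every region is already active
  }
};
```

In this sketch a region deactivated during a pause stays committed until `uncommit` runs, so an expansion that arrives first takes the `Reactivated` path and needs no OS call.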

Splitting the work also comes with some additional synchronization. Both the uncommit task and a mutator thread doing a humongous allocation might want to alter the inactive map at the same time. To prevent this, a new lock, Uncommit_lock, is added.

One thing to note is that there is still one case left where we do the uncommit directly and this is during CDS initialization.

Logging
To track the concurrent uncommit in logs a few additional messages have been added. There are no new info messages, but for gc+heap there are two new debug messages and one trace:

[7,468s][debug][gc,heap        ] GC(32) Regions ready for uncommit: 1873
...
[7,509s][trace][gc,heap        ] Concurrent Uncommit: 256M, 32 regions, 11,173ms
[7,522s][trace][gc,heap        ] Concurrent Uncommit: 256M, 32 regions, 12,599ms
...
[9,691s][debug][gc,heap        ] Concurrent Uncommit Summary: 4864M, 608 regions, 405,827ms

The trace is printed for each invocation of the task while the debug message is only printed when there is no more uncommit work available. As you can see in the above example, it's not certain that all regions made ready for uncommit are actually uncommitted. The reason for this is that the heap had to grow again during the concurrent uncommit, and regions were re-activated.
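The numbers in the excerpt are consistent with an 8M region size: 256M / 8M = 32 regions per invocation, and 608 regions x 8M = 4864M in the summary. A minimal sketch of the chunking, assuming that region size (it is inferred from the log lines, not stated in the PR); the names `kChunkBytes`, `region_limit`, `uncommit_chunk` and `run_until_done` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t M = 1024 * 1024;
constexpr std::size_t kChunkBytes = 256 * M;  // per-invocation limit from the PR

// Number of regions one task invocation may uncommit for a given region size.
inline std::size_t region_limit(std::size_t region_bytes) {
  assert(region_bytes > 0);
  return kChunkBytes / region_bytes;
}

// One invocation: uncommit at most `limit` regions and return how many were done.
inline std::size_t uncommit_chunk(std::size_t& inactive_regions, std::size_t limit) {
  std::size_t n = std::min(inactive_regions, limit);
  inactive_regions -= n;
  return n;
}

// Drive invocations until no inactive regions remain. Each element of the
// result corresponds to one trace line; their sum corresponds to the summary.
inline std::vector<std::size_t> run_until_done(std::size_t inactive_regions,
                                               std::size_t region_bytes) {
  std::vector<std::size_t> per_invocation;
  const std::size_t limit = region_limit(region_bytes);
  while (inactive_regions > 0) {
    per_invocation.push_back(uncommit_chunk(inactive_regions, limit));
    // In the real task, control would return to the service thread here
    // so that other tasks can run before the next chunk.
  }
  return per_invocation;
}
```

The sketch ignores re-activation: in the logged run 1873 regions were made ready but only 608 were uncommitted, because the heap grew again while the task was running.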

On gc+heap+region there are new logs to see how ranges of regions transition between different states:

[6,337s][debug][gc,heap,region ] Uncommit regions [12768, 13024)
[6,424s][debug][gc,heap,region ] Uncommit regions [13024, 13280)
[6,438s][debug][gc,heap,region ] Uncommit regions [13280, 13536)
[6,510s][debug][gc,heap,region ] Uncommit regions [13536, 13792)
[6,573s][debug][gc,heap,region ] GC(79) Reactivate regions [13792, 15651)
[6,574s][debug][gc,heap,region ] GC(79) Activate regions [76, 96)
[6,579s][debug][gc,heap,region ] GC(79) Activate regions [97, 1099)

Testing
Two new tests have been added, one gtest and one jtreg test. These are intended to test the basic functionality, but most testing is gained by just running applications that resize the heap. This is quite common in our testing, so the code will be exercised a lot.

I've run multiple rounds of mach5 testing tier 1-5 as well as local testing. I've also done a performance run and, as expected, there are no significant changes.

Progress

Change must not contain extraneous whitespace
Commit message must refer to an issue
Change must be properly reviewed

Testing

	Linux aarch64	Linux arm	Linux ppc64le	Linux s390x	Linux x64	Linux x86	Windows x64	macOS x64
Build	⏳ (1/1 running)	⏳ (1/1 running)	⏳ (1/1 running)	⏳ (1/1 running)	✔️ (6/6 passed)	✔️ (2/2 passed)	⏳ (2/2 running)	✔️ (2/2 passed)
Test (tier1)					⏳ (7/9 running)	⏳ (8/9 running)		⏳ (8/9 running)

Issue

JDK-8236926: Concurrently uncommit memory in G1

Reviewers

Albert Mingkun Yang (@albertnetymk - Author) ⚠️ Review applies to c354b1d
Thomas Schatzl (@tschatzl - Reviewer) ⚠️ Review applies to 553f99a

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1141/head:pull/1141
$ git checkout pull/1141

bridgekeeper · 2020-11-10T10:43:27Z

👋 Welcome back sjohanss! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2020-11-10T10:44:33Z

@kstefanj The following label will be automatically applied to this pull request:

hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

mlbridge · 2020-11-10T10:56:24Z

Webrevs

albertnetymk

The PR description provides a very good summary. Furthermore, it would be nice to have an overview of the algorithm: how the additional concurrency is managed, what resources are protected by the lock, etc. Ideally, this should be in the comments, making references to corresponding classes/variables.

I've also done a performance run and as expected there are not significant changes.

How about pauses? One of the motivations of this PR is to remove pauses introduced by synchronous uncommit, right?

albertnetymk · 2020-11-12T10:49:23Z

src/hotspot/share/gc/g1/g1CollectedHeap.cpp

@@ -2659,6 +2663,12 @@ void G1CollectedHeap::gc_epilogue(bool full) {
  _collection_pause_end = Ticks::now();
 }

+void G1CollectedHeap::uncommit_heap_if_necessary() {
+  if (hrm()->has_inactive_regions()) {
+    G1UncommitRegionTask::activate();


Since activate is already used for regions, I would suggest another word; simple run is enough. Additionally, I don't think using an enum is necessary, a plain bool is_running should do.

enum class TaskState { active, inactive }; TaskState _state;

Good catch. For a short while I had the state running as well, but now a simple bool is enough. I still call the state _active since I think that is more accurate. Also changed the name to run().

albertnetymk · 2020-11-12T10:50:40Z

src/hotspot/share/gc/g1/heapRegionManager.cpp

@@ -182,13 +174,30 @@ HeapRegion* HeapRegionManager::new_heap_region(uint hrm_index) {
  return g1h->new_heap_region(hrm_index, mr);
 }

+void HeapRegionManager::expand(uint start, uint num_regions, WorkGang* pretouch_gang) {
+  guarantee(num_regions > 0, "No point in calling this for zero regions");


The same guarantee is already in commit_regions; I think an assert is fine.

Removed it, as you say we check it the first thing we do in commit_regions.

albertnetymk · 2020-11-12T10:51:39Z

src/hotspot/share/gc/g1/heapRegionManager.cpp

+      _allocated_heapregions_length = MAX2(_allocated_heapregions_length, i + 1);
+    }
+
+    if (G1CollectedHeap::heap()->hr_printer()->is_active()) {


This if is not needed, right? commit(hr) does such check already. There are a few other cases of the same kind.

This is a bit unfortunate, but this is_active() is checking if the G1HRPrinter is active and should print stuff. So it is needed.

Misunderstood Albert's comment here. He is correct that the check inside commit(hr) is enough; updated this and a few other similar cases.

albertnetymk · 2020-11-12T11:12:01Z

src/hotspot/share/gc/g1/heapRegionManager.cpp

+  guarantee(num_regions >= 1, "Need to specify at least one region to uncommit, tried to uncommit zero regions at %u", start);
+  guarantee(length() >= num_regions, "pre-condition");


I am not really sure why guarantee here but assert in reactivate_regions. I don't see a strong reason to use guarantee here.

Changed to assert. The reason for them being different is that the code in deactivate_regions is old and was just moved into this function. Also changed the condition to > 0, like we have in most other places.

albertnetymk · 2020-11-12T20:38:17Z

src/hotspot/share/gc/g1/heapRegionManager.cpp

+void HeapRegionManager::reactivate_regions(uint start, uint num_regions) {
+  assert(num_regions > 0, "No point in calling this for zero regions");
+
+  clear_auxiliary_data_structures(start, num_regions);


IMO, this name is too general. Even after reading its body and the associated comments, I don't get what "data structures" are cleared.

To me the comments and implementation are pretty descriptive, but I did update the comment for signal_mapping_changed a bit to make it more explicit. Also added a few lines to explain what will be cleared by each mapper.

I agree that this is not a perfect name but I think the naming is in line with how we refer to these structures elsewhere in the code.

albertnetymk · 2020-11-12T20:45:52Z

src/hotspot/share/gc/g1/heapRegionManager.hpp

@@ -107,6 +90,9 @@ class HeapRegionManager: public CHeapObj<mtGC> {
  // Pass down commit calls to the VirtualSpace.
  void commit_regions(uint index, size_t num_regions = 1, WorkGang* pretouch_gang = NULL);

+  // Initialize the HeapRegions in the range and put them on the free list.
+  void initialize_regions(uint start, uint num_regions);
+
  // Notify other data structures about change in the heap layout.
  void update_committed_space(HeapWord* old_end, HeapWord* new_end);


update_committed_space is not implemented, right?

Correct, filed JDK-8256323 for this.

albertnetymk · 2020-11-12T20:57:39Z

src/hotspot/share/gc/g1/heapRegionManager.hpp

-
-  // The number of regions committed in the heap.
-  uint _num_committed;
+  // Map to keep track of which regions are in use.


The name of the variable suggests it's tracking what regions are committed and that's all. After reading G1CommittedRegionMap, I believe the class name and the var name are quite misleading; it tracks the state of mem backing up all regions, committed+mapped, committed, to_be_uncommitted, committed. I don't have any good alternatives, but the comments could surely be expanded.

Naming is hard and I agree this isn't perfect, but the union of the two bitmaps in G1CommittedRegionMap does track what is committed, so it's not completely wrong.

I did update the comment here a bit.

albertnetymk · 2020-11-12T21:00:22Z

src/hotspot/share/gc/g1/heapRegionManager.hpp

+  void activate_regions(uint index, uint num_regions = 1);
+  void deactivate_regions(uint start, size_t num_regions);
+  void reactivate_regions(uint start, uint num_regions);
+  void uncommit_regions(uint start, uint num_regions);


Why uint vs size_t? Why = 1 for some but not others? If such inconsistency is intentional, it should be documented.

The reason for size_t for deactivate_regions is that we have a size_t in shrink_at which calls it. But for consistency here I will change it to uint and add a cast in shrink_at.

For the default values, on expand it is historic but can now be removed. Good catch, and for inactive I don't have any good excuse =)

albertnetymk · 2020-11-12T21:32:20Z

src/hotspot/share/gc/g1/g1UncommitRegionTask.cpp

+  assert(_state == TaskState::active, "Must be active");
+
+  // Each execution is limited to uncommit at most 256M worth of regions.
+  static const uint region_limit = (uint) (256 * M / G1HeapRegionSize);


Why 256M? Better include some motivation in the comments.

256M is just a "reasonable" limit that I picked to get short enough invocations. I updated the comment a bit.

One suggestion I have is to make this a static constant in the class declaration rather than burying it somewhere in the code.

Moved the 256 * M out to be a class constant, leaving the region limit here since it is not known at compile time.

albertnetymk · 2020-11-12T21:42:39Z

src/hotspot/share/gc/g1/g1CommittedRegionMap.hpp

+  void active_set_range(uint start, uint end);
+  void active_clear_range(uint start, uint end);
+  void inactive_set_range(uint start, uint end);
+  void inactive_clear_range(uint start, uint end);


After reading their implementation, I see that Uncommit_lock must be held when calling them. I think it's best to mention this precondition in the comments, and explain why this lock is needed.

Updated the comments a bit and referred to guarantee_mt_safety_* for more details.

openjdk · 2020-11-13T08:23:36Z

@kstefanj this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout 8236926-ccu
git fetch https://git.openjdk.java.net/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

kstefanj · 2020-11-13T08:33:07Z

Thanks @albertnetymk for your comments; this update does not include any changes addressing them. Instead it fixes a race in the low-level committing code for G1, where it was assumed that only one thread can commit and uncommit regions at a time. This is no longer the case after this change. It is true for a single region, but adjacent regions are allowed to be committed/uncommitted in parallel. This can for example happen if a humongous allocation happens during concurrent uncommit.

The fix is to use parallel versions of the bitmap operations and for the G1RegionsSmallerThanCommitSizeMapper add a lock to prevent parallel updates for this mapper. This is needed because multiple regions can share a single underlying OS page, so we need to make sure those updates are atomic on a page level.
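Why page-level atomicity matters can be sketched with a small model in which several regions share one OS page. This is an illustration under stated assumptions, not the PR's implementation: the class `SharedPageMapperSketch` is hypothetical, it uses a per-page count of committed regions, and the OS calls are replaced by counters.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical model of a mapper where multiple regions share one OS page.
class SharedPageMapperSketch {
  std::mutex _lock;                         // serializes all page bookkeeping updates
  std::size_t _regions_per_page;
  std::vector<bool> _region_committed;
  std::vector<std::size_t> _page_refcount;  // committed regions per page
  std::size_t _os_commits = 0;              // stand-in for real OS commit calls
  std::size_t _os_uncommits = 0;            // stand-in for real OS uncommit calls
public:
  SharedPageMapperSketch(std::size_t regions, std::size_t regions_per_page)
    : _regions_per_page(regions_per_page),
      _region_committed(regions, false),
      _page_refcount((regions + regions_per_page - 1) / regions_per_page, 0) {}

  void commit_region(std::size_t region) {
    assert(region < _region_committed.size());
    std::lock_guard<std::mutex> guard(_lock);
    if (_region_committed[region]) return;
    _region_committed[region] = true;
    if (_page_refcount[region / _regions_per_page]++ == 0) {
      _os_commits++;    // first committed region on the page: commit the page
    }
  }

  void uncommit_region(std::size_t region) {
    assert(region < _region_committed.size());
    std::lock_guard<std::mutex> guard(_lock);
    if (!_region_committed[region]) return;
    _region_committed[region] = false;
    if (--_page_refcount[region / _regions_per_page] == 0) {
      _os_uncommits++;  // last committed region on the page: release the page
    }
  }

  std::size_t os_commits() const { return _os_commits; }
  std::size_t os_uncommits() const { return _os_uncommits; }
};
```

Without the lock, one thread could observe a page as still needed while another releases it, which is the kind of race on adjacent regions described above.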

kstefanj

Thanks for the comments Albert.

kstefanj · 2020-11-13T10:28:33Z

src/hotspot/share/gc/g1/g1CollectedHeap.cpp

@@ -2659,6 +2663,12 @@ void G1CollectedHeap::gc_epilogue(bool full) {
  _collection_pause_end = Ticks::now();
 }

+void G1CollectedHeap::uncommit_heap_if_necessary() {
+  if (hrm()->has_inactive_regions()) {
+    G1UncommitRegionTask::activate();


Good catch. For a short while I had the state running as well, but now a simple bool is enough. I still call the state _active since I think that is more accurate. Also changed the name to run().

kstefanj · 2020-11-13T11:24:47Z

src/hotspot/share/gc/g1/g1CommittedRegionMap.hpp

+  void active_set_range(uint start, uint end);
+  void active_clear_range(uint start, uint end);
+  void inactive_set_range(uint start, uint end);
+  void inactive_clear_range(uint start, uint end);


Updated the comments a bit and referred to guarantee_mt_safety_* for more details.

kstefanj · 2020-11-13T12:15:58Z

src/hotspot/share/gc/g1/g1UncommitRegionTask.cpp

+  assert(_state == TaskState::active, "Must be active");
+
+  // Each execution is limited to uncommit at most 256M worth of regions.
+  static const uint region_limit = (uint) (256 * M / G1HeapRegionSize);


256M is just a "reasonable" limit that I picked to get short enough invocations. I updated the comment a bit.

kstefanj · 2020-11-13T12:33:53Z

src/hotspot/share/gc/g1/heapRegionManager.cpp

@@ -182,13 +174,30 @@ HeapRegion* HeapRegionManager::new_heap_region(uint hrm_index) {
  return g1h->new_heap_region(hrm_index, mr);
 }

+void HeapRegionManager::expand(uint start, uint num_regions, WorkGang* pretouch_gang) {
+  guarantee(num_regions > 0, "No point in calling this for zero regions");


Removed it, as you say we check it the first thing we do in commit_regions.

kstefanj · 2020-11-13T12:35:58Z

src/hotspot/share/gc/g1/heapRegionManager.cpp

+      _allocated_heapregions_length = MAX2(_allocated_heapregions_length, i + 1);
+    }
+
+    if (G1CollectedHeap::heap()->hr_printer()->is_active()) {


This is a bit unfortunate, but this is_active() is checking if the G1HRPrinter is active and should print stuff. So it is needed.

kstefanj · 2020-11-13T13:01:59Z

src/hotspot/share/gc/g1/heapRegionManager.cpp

+  guarantee(num_regions >= 1, "Need to specify at least one region to uncommit, tried to uncommit zero regions at %u", start);
+  guarantee(length() >= num_regions, "pre-condition");


Changed to assert. The reason for them being different is that the code in deactivate_regions is old and was just moved into this function. Also changed the condition to > 0, like we have in most other places.

kstefanj · 2020-11-13T13:13:48Z

src/hotspot/share/gc/g1/heapRegionManager.cpp

+    HeapRegionRange range = _committed_map.next_inactive_range(offset);
+    // No more regions available for uncommit
+    if (range.length() == 0) {
+      return uncommitted;


uncommitted can be 0 here, so we can't add that assert. There is a chance that, between the time we add the uncommit task (which calls this function) and the time we grab the lock, someone else has used the inactive regions to expand the heap again. Updated the comment a bit; I hope it makes it easier to follow.

kstefanj · 2020-11-13T13:21:40Z

src/hotspot/share/gc/g1/heapRegionManager.hpp

-
-  // The number of regions committed in the heap.
-  uint _num_committed;
+  // Map to keep track of which regions are in use.


Naming is hard and I agree this isn't perfect, but the union of the two bitmaps in G1CommittedRegionMap does track what is committed, so it's not completely wrong.

I did update the comment here a bit.

kstefanj · 2020-11-13T13:22:16Z

src/hotspot/share/gc/g1/heapRegionManager.hpp

@@ -107,6 +90,9 @@ class HeapRegionManager: public CHeapObj<mtGC> {
  // Pass down commit calls to the VirtualSpace.
  void commit_regions(uint index, size_t num_regions = 1, WorkGang* pretouch_gang = NULL);

+  // Initialize the HeapRegions in the range and put them on the free list.
+  void initialize_regions(uint start, uint num_regions);
+
  // Notify other data structures about change in the heap layout.
  void update_committed_space(HeapWord* old_end, HeapWord* new_end);


Correct, filed JDK-8256323 for this.

kstefanj · 2020-11-13T13:28:52Z

src/hotspot/share/gc/g1/heapRegionManager.hpp

+  void activate_regions(uint index, uint num_regions = 1);
+  void deactivate_regions(uint start, size_t num_regions);
+  void reactivate_regions(uint start, uint num_regions);
+  void uncommit_regions(uint start, uint num_regions);


The reason for size_t for deactivate_regions is that we have a size_t in shrink_at which calls it. But for consistency here I will change it to uint and add a cast in shrink_at.

For the default values, on expand it is historic but can now be removed. Good catch, and for inactive I don't have any good excuse =)

albertnetymk

Thank you for the revision.

tschatzl · 2020-11-18T09:04:42Z

src/hotspot/share/gc/g1/g1CommittedRegionMap.hpp

+  HeapRegionRange next_inactive_range(uint offset) const;
+  // Finds the next range of committable regions starting at offset.
+  // This function must only be called when no inactive regions are
+  // present and can be used to active more regions.


s/active/activate

tschatzl · 2020-11-18T09:11:40Z

src/hotspot/share/gc/g1/g1CollectedHeap.hpp

@@ -563,6 +563,11 @@ class G1CollectedHeap : public CollectedHeap {

  void resize_heap_if_necessary();

+  // Check if there is memory to uncommit and if so schedule a task to do it.
+  void uncommit_heap_if_necessary();


I would prefer if the method were called uncommit_regions_if_necessary(), as this method does not uncommit the heap but just uncommittable regions.

Done, I agree that is more accurate.

tschatzl · 2020-11-18T09:12:19Z

src/hotspot/share/gc/g1/g1CollectedHeap.hpp

@@ -563,6 +563,11 @@ class G1CollectedHeap : public CollectedHeap {

  void resize_heap_if_necessary();

+  // Check if there is memory to uncommit and if so schedule a task to do it.
+  void uncommit_heap_if_necessary();
+  uint uncommit_regions(uint region_limit);


Please add a comment like "// Immediately uncommits uncommittable regions."

tschatzl · 2020-11-18T09:17:05Z

src/hotspot/share/gc/g1/g1CollectedHeap.cpp

+    log_debug(gc, ergo, heap)("Attempt heap shrinking (archive regions). Total size: " SIZE_FORMAT "B",
+                              HeapRegion::GrainWords * HeapWordSize * shrink_count);
+    // Explicit uncommit.
+    _hrm.uncommit_inactive_regions((uint) shrink_count);


Please let the code use the G1CollectedHeap::uncommit_regions() helper here to limit the references to _hrm.

Good point, fixed.

tschatzl · 2020-11-18T09:17:42Z

src/hotspot/share/gc/g1/g1CollectedHeap.cpp

@@ -743,7 +744,7 @@ void G1CollectedHeap::dealloc_archive_regions(MemRegion* ranges, size_t count) {
  HeapWord* prev_last_addr = NULL;
  HeapRegion* prev_last_region = NULL;
  size_t size_used = 0;
-  size_t uncommitted_regions = 0;
+  size_t shrink_count = 0;


The code may as well define shrink_count as uint as the only use seems to cast it to uint anyway.

I agree, I left it size_t since everything else in this function uses size_t, but uint is a better fit.

tschatzl · 2020-11-18T10:08:12Z

src/hotspot/share/gc/g1/g1RegionToSpaceMapper.cpp

  virtual void commit_regions(uint start_idx, size_t num_regions, WorkGang* pretouch_gang) {
+    guarantee(uncommitted_range(start_idx, num_regions),


Not sure this (and the one in uncommit_regions) should be guarantees.

I went with guarantee since this is what's used in G1PageBasedVirtualSpace for similar checks. Errors like this might be more likely in release builds.

tschatzl · 2020-11-18T10:08:58Z

src/hotspot/share/gc/g1/g1RegionToSpaceMapper.cpp

+  // those resources are in sync:
+  // - G1RegionToSpaceMapper::_region_commit_map;
+  // - G1PageBasedVirtualSpace::_committed (_storage.commit())
+  Mutex _lock;


Good comment! :)

tschatzl · 2020-11-18T10:25:24Z

src/hotspot/share/gc/g1/g1ServiceThread.hpp

@@ -133,6 +133,9 @@ class G1ServiceThread: public ConcurrentGCThread {
  // Register a task with the service thread and schedule it. If
  // no delay is specified the task is scheduled to run directly.
  void register_task(G1ServiceTask* task, jlong delay = 0);
+  // Notify a change to the service thread. Used to stop either
+  // stop the service or to force check for new tasks.
+  void notify();


The first "stop" is too much

Good catch.

tschatzl · 2020-11-18T10:30:45Z

src/hotspot/share/gc/g1/g1CommittedRegionMap.hpp

+  uint length() const { return _end - _start; }
+};
+
+class G1CommittedRegionMap : public CHeapObj<mtGC> {


It would be nice to have a diagram of the region states and their transitions here.

tschatzl · 2020-11-18T10:36:02Z

src/hotspot/share/gc/g1/heapRegionManager.cpp

+    }
+    G1CollectedHeap::heap()->hr_printer()->commit(hr);
+  }
+  activate_regions(start, num_regions);


In this place there will be two messages from the HRPrinter for every region:

a COMMIT message and

an ACTIVATE message

This is a bit confusing as, in my understanding (that's why I asked for a region state diagram in the G1CommittedRegionMap, which may as well be put in HeapRegionManager), the (typical) flow of states is Uncommitted->Committed/Active->Committed/Inactive->Uncommitted.
As mentioned, I'm not sure if it is a good idea to send two separate messages here; better rename the "Active" message to "Commit-Active" (and "Inactive" to "Commit-Inactive") instead imho, even if it's quite long (and drop HRPrinter::commit() completely)

It's true that it will generate two messages when committing a previously uncommitted region, but I still think it is valuable to separate them since we can also have the state change Active->Inactive->Active. In this case the transition from Inactive to Active will not include a commit, but rather making inactive regions active again. Just seeing a "Commit-Active" message in this case would not be as clear as seeing "Active" that is not immediately preceded by "Commit".

Okay, I looked at the G1HRPrinter again, and indeed it prints the action, not the region state, so this is a fair point.
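The transitions discussed in this exchange, and why "Commit" and "Active" are kept as separate actions, can be sketched as a small table. This is an illustration only: the enum, the function `messages_for`, and the exact message strings are hypothetical and not taken from G1HRPrinter.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class RegionState { Uncommitted, Active, Inactive };

// Returns the printer actions for a state change, or an empty list if the
// change is not one of the transitions described in this thread.
// The message strings are illustrative placeholders.
inline std::vector<std::string> messages_for(RegionState from, RegionState to) {
  assert(from != to);  // a transition has to change the state
  if (from == RegionState::Uncommitted && to == RegionState::Active) {
    return {"COMMIT", "ACTIVE"};  // fresh memory: two actions, two messages
  }
  if (from == RegionState::Active && to == RegionState::Inactive) {
    return {"INACTIVE"};          // prepared for uncommit during the pause
  }
  if (from == RegionState::Inactive && to == RegionState::Active) {
    return {"ACTIVE"};            // re-activation: no commit involved
  }
  if (from == RegionState::Inactive && to == RegionState::Uncommitted) {
    return {"UNCOMMIT"};          // done concurrently by the uncommit task
  }
  return {};
}
```

Because an "ACTIVE" action without a preceding "COMMIT" identifies a re-activation, merging the two into a single "Commit-Active" message would lose that distinction.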

kstefanj

Thanks for the review Thomas. Addressed most of your concerns, but some things I left as is.

kstefanj · 2020-11-18T12:26:18Z

src/hotspot/share/gc/g1/g1CollectedHeap.cpp

@@ -743,7 +744,7 @@ void G1CollectedHeap::dealloc_archive_regions(MemRegion* ranges, size_t count) {
  HeapWord* prev_last_addr = NULL;
  HeapRegion* prev_last_region = NULL;
  size_t size_used = 0;
-  size_t uncommitted_regions = 0;
+  size_t shrink_count = 0;


I agree, I left it size_t since everything else in this function uses size_t, but uint is a better fit.

kstefanj · 2020-11-18T12:27:54Z

src/hotspot/share/gc/g1/g1CollectedHeap.cpp

+    log_debug(gc, ergo, heap)("Attempt heap shrinking (archive regions). Total size: " SIZE_FORMAT "B",
+                              HeapRegion::GrainWords * HeapWordSize * shrink_count);
+    // Explicit uncommit.
+    _hrm.uncommit_inactive_regions((uint) shrink_count);


Good point, fixed.

kstefanj · 2020-11-18T12:30:04Z

src/hotspot/share/gc/g1/g1CollectedHeap.cpp

@@ -1305,6 +1309,7 @@ void G1CollectedHeap::shrink_helper(size_t shrink_bytes) {
  log_debug(gc, ergo, heap)("Shrink the heap. requested shrinking amount: " SIZE_FORMAT "B aligned shrinking amount: " SIZE_FORMAT "B attempted shrinking amount: " SIZE_FORMAT "B",
                            shrink_bytes, aligned_shrink_bytes, shrunk_bytes);
  if (num_regions_removed > 0) {
+    log_debug(gc, heap)("Regions ready for uncommit: %u", num_regions_removed);


Sounds good.

kstefanj · 2020-11-18T12:33:02Z

src/hotspot/share/gc/g1/g1CollectedHeap.hpp

@@ -563,6 +563,11 @@ class G1CollectedHeap : public CollectedHeap {

  void resize_heap_if_necessary();

+  // Check if there is memory to uncommit and if so schedule a task to do it.
+  void uncommit_heap_if_necessary();


Done, I agree that is more accurate.

kstefanj · 2020-11-18T12:39:30Z

src/hotspot/share/gc/g1/g1CollectedHeap.hpp

@@ -563,6 +563,11 @@ class G1CollectedHeap : public CollectedHeap {

  void resize_heap_if_necessary();

+  // Check if there is memory to uncommit and if so schedule a task to do it.
+  void uncommit_heap_if_necessary();
+  uint uncommit_regions(uint region_limit);


kstefanj · 2020-11-18T14:32:02Z

src/hotspot/share/gc/g1/g1UncommitRegionTask.cpp

+  if (g1h->has_uncommittable_regions()) {
+    // No delay, reason to reschedule rather then to loop is to allow
+    // other tasks to run without waiting for a full uncommit cycle.
+    schedule(0);


Because here we know that the service thread is running and if we schedule with a 0 delay, there is no risk of it going to sleep before we run again. Another task might be up next, but this task will eventually run before the service thread can go to sleep.

There is room for improvement on how we schedule tasks from another thread. I changed enqueue to call a new public function on the G1ServiceThread called schedule_task, which calls schedule(task, delay) and then notify(). The function schedule(task, delay) is what previously was named schedule_task. This is what will be called when someone does task->schedule(delay) as well, so it is a bit more unified.
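The split between schedule(task, delay) and schedule_task() described here can be sketched with a single-threaded model. It is a simplification under stated assumptions: the class `ServiceQueueSketch` is hypothetical, tasks are plain integer ids, time is a simulated counter, and notify() only counts wake-ups instead of signalling a monitor.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <vector>

struct TaskEntry { long due_ms; int id; };

// Orders the priority queue so that the earliest due time is on top.
struct Later {
  bool operator()(const TaskEntry& a, const TaskEntry& b) const {
    return a.due_ms > b.due_ms;
  }
};

// Hypothetical single-threaded model of the service thread's task queue.
class ServiceQueueSketch {
  std::priority_queue<TaskEntry, std::vector<TaskEntry>, Later> _queue;
  long _now_ms = 0;        // simulated clock
  int _notifications = 0;  // stand-in for waking the service thread
public:
  // Internal path: used when a running task reschedules itself. The service
  // thread is already awake in that case, so no notification is needed.
  void schedule(int id, long delay_ms) {
    assert(delay_ms >= 0);
    _queue.push({_now_ms + delay_ms, id});
  }
  void notify() { _notifications++; }
  // External path: enqueue from another thread and wake the service thread.
  void schedule_task(int id, long delay_ms) {
    schedule(id, delay_ms);
    notify();
  }
  // Runs the task with the earliest due time; returns its id, or -1 if idle.
  int run_next() {
    if (_queue.empty()) return -1;
    TaskEntry e = _queue.top();
    _queue.pop();
    _now_ms = std::max(_now_ms, e.due_ms);
    return e.id;
  }
  int notifications() const { return _notifications; }
};
```

In this model a task rescheduled with zero delay runs before tasks that are due later, and tasks that are already overdue keep their earlier due time, which matches the ordering argument made in the comment above.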

kstefanj · 2020-11-18T14:36:24Z

src/hotspot/share/gc/g1/g1UncommitRegionTask.hpp

+  // set to active only from a safepoint and it is set to false
+  // while running on the service thread joined with the suspendible
+  // thread set.
+  bool _active;


Fair point, since I moved the 256M constant into the class, the chunking info is now in a comment for this constant. For the state I added the info around usage and moved the implementation detail into set_active().

kstefanj · 2020-11-18T14:40:32Z

src/hotspot/share/gc/g1/g1UncommitRegionTask.hpp

+  void clear_summary();
+
+public:
+  static void run();


I like enqueue, changed.

kstefanj · 2020-11-18T20:20:42Z

src/hotspot/share/gc/g1/g1ServiceThread.hpp

@@ -133,6 +133,9 @@ class G1ServiceThread: public ConcurrentGCThread {
  // Register a task with the service thread and schedule it. If
  // no delay is specified the task is scheduled to run directly.
  void register_task(G1ServiceTask* task, jlong delay = 0);
+  // Notify a change to the service thread. Used to stop either
+  // stop the service or to force check for new tasks.
+  void notify();


Good catch.

kstefanj · 2020-11-18T20:34:42Z

src/hotspot/share/gc/g1/heapRegionManager.cpp

+    }
+    G1CollectedHeap::heap()->hr_printer()->commit(hr);
+  }
+  activate_regions(start, num_regions);


It's true that it will generate two messages when committing a previously uncommitted region, but I still think it is valuable to separate them since we can also have the state change Active->Inactive->Active. In this case the transition from Inactive to Active will not include a commit, but rather making inactive regions active again. Just seeing a "Commit-Active" message in this case would not be as clear as seeing "Active" that is not immediately preceded by "Commit".

kstefanj · 2020-11-19T08:40:19Z

Testing on the latest changes looks good.

tschatzl

All good, thanks, except for a remaining minor nit and a question.

tschatzl · 2020-11-19T08:45:31Z

src/hotspot/share/gc/g1/g1ServiceThread.cpp

@@ -210,11 +210,10 @@ void G1ServiceThread::register_task(G1ServiceTask* task, jlong delay) {

  // Notify the service thread that there is a new task, thread might
  // be waiting and the newly added task might be first in the list.
-  MonitorLocker ml(&_monitor, Mutex::_no_safepoint_check_flag);
-  ml.notify();
+  notify();


Maybe call schedule_task() here because the two calls are just that.

Good catch, yes since we are already calling schedule_task() this call to notify is not needed.

tschatzl · 2020-11-19T08:47:47Z

src/hotspot/share/gc/g1/g1UncommitRegionTask.cpp

+  if (g1h->has_uncommittable_regions()) {
+    // No delay, reason to reschedule rather then to loop is to allow
+    // other tasks to run without waiting for a full uncommit cycle.
+    schedule(0);


My question is mainly, why not notify the task to wake up when scheduling a new task in all cases. I understand the reason for the zero delay.

Not seeing the problem of doing so:

tasks that need to run (or are overdue) are automatically run before this task as they have a time-to-run < current time, and so this task is scheduled after

The extra notification for schedule_task() does not seem to hurt; at most it wakes up the service thread to do the next scheduled task (which, afaiu, would be other tasks if they are already due).

I.e. the "optimization" here to not notify the service thread seems to be superfluous. Or maybe the notification could be suppressed automatically if schedule_task() is called in execute (G1ServiceThread can check fairly easily if it is currently running a task)

I am concerned about users of the API needlessly having to decide whether they should call schedule() or schedule_task(), as they have different effects.

Maybe schedule() could just call schedule_task().

(That might be a pre-existing issue of using schedule vs. schedule_task(), so feel free to say it's out of scope. :) )

kstefanj

Thanks Thomas for your review.

kstefanj · 2020-11-19T13:32:17Z

src/hotspot/share/gc/g1/g1UncommitRegionTask.cpp

+  if (g1h->has_uncommittable_regions()) {
+    // No delay, reason to reschedule rather then to loop is to allow
+    // other tasks to run without waiting for a full uncommit cycle.
+    schedule(0);


Just pushed the assert and also updated the comments a bit.

tschatzl

Lgtm. Ship it.

openjdk · 2020-11-19T14:15:57Z

@kstefanj This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8236926: Concurrently uncommit memory in G1

Reviewed-by: ayang, tschatzl

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 21 new commits pushed to the master branch:

defdd12: 8142984: Zero: fast accessors should handle both getters and setters
1718aba: 8227400: Adjust jib profiles to make 3rd party tools for creating installers available on Mach5 test machines
9bb8223: 8253299: Manifest bytes are read twice when verifying a signed JAR
580f22c: 8256581: Refactor vector conversion tests
675d1d5: 8256516: Simplify clearing References
ba721f5: 8212879: Make JVMTI TagMap table concurrent
3a4b90f: 8202343: Disable TLS 1.0 and 1.1
342ccf6: 8256253: Defer Biased Locking obsoletion to JDK 18
d183fc7: 8221554: aarch64 cross-modifying code
f626ed6: 8255978: [windows] os::release_memory may not release the full range
... and 11 more: https://git.openjdk.java.net/jdk/compare/03e84ef7e32b5f65c263695251ee5ae55b9e7ce6...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

kstefanj · 2020-11-19T17:54:27Z

Thanks for the reviews @albertnetymk and @tschatzl!
/integrate

openjdk · 2020-11-19T17:56:04Z

@kstefanj Since your change was applied there have been 21 commits pushed to the master branch:

defdd12: 8142984: Zero: fast accessors should handle both getters and setters
1718aba: 8227400: Adjust jib profiles to make 3rd party tools for creating installers available on Mach5 test machines
9bb8223: 8253299: Manifest bytes are read twice when verifying a signed JAR
580f22c: 8256581: Refactor vector conversion tests
675d1d5: 8256516: Simplify clearing References
ba721f5: 8212879: Make JVMTI TagMap table concurrent
3a4b90f: 8202343: Disable TLS 1.0 and 1.1
342ccf6: 8256253: Defer Biased Locking obsoletion to JDK 18
d183fc7: 8221554: aarch64 cross-modifying code
f626ed6: 8255978: [windows] os::release_memory may not release the full range
... and 11 more: https://git.openjdk.java.net/jdk/compare/03e84ef7e32b5f65c263695251ee5ae55b9e7ce6...master

Your commit was automatically rebased without conflicts.

Pushed as commit b8244b6.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Reviewed-by: ayang, tschatzl

kstefanj added 9 commits November 9, 2020 22:06

Initial patch for concurrent uncommit

5dae77a

Feedback from dev-meeting

fcba6bf

Stress Uncommit

1fd726e

Move HeapRegionRange constructor

3b08247

Uncommit task

94e8961

Test improvement

7c0eb0f

Improved logging

c4f37a4

Simplified task

d7db88e

Self review

3f09c5e

openjdk bot added the hotspot hotspot-dev@openjdk.org label Nov 10, 2020

kstefanj marked this pull request as ready for review November 10, 2020 10:52

openjdk bot added the rfr Pull request is ready for review label Nov 10, 2020

albertnetymk suggested changes Nov 12, 2020

Lock for small mapper and use BitMap parallel operations.

54e16ca

openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Nov 13, 2020

Merge branch 'master' into 8236926-ccu

0a3ba09

openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Nov 13, 2020

Albert review

1949076

kstefanj commented Nov 13, 2020

kstefanj added 2 commits November 13, 2020 16:30

Albert review 2

8552d23

Zoom feedback

8636071

openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Nov 16, 2020

Merge branch 'master' into 8236926-ccu

c354b1d

openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Nov 16, 2020

albertnetymk approved these changes Nov 16, 2020

tschatzl suggested changes Nov 18, 2020

kstefanj added 2 commits November 18, 2020 21:18

Thomas review

1a82bfd

Merge branch 'master' into 8236926-ccu

6e3e33f

kstefanj commented Nov 18, 2020

tschatzl suggested changes Nov 19, 2020

Thomas review 2

553f99a

kstefanj commented Nov 19, 2020

tschatzl approved these changes Nov 19, 2020

openjdk bot added the ready Pull request is ready to be integrated label Nov 19, 2020

Fix missing include

ee031dc

openjdk bot closed this Nov 19, 2020

openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Nov 19, 2020

openjdk-notifier bot referenced this pull request Nov 19, 2020

8236926: Concurrently uncommit memory in G1

b8244b6

Reviewed-by: ayang, tschatzl

kstefanj deleted the 8236926-ccu branch May 18, 2021 12:07

		guarantee(num_regions >= 1, "Need to specify at least one region to uncommit, tried to uncommit zero regions at %u", start);
		guarantee(length() >= num_regions, "pre-condition");

		virtual void commit_regions(uint start_idx, size_t num_regions, WorkGang* pretouch_gang) {
		guarantee(uncommitted_range(start_idx, num_regions),
