8370345: Parallel: Rework TLAB accounting in MutableNUMASpace #27935
+40
−48
Hello,
Parallel's MutableNUMASpace is the only GC code that actually uses the Thread parameter passed through the general CollectedHeap interface to tlab_capacity, tlab_used, and unsafe_max_tlab_alloc. It would be nice if Parallel's MutableNUMASpace could do without the Thread and instead use a thread-agnostic approach. Removing the need for the thread makes it possible to clean up the shared CollectedHeap interface, which makes all GCs easier to read and maintain. Also, the lgrp_id stored in the Thread class should really have been moved to GCThreadLocalData once that concept was introduced, but with a thread-agnostic approach the field can be removed entirely.
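For context, the relevant CollectedHeap declarations look roughly like this (a paraphrased sketch of the current interface, not an exact copy):

    // collectedHeap.hpp (paraphrased sketch)
    // The Thread* parameter exists only so that Parallel's MutableNUMASpace can
    // look up the caller's LGRP; every other GC ignores it.
    virtual size_t tlab_capacity(Thread* thr) const;
    virtual size_t tlab_used(Thread* thr) const;
    virtual size_t unsafe_max_tlab_alloc(Thread* thr) const;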
The current solution is not without problems. When a new allocation is made in one of the LGRP spaces in MutableNUMASpace via cas_allocate(), the NUMA/LGRP id is polled and stored in the Thread, and we only attempt to allocate on that LGRP. If allocation fails on the local LGRP, we do not try to allocate on any other (remote) LGRP. This is reflected in the TLAB accounting methods tlab_capacity, tlab_used, and unsafe_max_tlab_alloc, which report used memory, capacity, and available space only for the LGRP matching the LGRP id stored in the Thread. This model breaks down when threads are allowed to migrate between CPUs, and therefore also between NUMA nodes, which can change the LGRP id.
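A minimal sketch of the current per-thread accounting, simplified from what MutableNUMASpace does today (helper names follow the existing code, but details are elided):

    // Sketch of the current, per-thread behavior (simplified; not the exact code).
    size_t MutableNUMASpace::tlab_capacity(Thread* thr) const {
      int lgrp_id = thr->lgrp_id();          // cached in the Thread by cas_allocate()
      for (int i = 0; i < lgrp_spaces()->length(); i++) {
        LGRPSpace* ls = lgrp_spaces()->at(i);
        if (ls->lgrp_id() == lgrp_id) {
          // Only the space of the thread's current LGRP is reported.
          return ls->space()->capacity_in_bytes();
        }
      }
      return 0;                              // no space matches the thread's LGRP id
    }
    // tlab_used() and unsafe_max_tlab_alloc() follow the same per-LGRP pattern.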
For example, a system with two NUMA nodes gives us two LGRPs with ids 0 and 1. If a thread allocates most of its memory on LGRP 0 and then migrates to a CPU on LGRP 1, the thread will report that it allocated a significant amount of memory, while the used memory on the LGRP it is currently on could be very low. This would give a disproportionate allocation fraction. In practice this is not a problem, because the TLAB code accounts for it, although for a different reason entirely. The other direction can also be problematic: if a thread allocates very little memory on LGRP 0 and then migrates to LGRP 1, where another thread has allocated a lot of memory, the allocation fraction will be very low, even though it would be quite high if the used memory on its original LGRP were accounted for.
A solution to both of these issues is to average the capacity, used, and available memory across all LGRPs in the TLAB accounting methods. This provides a more accurate and stable view of memory usage and availability, regardless of thread migration or imbalances in NUMA/LGRP allocation. There are trade-offs to consider: averaging may mask allocation pressure on specific LGRPs and could cause the TLAB allocation history to be updated more or less frequently than before. In summary, considering the whole eden space (i.e., the entire MutableNUMASpace) instead of a single LGRP when accounting for TLABs should give a more holistic view.
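A minimal sketch of the averaged, thread-agnostic variant (illustrative only; the actual patch may differ in its details):

    // Sketch: report the average across all LGRP spaces, so the result no longer
    // depends on which NUMA node the calling thread currently happens to run on.
    size_t MutableNUMASpace::tlab_capacity(Thread* thr) const {
      // thr is intentionally unused; accounting is now thread-agnostic.
      size_t total = 0;
      for (int i = 0; i < lgrp_spaces()->length(); i++) {
        total += lgrp_spaces()->at(i)->space()->capacity_in_bytes();
      }
      return total / lgrp_spaces()->length();
    }

With two LGRPs of 512 MB each, for instance, every thread would see the same 512 MB capacity for TLAB sizing, regardless of which node it last allocated on.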
I plan to clean up the Thread* parameter to tlab_capacity, tlab_used, and unsafe_max_tlab_alloc in a following RFE, since it is a pretty significant cleanup.

Testing:
-XX:+UseParallelGC -XX:+UseNUMA. Slight improvement on SPECjbb2005.
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27935/head:pull/27935
$ git checkout pull/27935
Update a local copy of the PR:
$ git checkout pull/27935
$ git pull https://git.openjdk.org/jdk.git pull/27935/head

Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 27935
View PR using the GUI difftool:
$ git pr show -t 27935

Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27935.diff