
Conversation

@WangGuangxin
Contributor

@WangGuangxin WangGuangxin commented Dec 9, 2021

What changes were proposed in this pull request?

Spark's tungsten memory model usually tries to allocate memory one page at a time, allocated as long[pageSizeBytes/8] in HeapMemoryAllocator.allocate.

Remember that a Java long array needs an extra object header (usually 16 bytes on a 64-bit system), so the actual allocation is pageSize + 16 bytes.

Assume that G1HeapRegionSize is 4M and pageSizeBytes is 4M as well. Since every allocation then needs 4M + 16 bytes of memory, two regions are used, with one region occupied by only 16 bytes. That wastes about 50% of the memory.
This can happen under many combinations of G1HeapRegionSize (varying from 1M to 32M) and pageSizeBytes (varying from 1M to 64M).

We can demonstrate it with the following piece of code.

// Allocates long[] blocks until the JVM throws OutOfMemoryError.
// Requires: import java.util.ArrayList; import java.util.List;
public static void bufferSizeTest(boolean optimize) {
  long totalAllocatedSize = 0L;
  int blockSize = 1024 * 1024 * 4; // 4m
  if (optimize) {
    blockSize -= 16; // leave room for the long[] object header
  }
  List<long[]> buffers = new ArrayList<>();
  while (true) {
    long[] arr = new long[blockSize / 8];
    buffers.add(arr);
    totalAllocatedSize += blockSize;
    System.out.println("Total allocated size: " + totalAllocatedSize);
  }
}

Run it with the following JVM parameters:

java -Xmx100m -XX:+UseG1GC -XX:G1HeapRegionSize=4m -XX:-UseGCOverheadLimit -verbose:gc -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark -Xss4m -XX:+ExitOnOutOfMemoryError -XX:ParallelGCThreads=4 -XX:ConcGCThreads=4

with optimized = false

Total allocated size: 46137344
[GC pause (G1 Humongous Allocation) (young) 44M->44M(100M), 0.0007091 secs]
[GC pause (G1 Evacuation Pause) (young) (initial-mark)-- 48M->48M(100M), 0.0021528 secs]
[GC concurrent-root-region-scan-start]
[GC concurrent-root-region-scan-end, 0.0000021 secs]
[GC concurrent-mark-start]
[GC pause (G1 Evacuation Pause) (young) 48M->48M(100M), 0.0011289 secs]
[Full GC (Allocation Failure)  48M->48M(100M), 0.0017284 secs]
[Full GC (Allocation Failure)  48M->48M(100M), 0.0013437 secs]
Terminating due to java.lang.OutOfMemoryError: Java heap space

with optimized = true

Total allocated size: 96468624
[GC pause (G1 Humongous Allocation) (young)-- 92M->92M(100M), 0.0024416 secs]
[Full GC (Allocation Failure)  92M->92M(100M), 0.0019883 secs]
[GC pause (G1 Evacuation Pause) (young) (initial-mark) 96M->96M(100M), 0.0004282 secs]
[GC concurrent-root-region-scan-start]
[GC concurrent-root-region-scan-end, 0.0000040 secs]
[GC concurrent-mark-start]
[GC pause (G1 Evacuation Pause) (young) 96M->96M(100M), 0.0003269 secs]
[Full GC (Allocation Failure)  96M->96M(100M), 0.0012409 secs]
[Full GC (Allocation Failure)  96M->96M(100M), 0.0012607 secs]
Terminating due to java.lang.OutOfMemoryError: Java heap space

This PR tries to adjust the page size to avoid this memory waste.

This case exists not only in MemoryManagement, but also in other places such as TorrentBroadcast.blockSize. I would like to submit a follow-up PR if this modification is considered reasonable.

Why are the changes needed?

To avoid memory waste in G1 GC

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing UT

@github-actions github-actions bot added the CORE label Dec 9, 2021
@WangGuangxin WangGuangxin changed the title [SPARK-37593] Optimize HeapMemoryAllocator to avoid memory waste in humongous allocation when using G1GC [SPARK-37593][CORE] Optimize HeapMemoryAllocator to avoid memory waste in humongous allocation when using G1GC Dec 9, 2021
@WangGuangxin
Contributor Author

@kiszk @cloud-fan @maropu @JoshRosen Can you please help review this?

Comment on lines 3221 to 3223
Member

I am afraid that this line may throw an exception in the future due to specification changes.
How about wrapping it in a try ... catch?

Contributor

what if we always pick this as the page size?

Member

I suppose we lose a few bytes in the allocation, and maybe that makes some nice power-of-two data structure not fit, but I wonder if that's pretty rare and whether we can just go with this always.

Contributor Author

You mean even if it is not a humongous allocation?

Member

Yes. I think the current logic is good; I'm just wondering if it matters much if we do it in all cases. Maybe it's bad for small allocations, but is the offset ever significant relative to the allocation size? Probably not. I also wonder if there are future cases, different GCs, that we're not checking here that would also need this treatment.

Contributor Author

I revisited this; there is really no need to restrict it to only humongous allocations.

@cloud-fan
Contributor

cc @rednaxelafx @zsxwing

Member

Do we need a flag for this? when would I not want it?

Contributor Author

Yeah, it's indeed not needed. Removed it.

Member

Can this be a val? I don't think it will change.
Likewise, could you just try to get G1HeapRegionSize once here? If it's not G1GC, just store None. Then you don't check it each time, because this can't change during the JVM's lifetime.
Then you don't even need the utility method below.

Contributor Author

ok

Member

nit. Indentation. We need two more spaces.

Contributor Author

I didn't get this, do you mean line 261?

Member

Yes.

Member

Could you update the function description with this new addition?

Contributor Author

updated

Member

@dongjoon-hyun dongjoon-hyun left a comment

@WangGuangxin . Do you think you can add one specific example about the case to the PR description?

@WangGuangxin
Contributor Author

WangGuangxin commented Dec 13, 2021

@WangGuangxin . Do you think you can add one specific example about the case to the PR description?

Hi @dongjoon-hyun, we can demonstrate it with the following piece of code.

// Allocates long[] blocks until the JVM throws OutOfMemoryError.
// Requires: import java.util.ArrayList; import java.util.List;
public static void bufferSizeTest(boolean optimize) {
  long totalAllocatedSize = 0L;
  int blockSize = 1024 * 1024 * 4; // 4m
  if (optimize) {
    blockSize -= 16; // leave room for the long[] object header
  }
  List<long[]> buffers = new ArrayList<>();
  while (true) {
    long[] arr = new long[blockSize / 8];
    buffers.add(arr);
    totalAllocatedSize += blockSize;
    System.out.println("Total allocated size: " + totalAllocatedSize);
  }
}

Run it with the following JVM parameters:

java -Xmx100m -XX:+UseG1GC -XX:G1HeapRegionSize=4m -XX:-UseGCOverheadLimit -verbose:gc -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark -Xss4m -XX:+ExitOnOutOfMemoryError -XX:ParallelGCThreads=4 -XX:ConcGCThreads=4

with optimized = false

Total allocated size: 46137344
[GC pause (G1 Humongous Allocation) (young) 44M->44M(100M), 0.0007091 secs]
[GC pause (G1 Evacuation Pause) (young) (initial-mark)-- 48M->48M(100M), 0.0021528 secs]
[GC concurrent-root-region-scan-start]
[GC concurrent-root-region-scan-end, 0.0000021 secs]
[GC concurrent-mark-start]
[GC pause (G1 Evacuation Pause) (young) 48M->48M(100M), 0.0011289 secs]
[Full GC (Allocation Failure)  48M->48M(100M), 0.0017284 secs]
[Full GC (Allocation Failure)  48M->48M(100M), 0.0013437 secs]
Terminating due to java.lang.OutOfMemoryError: Java heap space

with optimized = true

Total allocated size: 96468624
[GC pause (G1 Humongous Allocation) (young)-- 92M->92M(100M), 0.0024416 secs]
[Full GC (Allocation Failure)  92M->92M(100M), 0.0019883 secs]
[GC pause (G1 Evacuation Pause) (young) (initial-mark) 96M->96M(100M), 0.0004282 secs]
[GC concurrent-root-region-scan-start]
[GC concurrent-root-region-scan-end, 0.0000040 secs]
[GC concurrent-mark-start]
[GC pause (G1 Evacuation Pause) (young) 96M->96M(100M), 0.0003269 secs]
[Full GC (Allocation Failure)  96M->96M(100M), 0.0012409 secs]
[Full GC (Allocation Failure)  96M->96M(100M), 0.0012607 secs]
Terminating due to java.lang.OutOfMemoryError: Java heap space

@dongjoon-hyun
Member

@WangGuangxin . Thank you. Please include #34846 (comment) in the PR description. It will become part of the permanent commit log.

@dongjoon-hyun
Member

Also, cc @sunchao , @viirya , @huaxingao , too

Member

I'm just curious whether this adjustment can be done in HeapMemoryAllocator.allocate instead, since it's closer to the actual allocation logic.

Contributor Author

I prefer not to put it in HeapMemoryAllocator.allocate because it may break the semantics. When we call HeapMemoryAllocator.allocate(size) we expect to get memory of the specified size or an OOM, but internally changing it to another value (size - Platform.LONG_ARRAY_OFFSET) may crash the caller's code or cause confusion.
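To make the alternative concrete, here is a rough Scala sketch of adjusting the computed default page size in MemoryManager instead of the allocator. It assumes the surrounding MemoryManager values seen later in this review (maxTungstenMemory, cores, safetyFactor, minPageSize, maxPageSize, BUFFER_PAGESIZE) plus isG1GC and tungstenMemoryMode helpers; it is an illustration of the idea, not the PR's exact code.

// Sketch only: adjust the default, so an explicit spark.buffer.pageSize
// (BUFFER_PAGESIZE) is still honored verbatim.
val size = ByteArrayMethods.nextPowerOf2(maxTungstenMemory / cores / safetyFactor)
val chosenPageSize = math.min(maxPageSize, math.max(minPageSize, size))
val default =
  if (isG1GC && tungstenMemoryMode == MemoryMode.ON_HEAP) {
    // Leave room for the long[] object header so page + header packs into G1 regions.
    chosenPageSize - Platform.LONG_ARRAY_OFFSET
  } else {
    chosenPageSize
  }
conf.get(BUFFER_PAGESIZE).getOrElse(default)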

Contributor

@attilapiros attilapiros left a comment

Just a few questions.

Contributor

When I looked into how to find out what the garbage collector type is, I bumped into this several times:

HotSpotDiagnosticMXBean diagnostic = ManagementFactoryHelper.getDiagnosticMXBean();

VMOption option = diagnostic.getVMOption("UseG1GC");
if (option.getValue().equals("false")) {
  ...
}

For example, in the OpenJDK tests.

Is there any reason why a different solution has been chosen here?

Contributor Author

Hmm, it seems your solution is more elegant. I'll update it.

Contributor

Thanks!

Contributor

If I get this right, in the case of G1GC the best would be to choose a pageSize where the following holds:

  G1HeapRegionSize % (pageSize + Platform.LONG_ARRAY_OFFSET) == 0

And when BUFFER_PAGESIZE is not set, we are free to choose it as:

pageSize = G1HeapRegionSize - Platform.LONG_ARRAY_OFFSET;

And with the above code we just try to calculate G1HeapRegionSize in our own way. But what about accessing the value directly?
Like the same way it is used in the OpenJDK tests:

  HotSpotDiagnosticMXBean diagnostic = ManagementFactoryHelper.getDiagnosticMXBean();
  option = diagnostic.getVMOption("G1HeapRegionSize");

As I can see, the OpenJDK test was executed like:

run main/othervm -Xmx64m TestG1HeapRegionSize 1048576

So diagnostic.getVMOption("G1HeapRegionSize") gives back the calculated region size.
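For reference, a minimal Scala sketch of reading both flags through the public HotSpotDiagnosticMXBean (via ManagementFactory rather than the internal ManagementFactoryHelper). The Try wrappers are only there because, as noted earlier in the thread, option lookups could fail on future or non-HotSpot JVMs; this is an illustration, not the code in this PR.

import java.lang.management.ManagementFactory
import com.sun.management.HotSpotDiagnosticMXBean
import scala.util.Try

object G1Info {
  private val diagnostic =
    ManagementFactory.getPlatformMXBean(classOf[HotSpotDiagnosticMXBean])

  // True only when running with -XX:+UseG1GC (or G1 as the JVM's default).
  val isG1GC: Boolean =
    Try(diagnostic.getVMOption("UseG1GC").getValue == "true").getOrElse(false)

  // Effective region size in bytes, as calculated or set via -XX:G1HeapRegionSize.
  val g1HeapRegionSize: Option[Long] =
    if (isG1GC) Try(diagnostic.getVMOption("G1HeapRegionSize").getValue.toLong).toOption
    else None
}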

Contributor Author

Yes, it's better to make sure G1HeapRegionSize % (pageSize + Platform.LONG_ARRAY_OFFSET) == 0.

But when BUFFER_PAGESIZE is not set, I'm not quite sure it's reasonable to choose it as pageSize = G1HeapRegionSize - Platform.LONG_ARRAY_OFFSET, which seems a bit different from the current logic for getting the default page size.
Following the current logic to calculate the default page size and then subtracting Platform.LONG_ARRAY_OFFSET can also make sure G1HeapRegionSize % (pageSize + Platform.LONG_ARRAY_OFFSET) == 0.

Contributor

But when BUFFER_PAGESIZE is not set, I'm not quite sure if it's reasonable to choose it as pageSize = G1HeapRegionSize - Platform.LONG_ARRAY_OFFSET, which seems a bit different from the current logic to get the default page size.

How does the current logic for the default handle a case where a custom -XX:G1HeapRegionSize is given in the extra Java options?

Contributor Author

I see what you mean, you are right

Contributor Author

@attilapiros Since -XX:G1HeapRegionSize can only be set to a power of 2, and the default size calculated here is also a power of 2, it can be guaranteed that G1HeapRegionSize % (pageSize + Platform.LONG_ARRAY_OFFSET) == 0 or (pageSize + Platform.LONG_ARRAY_OFFSET) % G1HeapRegionSize == 0, right?
Such a change has little effect on the current logic.

@WangGuangxin WangGuangxin changed the title [SPARK-37593][CORE] Optimize HeapMemoryAllocator to avoid memory waste in humongous allocation when using G1GC [SPARK-37593][CORE] Optimize HeapMemoryAllocator to avoid memory waste when using G1GC Dec 15, 2021
@attilapiros
Contributor

ok to test

Contributor

@attilapiros attilapiros left a comment

It looks good to me, but I am interested in the others' opinions.

Contributor

lazy val

Member

Would lazy do anything here? It's a local inside what's evaluated for a val. I could see declaring it lazily outside this block instead; then even being lazy doesn't matter.

Contributor

It would do:

scala> lazy val foo = {
     |   println("Initialized")
     |   1
     | }
foo: Int = <lazy>

scala> "test"
res0: String = test

scala> None.getOrElse(foo)
Initialized
res1: Int = 1

But having it outside is even better!

@SparkQA

SparkQA commented Dec 15, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50710/

@SparkQA

SparkQA commented Dec 15, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50710/

@SparkQA

SparkQA commented Dec 15, 2021

Test build #146236 has finished for PR 34846 at commit e9eff1a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 16, 2021

Test build #146277 has finished for PR 34846 at commit e43878c.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 16, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50753/

@SparkQA

SparkQA commented Dec 16, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50753/

@SparkQA

SparkQA commented Dec 16, 2021

Test build #146279 has finished for PR 34846 at commit c2b0cb9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

G1HeapRegionSize -> maybeG1HeapRegionSize?

Member

@dongjoon-hyun dongjoon-hyun left a comment

Hi, @WangGuangxin .
You are addressing a G1GC issue, and G1GC is enabled by default in Java 11.
Please double-check that your code runs with Java 11 and Java 17.
According to the latest change, it seems that you are working with Java 8 only.

cc @attilapiros too about the above Java 8 test case example.

Member

@WangGuangxin This is not a Java 11 / Java 17 API.

Contributor

Great findings! We need to test this on other JVM versions too.

Contributor Author

@dongjoon-hyun Thanks for your findings. I've updated and checked against JDK 8, 11, and 17. Also cc @attilapiros.

@SparkQA

SparkQA commented Dec 17, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50798/

@WangGuangxin
Contributor Author

@srowen @dongjoon-hyun @attilapiros
Sorry for the late reply. Based on the comments, I made a slight change to this.

  1. If the user sets the page size via spark.buffer.pageSize, we just leave it alone.
  2. Otherwise, we first follow Spark's current algorithm to choose the best page size based on available cores and memory. After that, we just need to adjust it by subtracting LONG_ARRAY_OFFSET if we are using G1GC; that's enough.

Because both the page size chosen by Spark and the G1GC region size are powers of 2, choosedPageSize and heapRegionSize must be multiples of each other. There are three cases:

  • choosedPageSize == heapRegionSize: choosedPageSize - LONG_ARRAY_OFFSET makes sure that every page occupies exactly one G1 region without waste.

  • choosedPageSize > heapRegionSize: choosedPageSize - LONG_ARRAY_OFFSET makes sure that every page occupies exactly N G1 regions without waste.

  • choosedPageSize < heapRegionSize: choosedPageSize - LONG_ARRAY_OFFSET makes sure that every G1 region can hold exactly N pages without waste.

Compared to the existing defaultPageSize, the difference is only LONG_ARRAY_OFFSET bytes, so it is safe enough not to cause a perf regression in various cases.
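To illustrate the arithmetic behind the three cases, here is a tiny Scala sketch with hypothetical sizes (16 bytes standing in for Platform.LONG_ARRAY_OFFSET and a 4 MiB G1 region); it only checks the divisibility argument above, nothing more.

// Illustration only; sizes are hypothetical example values.
val header = 16L
val regionSize = 4L * 1024 * 1024                  // -XX:G1HeapRegionSize=4m
def allocation(pageSize: Long): Long = pageSize + header

// chosenPageSize == regionSize: one page fills exactly one region
assert(allocation(4L * 1024 * 1024 - header) == regionSize)
// chosenPageSize > regionSize (8 MiB): one page fills exactly two regions
assert(allocation(8L * 1024 * 1024 - header) % regionSize == 0)
// chosenPageSize < regionSize (1 MiB): four pages fill exactly one region
assert(regionSize % allocation(1L * 1024 * 1024 - header) == 0)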

@dongjoon-hyun
Member

Thank you for updates, @WangGuangxin .

Member

@srowen srowen left a comment

Functionally I think this looks OK

val size = ByteArrayMethods.nextPowerOf2(maxTungstenMemory / cores / safetyFactor)
val default = math.min(maxPageSize, math.max(minPageSize, size))
conf.get(BUFFER_PAGESIZE).getOrElse(default)
val choosedPageSize = math.min(maxPageSize, math.max(minPageSize, size))
Member

Total nit, but choosed -> chosen

Contributor Author

done

/**
* Return whether we are using G1GC or not
*/
val isG1GC: Boolean = {
Member

This is OK here, though if it's only used here, maybe leave it in MemoryManager until it has reason to be shared? Utils is quite big already

Contributor Author

ok, updated

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Thank you, @WangGuangxin and all.

@dongjoon-hyun
Member

cc @viirya , @sunchao , @cloud-fan , @kiszk , @attilapiros, @HyukjinKwon again

@HyukjinKwon
Member

cc @rednaxelafx too FYI

@HyukjinKwon
Member

This makes sense to me.

Contributor

@rednaxelafx rednaxelafx left a comment

Since this behavior is guarded on G1 GC being in use, I'm okay with it. LGTM.

A few side comments:

  • The same behavior affects the Shenandoah GC as well, since it reuses the G1 region-based heap structure. Not to complicate this PR, but perhaps consider a follow-up change to also use the same logic for the Shenandoah GC (which is also available in OpenJDK).
  • If we're using a very big G1 heap region size, the current logic seems to be a bit too conservative, i.e. if the chosen page size is less than 1/2 of the G1 region size, then we don't necessarily have to subtract the array header size (see the sketch after this list).
  • The current logic doesn't take object size alignment into account. The default is 8-byte aligned, but if a different value is chosen, the size to subtract might not have to be the array header size... just another detail; this PR doesn't have to deal with it.
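A minimal Scala sketch of that second refinement (explicitly not part of this PR), assuming the usual G1 behavior of treating objects larger than half a region as humongous; the function name and threshold check are hypothetical.

// Hypothetical refinement, not the PR's logic: only shave off the header
// when the resulting long[] would be a humongous allocation anyway.
def adjustedPageSize(chosenPageSize: Long, regionSize: Long, header: Long = 16L): Long =
  if (chosenPageSize + header > regionSize / 2) chosenPageSize - header
  else chosenPageSize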

@rednaxelafx
Contributor

I have a question for those who are more familiar with Spark internals though: are there any places inside Spark that implicitly depend on the page size being a power of two? Do we run any risk that such an assumption is not checked at runtime, which could lead to out-of-bounds access in boundary cases?

@WangGuangxin
Contributor Author

I have a question for those who are more familiar with Spark internals though: are there any places inside Spark that implicitly depend on the page size being a power of two? Do we run any risk that such an assumption is not checked at runtime, which could lead to out-of-bounds access in boundary cases?

As far as I know there is no such constraint. And Spark allows users to set a custom page size via the conf spark.buffer.pageSize, which is not restricted to powers of two.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-37593][CORE] Optimize HeapMemoryAllocator to avoid memory waste when using G1GC [SPARK-37593][CORE] Reduce default page size by LONG_ARRAY_OFFSET if G1GC and ON_HEAP are used Mar 1, 2022
@dongjoon-hyun
Member

dongjoon-hyun commented Mar 1, 2022

Thank you, @WangGuangxin and all!

As the comments above note, I agree that there are more combinations, like new GCs and different alignments. However, this PR is narrowly scoped to G1GC and on-heap memory, and the adjusted amount is small. I believe we can merge this and move forward on top of it.

I revised the PR title according to the PR content because we have more chances for optimization.

Merged to master for Apache Spark 3.3.0.

kazuyukitanimura pushed a commit to kazuyukitanimura/spark that referenced this pull request Aug 10, 2022
…f `G1GC` and `ON_HEAP` are used (apache#1372)

Closes apache#34846 from WangGuangxin/g1_humongous_optimize.

Authored-by: wangguangxin.cn <wangguangxin.cn@bytedance.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit e81333c)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 92fd5bb)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

Co-authored-by: wangguangxin.cn <wangguangxin.cn@bytedance.com>