This repository has been archived by the owner on Sep 18, 2023. It is now read-only.

[NSE-588] config the pre-allocated memory for shuffle's splitter #594

Merged on Dec 3, 2021 (5 commits)

Conversation

FelixYBW

What changes were proposed in this pull request?

Previously, we preallocated memory for each column and each reducer during the split operation, with the buffer size controlled by the record batch size. This does not scale when we use larger record batches, a larger shuffle.partitions value, or more columns to shuffle.
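To make the scaling problem concrete, here is a rough back-of-the-envelope sketch. It is not taken from the splitter source: the function name `EstimatePreallocBytes` and its parameters are hypothetical, and it assumes one fixed-width buffer per column per reducer, each sized for a full record batch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical estimate, not the project's code: one buffer per column per
// reducer, each holding `rows_per_buffer` values of `bytes_per_value` bytes.
// Variable-width columns and validity bitmaps are ignored for simplicity.
inline std::int64_t EstimatePreallocBytes(std::int64_t rows_per_buffer,
                                          std::int64_t num_columns,
                                          std::int64_t num_reducers,
                                          std::int64_t bytes_per_value) {
  assert(rows_per_buffer >= 0 && num_columns >= 0 && num_reducers >= 0 &&
         bytes_per_value >= 0);
  // The footprint is the product of all four factors, so growing any one of
  // them (batch size, column count, partition count) multiplies the total.
  return rows_per_buffer * num_columns * num_reducers * bytes_per_value;
}
```

Under these assumptions, 4096 rows x 10 columns x 200 reducers x 8 bytes comes to 65,536,000 bytes (about 62.5 MiB), while 32768 rows x 20 columns x 2000 reducers x 8 bytes comes to 10,485,760,000 bytes (about 9.8 GiB) per task.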

We added a config that sets the preallocated size independently of the record batch size. In the code, we cap the maximum preallocated memory at 1/4 of the total off-heap memory per task.
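One way such a cap could be applied is sketched below. This is an illustration only: `CappedPreallocRows` and all of its parameters are hypothetical names, and the actual patch may compute or enforce the 1/4 limit differently. The sketch assumes fixed-width columns and clamps the configured per-buffer row count so that the total across all columns and reducers fits in a quarter of the task's off-heap memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the project's code: returns the number of rows to
// preallocate per (column, reducer) buffer.
inline std::int64_t CappedPreallocRows(std::int64_t configured_rows,
                                       std::int64_t task_offheap_bytes,
                                       std::int64_t num_columns,
                                       std::int64_t num_reducers,
                                       std::int64_t bytes_per_value) {
  assert(num_columns > 0 && num_reducers > 0 && bytes_per_value > 0);
  // Budget for preallocation: 1/4 of the task's off-heap memory.
  const std::int64_t budget_bytes = task_offheap_bytes / 4;
  // Bytes consumed by adding one row of capacity to every buffer.
  const std::int64_t bytes_per_row_slot =
      num_columns * num_reducers * bytes_per_value;
  // Largest per-buffer row count that still fits within the budget.
  const std::int64_t max_rows = budget_bytes / bytes_per_row_slot;
  return std::min(configured_rows, max_rows);
}
```

With 1 GiB of off-heap memory per task, 10 columns, 200 reducers, and 8-byte values, a configured size of 4096 rows is left unchanged, while a configured size of 32768 rows is clamped to 16777.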

How was this patch tested?

Tested on a single node.

binwei added 2 commits November 30, 2021 16:25
…the preallocated buffer size for each column each reducer.

max of pre-allocated memory in split is 1/4 of offheap memory per task
@github-actions

#588

binwei and others added 3 commits December 1, 2021 00:18
3 participants