Skip to content
This repository has been archived by the owner on Sep 18, 2023. It is now read-only.

[NSE-537] Increase partition number adaptively for large SHJ stages #538

Merged
merged 2 commits into from
Nov 29, 2021

Conversation

zhztheplayer
Copy link
Collaborator

What changes were proposed in this pull request?

Involve in AQE execution to tune against SHJ input partitions to avoid memory leaks.

@github-actions
Copy link

#537

@zhztheplayer
Copy link
Collaborator Author

zhztheplayer commented Nov 29, 2021

We've changed the approach to a customizable build side input size of SHJ. To use, for example, limit build size to 500m per task:

set spark.oap.sql.columnar.shuffledhashjoin.resizeinputpartitions=true
set spark.oap.sql.columnar.shuffledhashjoin.buildsizelimit=500m

Within the settings, build side of SHJ will be split as more input partitions if the size is beyound 500m

@zhztheplayer zhztheplayer marked this pull request as ready for review November 29, 2021 06:57
@zhztheplayer zhztheplayer merged commit 2a16e7b into oap-project:master Nov 29, 2021
@weiting-chen weiting-chen added the bug Something isn't working label Apr 8, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants