[CH] Avoid OOM and make shuffle write more stable #4013
Labels: enhancement
Comments
We cannot rely on … The following logs show that … then the executor crashed.
Description
Shuffle write and high-cardinality aggregation are the two biggest memory consumers, and they contend with each other for memory. The memory-usage threshold is controlled by two configurations:

- `spark.gluten.sql.columnar.backend.ch.runtime_settings.max_memory_usage` (or `spark.memory.offHeap.size` when `max_memory_usage` is not set)
- `spark.gluten.sql.columnar.backend.ch.spillThreshold`
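
For context, a minimal sketch of how these two settings might be supplied when building a session; the values and the off-heap settings are placeholder assumptions for illustration, not recommendations from this issue:

```scala
import org.apache.spark.sql.SparkSession

// All values below are placeholders, not tuning advice.
val spark = SparkSession.builder()
  .appName("gluten-ch-example")
  .config("spark.memory.offHeap.enabled", "true")
  .config("spark.memory.offHeap.size", "8g")
  // Hard memory cap for the ClickHouse backend; when unset,
  // spark.memory.offHeap.size is used instead.
  .config("spark.gluten.sql.columnar.backend.ch.runtime_settings.max_memory_usage", "8589934592")
  // Aggregation starts spilling once its usage crosses this threshold
  // (value format here is an assumption; check the Gluten docs).
  .config("spark.gluten.sql.columnar.backend.ch.spillThreshold", "4294967296")
  .getOrCreate()
```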
When a query aggregates high-cardinality keys, shuffle write often causes OOM: it does not know how much memory is currently available, and the aggregating operators have already taken so much memory that less than `spark.gluten.sql.columnar.backend.ch.spillThreshold` remains available.
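
A hypothetical sketch of the direction this suggests (the names and structure are invented for illustration; this is not Gluten's actual code): before buffering more data, the shuffle writer consults a shared tracker for the memory actually left over after aggregation, and spills early instead of trusting only its own static threshold.

```scala
// Hypothetical sketch: a process-wide tracker shared by aggregation
// and shuffle write, so neither side over-allocates blindly.
object MemoryTracker {
  @volatile var used: Long = 0L               // bytes claimed by all operators
  val limit: Long = 8L * 1024 * 1024 * 1024   // e.g. max_memory_usage

  def available: Long = limit - used
}

class ShuffleWriteBuffer(spillThreshold: Long) {
  private var buffered: Long = 0L

  def add(rowBytes: Long): Unit = {
    // Spill when the writer's own threshold is hit, or when the
    // globally available memory (after aggregation's usage) runs low.
    if (buffered + rowBytes > spillThreshold ||
        rowBytes > MemoryTracker.available) {
      spill()
    }
    buffered += rowBytes
    MemoryTracker.used += rowBytes
  }

  private def spill(): Unit = {
    // Write buffered data to disk and release the memory claim.
    MemoryTracker.used -= buffered
    buffered = 0L
  }
}
```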