diff --git a/docs/sql-performance-tuning.md b/docs/sql-performance-tuning.md
index 4ede18d1938b..e3e5444d2a9c 100644
--- a/docs/sql-performance-tuning.md
+++ b/docs/sql-performance-tuning.md
@@ -267,7 +267,7 @@ This feature coalesces the post shuffle partitions based on the map output stati
 spark.sql.adaptive.coalescePartitions.parallelismFirst
 true
- When true, Spark ignores the target size specified by spark.sql.adaptive.advisoryPartitionSizeInBytes (default 64MB) when coalescing contiguous shuffle partitions, and only respect the minimum partition size specified by spark.sql.adaptive.coalescePartitions.minPartitionSize (default 1MB), to maximize the parallelism. This is to avoid performance regression when enabling adaptive query execution. It's recommended to set this config to false and respect the target size specified by spark.sql.adaptive.advisoryPartitionSizeInBytes.
+ When true, Spark ignores the target size specified by spark.sql.adaptive.advisoryPartitionSizeInBytes (default 64MB) when coalescing contiguous shuffle partitions, and only respect the minimum partition size specified by spark.sql.adaptive.coalescePartitions.minPartitionSize (default 1MB), to maximize the parallelism. This is to avoid performance regressions when enabling adaptive query execution. It's recommended to set this config to true on a busy cluster to make resource utilization more efficient (not many small tasks).
 3.2.0
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index eb5233bfb123..1d7b86cba917 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -721,8 +721,9 @@ object SQLConf {
         "shuffle partitions, but adaptively calculate the target size according to the default " +
         "parallelism of the Spark cluster. The calculated size is usually smaller than the " +
         "configured target size. This is to maximize the parallelism and avoid performance " +
-        "regression when enabling adaptive query execution. It's recommended to set this config " +
-        "to false and respect the configured target size.")
+        "regressions when enabling adaptive query execution. It's recommended to set this " +
+        "config to true on a busy cluster to make resource utilization more efficient (not many " +
+        "small tasks).")
       .version("3.2.0")
       .booleanConf
       .createWithDefault(true)
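
For reviewers weighing the two recommendations, the following is a rough sketch of how the target size described in the config text might be derived. It is a simplified model written for illustration, not Spark's actual implementation: the object `CoalesceTargetSketch`, the function `targetSize`, and all parameter names are hypothetical, and the clamping formula is an assumption based on the wording "adaptively calculate the target size according to the default parallelism" and "only respect the minimum partition size". The 64MB and 1MB defaults are the ones quoted in the doc change.

```scala
// Simplified illustration only; names and formula are assumptions, not Spark code.
object CoalesceTargetSketch {
  val MiB: Long = 1024L * 1024L

  /** Target size in bytes assumed to be used when coalescing shuffle partitions. */
  def targetSize(
      totalShuffleBytes: Long,
      defaultParallelism: Int,
      parallelismFirst: Boolean,
      advisorySize: Long = 64 * MiB,      // spark.sql.adaptive.advisoryPartitionSizeInBytes
      minPartitionSize: Long = 1 * MiB    // spark.sql.adaptive.coalescePartitions.minPartitionSize
  ): Long = {
    if (parallelismFirst) {
      // Partition size that would keep roughly `defaultParallelism` partitions.
      val sizeForParallelism =
        math.ceil(totalShuffleBytes.toDouble / defaultParallelism).toLong
      // Assumed clamp: never above the advisory size, never below the minimum size.
      math.max(minPartitionSize, math.min(advisorySize, sizeForParallelism))
    } else {
      // With parallelismFirst disabled, the advisory size is respected as-is.
      advisorySize
    }
  }
}
```

Under this model, a 2 GiB shuffle on a cluster with a default parallelism of 200 gets a target of roughly 10 MiB with `parallelismFirst = true`, versus 64 MiB with it disabled, which is the "calculated size is usually smaller than the configured target size" behavior the config text describes.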