[SPARK-45592][SPARK-45282][SQL] Correctness issue in AQE with InMemoryTableScanExec #43760
@@ -402,8 +402,9 @@ class CacheManager extends Logging with AdaptiveSparkPlanHelper
     if (session.conf.get(SQLConf.CAN_CHANGE_CACHED_PLAN_OUTPUT_PARTITIONING)) {
       // Bucketed scan only has one time overhead but can have multi-times benefits in cache,
       // so we always do bucketed scan in a cached plan.
-      SparkSession.getOrCloneSessionWithConfigsOff(
-        session, SQLConf.AUTO_BUCKETED_SCAN_ENABLED :: Nil)
+      SparkSession.getOrCloneSessionWithConfigsOff(session,
+        SQLConf.ADAPTIVE_EXECUTION_APPLY_FINAL_STAGE_SHUFFLE_OPTIMIZATIONS ::
+        SQLConf.AUTO_BUCKETED_SCAN_ENABLED :: Nil)
Contributor: I think a question here is: do we want strictly better performance, or generally better performance?

Contributor: Adding an extra shuffle can be very expensive, so I'm inclined toward this change.

Contributor (author): This approach is guaranteed to perform better than having AQE disabled for the SQL cache.

Member: Thank you, @maryannxue.
     } else {
       SparkSession.getOrCloneSessionWithConfigsOff(session, forceDisableConfigs)
     }
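For context on what the hunk above does: `getOrCloneSessionWithConfigsOff` returns the session itself if all the listed boolean configs are already off, and otherwise a clone with them disabled, so the plan compiled for the cache never sees them enabled. A rough user-level sketch follows; it is an assumption-laden approximation, since the helper is `private[sql]` (so `newSession()` stands in for it) and the two string keys are inferred from the `SQLConf` entry names:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[2]").getOrCreate()

// Isolated SQLConf, shared SparkContext: a stand-in for the cloned session.
val cacheSession = spark.newSession()
// Assumed keys for AUTO_BUCKETED_SCAN_ENABLED and the new AQE entry.
cacheSession.conf.set("spark.sql.sources.bucketing.autoBucketedScan.enabled", "false")
cacheSession.conf.set("spark.sql.adaptive.applyFinalStageShuffleOptimizations", "false")
// Plans compiled through `cacheSession` keep AQE on, but AQE will not rewrite
// their final stage, so a cached plan's reported output partitioning stays valid.
```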
@@ -2655,16 +2655,29 @@ class DatasetSuite extends QueryTest
   }

   test("SPARK-45592: Coaleasced shuffle read is not compatible with hash partitioning") {
Member: @maryannxue This test seems to pass even without the fix.

Contributor (author): Because this bug has been fixed by the other PR, right?

Contributor (author): I simply changed this test to make it more robust in terms of reproducing the original bug.

Member: Got it.
-    val ee = spark.range(0, 1000000, 1, 5).map(l => (l, l)).toDF()
-      .persist(org.apache.spark.storage.StorageLevel.MEMORY_AND_DISK)
-    ee.count()
-
-    val minNbrs1 = ee
-      .groupBy("_1").agg(min(col("_2")).as("min_number"))
-      .persist(org.apache.spark.storage.StorageLevel.MEMORY_AND_DISK)
-
-    val join = ee.join(minNbrs1, "_1")
-    assert(join.count() == 1000000)
+    withSQLConf(SQLConf.CAN_CHANGE_CACHED_PLAN_OUTPUT_PARTITIONING.key -> "true",
+      SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1",
+      SQLConf.SHUFFLE_PARTITIONS.key -> "20",
+      SQLConf.ADVISORY_PARTITION_SIZE_IN_BYTES.key -> "2000") {
+      val ee = spark.range(0, 1000, 1, 5).map(l => (l, l - 1)).toDF()
+        .persist(org.apache.spark.storage.StorageLevel.MEMORY_AND_DISK)
+      ee.count()
+
+      // `minNbrs1` will start with 20 partitions and without the fix would coalesce to ~10
+      // partitions.
+      val minNbrs1 = ee
+        .groupBy("_2").agg(min(col("_1")).as("min_number"))
+        .select(col("_2") as "_1", col("min_number"))
+        .persist(org.apache.spark.storage.StorageLevel.MEMORY_AND_DISK)
+      minNbrs1.count()
+
+      // shuffle on `ee` will start with 2 partitions, smaller than `minNbrs1`'s partition num,
+      // and `EnsureRequirements` will change its partition num to `minNbrs1`'s partition num.
+      withSQLConf(SQLConf.SHUFFLE_PARTITIONS.key -> "5") {
+        val join = ee.join(minNbrs1, "_1")
+        assert(join.count() == 999)
+      }
+    }
   }
| test("SPARK-45022: exact DatasetQueryContext call site") { | ||
|
|
||
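For readers who want to try the repro outside the test suite, here is a hedged standalone sketch of the new test. The config keys are written out as strings (my reading of the `SQLConf` entries used above); on a build without the fix, the final assertion should fail because the coalesced cached side no longer lines up with its advertised hash partitioning:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, min}
import org.apache.spark.storage.StorageLevel

object Spark45592Repro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .config("spark.sql.optimizer.canChangeCachedPlanOutputPartitioning", "true")
      .config("spark.sql.autoBroadcastJoinThreshold", "-1")
      .config("spark.sql.shuffle.partitions", "20")
      .config("spark.sql.adaptive.advisoryPartitionSizeInBytes", "2000")
      .getOrCreate()
    import spark.implicits._

    // Cached input: 1000 rows of (l, l - 1) spread over 5 partitions.
    val ee = spark.range(0, 1000, 1, 5).map(l => (l, l - 1)).toDF()
      .persist(StorageLevel.MEMORY_AND_DISK)
    ee.count()

    // Cached aggregate: starts with 20 shuffle partitions; without the fix AQE
    // coalesces the final stage while the cache still advertises hash
    // partitioning over 20 partitions.
    val minNbrs1 = ee
      .groupBy("_2").agg(min(col("_1")).as("min_number"))
      .select(col("_2") as "_1", col("min_number"))
      .persist(StorageLevel.MEMORY_AND_DISK)
    minNbrs1.count()

    // Join with a smaller shuffle-partition setting so EnsureRequirements must
    // reconcile the two sides' partition counts.
    spark.conf.set("spark.sql.shuffle.partitions", "5")
    val join = ee.join(minNbrs1, "_1")
    // Keys 0..998 match on both sides, so a correct plan returns 999 rows.
    assert(join.count() == 999, "join lost rows: cached partitioning was not honored")

    spark.stop()
  }
}
```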
@ulysses-you is this useful even for non-table-cache queries?

Bucket table write? But I think people would specify the partition number explicitly if there is a shuffle on the bucket column. I can't find another case.

Yeah, this can be a use case. People may want to fully control the partitioning of the final write stage, which can affect the number of files.
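A hedged illustration of that use case, assuming the new entry's key is `spark.sql.adaptive.applyFinalStageShuffleOptimizations` and that turning it off stops AQE from coalescing the shuffle feeding a write:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().master("local[2]").getOrCreate()

// Assumed key; with the final-stage optimizations off, the shuffle introduced
// by repartition(col) keeps all spark.sql.shuffle.partitions partitions
// instead of being coalesced, making the output file count predictable.
spark.conf.set("spark.sql.shuffle.partitions", "200")
spark.conf.set("spark.sql.adaptive.applyFinalStageShuffleOptimizations", "false")

spark.range(0, 1000000)
  .repartition(col("id"))  // no explicit numPartitions, so AQE could otherwise coalesce it
  .write.mode("overwrite").parquet("/tmp/spark45592_file_count_demo")  // expect ~200 files
```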