[GLUTEN-11355][UT] Add new Spark 4.1 tests #11380
Merged
baibaichen merged 19 commits into apache:main on Jan 13, 2026
Run Gluten Clickhouse CI on x86
baibaichen commented on Jan 8, 2026
- 'shims/spark40/**'
- 'shims/spark41/**'
- 'gluten-ut/spark40/**'
- 'gluten-ut/spark41/**'
319fe1b to 97dc360
97dc360 to 4b2224c
ffeb840 to a291da4
…ctories for clickhouse backend
Three-way merge performed using Git:
- Base: Spark 4.0.1 (29434ea766b)
- Left: Spark 4.1.0 (e221b56be7b)
- Right: Gluten Spark 4.1 backends-velox

Summary:
- Auto-merged: 165 files
- New tests added: 31 files (collations, edge cases, recursion, spatial, etc.)
- Modified tests: 134 files
- Deleted tests: 2 files (collations.sql -> split into 4 files, timestamp-ntz.sql)

Conflicts resolved:
- inputs/timestamp-ntz.sql: Right deleted + Left modified -> DELETED (per resolution rule)

New test suites from Spark 4.1.0:
- Collations (4 files): aliases, basic, padding-trim, string-functions
- Edge cases (6 files): alias-resolution, extract-value, join-resolution, etc.
- Advanced features: cte-recursion, generators, kllquantiles, thetasketch, time
- Name resolution: order-by-alias, session-variable-precedence, runtime-replaceable
- Spatial functions: st-functions (ANSI and non-ANSI variants)
- Various resolution edge cases

Total files after merge: 671 (up from 613)
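To make the per-file decision in the commit message above concrete, here is a minimal Python sketch of a file-level three-way merge. It is an illustration only, not the tooling used in this PR: the function name `merge_file` and the status strings are hypothetical, a missing file is modeled as `None`, and the "Right deleted + Left modified -> DELETED" branch is an assumed encoding of the resolution rule quoted in the commit message. Real Git additionally attempts a line-level merge when both sides changed the same file, which this sketch only flags.

```python
from typing import Optional, Tuple

# A file that does not exist on a side is modeled as None.
Content = Optional[str]


def merge_file(base: Content, left: Content, right: Content) -> Tuple[str, Content]:
    """Decide the outcome for one path given base, left (upstream) and right (Gluten)."""
    if left == right:
        # Both sides agree (including "both deleted" or "both added identically").
        return ("same", left)
    if left == base:
        # Only the right side changed: keep the Gluten-specific modification.
        return ("take-right", right)
    if right == base:
        # Only the left side changed: take the upstream change (or new file).
        return ("take-left", left)
    if right is None:
        # Assumed resolution rule: Right deleted + Left modified -> DELETED.
        return ("deleted", None)
    # Both sides changed differently: a line-level merge or manual review is needed.
    return ("content-merge", None)
```

With Base as Spark 4.0.1, Left as Spark 4.1.0 and Right as the Gluten copy, a file that exists only in Left comes out as `take-left`, which corresponds to the "new tests added" bucket in the summary.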
…dBatchSerializer to work across different spark versions
545c5a1 to 6526606
This PR automates the migration of Gluten test suites and SQL test suites from Spark 4.1.
Fixes #11355
How to Generate Spark 4.1 Gluten Test Suites
With the help of AI, we follow a four-step process; details can be found here.
1 Baseline Analysis
Gluten*Suite.scala)
2 Package Scope Extraction
3 Delta Detection
*Suite.scala files introduced in Spark 4.1
4 Code Generation
Result: Automated generation of 27 Gluten test suites for Gluten's Spark 4.1 tests, ensuring test coverage parity with upstream changes.
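The delta-detection and code-generation steps could be sketched roughly as follows in Python. This is a hedged illustration, not the script behind this PR: `new_suites` and `gluten_stub` are hypothetical helpers, suites are matched by file name only, and the generated stub shape (`class Gluten<Name> extends <Name> with GlutenTestsTrait`) is an assumption based on the trait name that appears in this PR; the trait actually mixed in may differ per suite.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def new_suites(old_paths: Iterable[str], new_paths: Iterable[str]) -> List[str]:
    """Return *Suite.scala paths present in the new tree but not in the old one."""
    old_names = {
        PurePosixPath(p).name for p in old_paths if p.endswith("Suite.scala")
    }
    return sorted(
        p
        for p in new_paths
        if p.endswith("Suite.scala") and PurePosixPath(p).name not in old_names
    )


def gluten_stub(path: str, package: str, trait: str = "GlutenTestsTrait") -> str:
    """Render a minimal Gluten wrapper suite for an upstream suite file (assumed shape)."""
    base = PurePosixPath(path).stem  # e.g. "BarSuite"
    return f"package {package}\n\nclass Gluten{base} extends {base} with {trait} {{}}\n"
```

Passing the Spark 4.0 file list as `old_paths` and the Spark 4.1 file list as `new_paths` yields the suites that still need a Gluten counterpart; each one can then be fed to `gluten_stub` with the package extracted in step 2.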
How to Migrate SQL Test Suites to Spark 4.1
Three-Way Git Merge Approach
This PR migrates SQL test suites (inputs/*.sql and results/*.sql.out) from Spark 4.1 using Git's three-way merge algorithm to preserve both upstream changes and Gluten-specific modifications.
Result:
Fix and Exclusion
Excluded test:
- "infer shredding with mixed scale" in GlutenFileBasedDataSourceSuite
gluten-ut/spark41/.../GlutenCacheTableInKryoSuite.scala
gluten-ut/spark41/.../GlutenMapStatusEndToEndSuite.scala
Excluded tests:
- GlutenMapStatusEndToEndSuite (entire suite)
#52870
#52891
Excluded tests:
- GlutenStreamRealTimeModeAllowlistSuite: "rtm operator allowlist", "repartition not allowed", "stateful queries not allowed"
- GlutenStreamRealTimeModeE2ESuite: "foreach", "to_json and from_json round-trip", "generateExec passthrough"
- GlutenStreamRealTimeModeSuite: "processAllAvailable"
Excluded tests:
- cast.sql
- describe.sql
- nonansi/cast.sql
- nonansi/st-functions.sql
- scripting/randomly_generated_scripts.sql
- st-functions.sql
- type-coercion-edge-cases.sql
- variant-field-extractions.sql
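As a small illustration of how such an exclusion list can be applied, the Python sketch below filters SQL input files by their path relative to `inputs/`. It is not how Gluten wires its exclusions; `runnable_inputs` is a hypothetical helper, and `example.sql` in the usage note is a made-up file name. Matching on the full relative path keeps `cast.sql` and `nonansi/cast.sql` as separate entries, as in the list above.

```python
from typing import FrozenSet, Iterable, List

# Exclusions copied from the list above (paths relative to inputs/).
EXCLUDED_SQL_TESTS: FrozenSet[str] = frozenset(
    {
        "cast.sql",
        "describe.sql",
        "nonansi/cast.sql",
        "nonansi/st-functions.sql",
        "scripting/randomly_generated_scripts.sql",
        "st-functions.sql",
        "type-coercion-edge-cases.sql",
        "variant-field-extractions.sql",
    }
)


def runnable_inputs(
    all_inputs: Iterable[str], excluded: FrozenSet[str] = EXCLUDED_SQL_TESTS
) -> List[str]:
    """Return the input files that are not excluded, in sorted order."""
    return sorted(p for p in all_inputs if p not in excluded)
```

For example, `runnable_inputs(["cast.sql", "example.sql"])` keeps only the hypothetical `example.sql`.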
RuntimeReplaceable with its replacement to fix unit tests. This handles the RuntimeReplaceable expression changes in Spark to ensure proper expression transformation.
gluten-ut/spark41/.../GlutenTimeExpressionsSuite.scala
gluten-ut/spark41/.../shim/GlutenTestsTrait.scala
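The idea of swapping a RuntimeReplaceable expression for its replacement can be shown with a toy expression tree in Python. This is a simplified model, not Spark's Catalyst API and not the code in GlutenTestsTrait: the `Expr` class, its `replacement` field, and the node names used in the usage note are all hypothetical, chosen only to show the recursive rewrite.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Expr:
    """Toy expression node; `replacement` is set only for 'runtime replaceable' nodes."""

    name: str
    children: Tuple["Expr", ...] = ()
    replacement: Optional["Expr"] = None


def replace_runtime_replaceable(expr: Expr) -> Expr:
    """Return a copy of the tree in which every replaceable node is swapped out."""
    # Unwrap repeatedly, since a replacement may itself be replaceable.
    while expr.replacement is not None:
        expr = expr.replacement
    # Recurse so that replaceable nodes nested deeper in the tree are handled too.
    new_children = tuple(replace_runtime_replaceable(c) for c in expr.children)
    return Expr(expr.name, new_children)
```

For instance, a tree `Add(SugarFn(x), x)` in which `SugarFn` carries the replacement `CoreFn(x)` is rewritten to `Add(CoreFn(x), x)`, so later transformation steps only ever see the replacement expression.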
How was this patch tested?
Run new tests.