[SPARK-34320][SQL] Migrate ALTER TABLE DROP COLUMNS commands to use UnresolvedTable to resolve the identifier #32854
Conversation
@cloud-fan this is an alternative approach suggested in #32542 (comment). We can migrate the ALTER TABLE commands one by one via this approach. Let me know if you want to see more examples of this for other ALTER TABLE commands.
```scala
Seq(ResolveCommandsWithIfExists) ++
  postHocResolutionRules: _*),
Batch("Normalize Alter Table Commands", Once, ResolveAlterTableCommands),
Batch("Normalize Alter Table", Once, ResolveAlterTableChanges),
```
We can remove `ResolveAlterTableChanges` once all the ALTER TABLE commands are migrated to `ResolveAlterTableCommands`.
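The migration pattern discussed above can be sketched roughly as follows. This is a hedged illustration, not Spark's actual source: `resolveFieldName` is a hypothetical helper, and the class and field names are simplified from the PR.

```scala
// Illustrative sketch only: a post-hoc analyzer rule in the spirit of
// ResolveAlterTableCommands. Once the command's child has been resolved to a
// ResolvedTable, each UnresolvedFieldName can be checked against that table's
// schema, giving all ALTER TABLE commands the same lookup behavior.
object ResolveAlterTableCommandsSketch extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    case a @ AlterTableDropColumns(table: ResolvedTable, cols) if !a.resolved =>
      // resolveFieldName is hypothetical: it would look up the column in
      // table.schema and raise an analysis error if it is missing.
      a.copy(columnsToDrop = cols.map(col => resolveFieldName(table, col)))
  }
}
```

The key point is that the rule only fires after the table identifier has already gone through the analyzer's shared resolution (temp view first, then catalog), which is what wrapping the identifier in `UnresolvedTable` buys.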
Kubernetes integration test starting

Kubernetes integration test status success

Test build #139619 has finished for PR 32854 at commit

Kubernetes integration test starting
```scala
val changes = keys.map(key => TableChange.removeProperty(key))
AlterTableExec(table.catalog, table.identifier, changes) :: Nil

case AlterTableDropColumns(table: ResolvedTable, cols) =>
```
Shall we add a base trait for all ALTER TABLE commands and define a `def changes: Seq[TableChange]`? Then we can handle all ALTER TABLE commands in one place.
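That suggestion could look roughly like the following. This is a hedged sketch, not the code later added to Spark: the trait and field names (`AlterTableCommand`, `columnsToDrop`, `FieldName`) are hypothetical stand-ins, and the `TableChange.deleteColumn` factory is assumed to take the column's name path.

```scala
// Sketch of the suggested refactoring: every ALTER TABLE logical plan
// extends one trait and reports its edits as catalog-agnostic TableChanges.
trait AlterTableCommand extends Command {
  def table: LogicalPlan
  def changes: Seq[TableChange]
}

// Example concrete command (simplified): DROP COLUMNS maps each resolved
// column to a deleteColumn change.
case class AlterTableDropColumns(
    table: LogicalPlan,
    columnsToDrop: Seq[FieldName]) extends AlterTableCommand {
  override def changes: Seq[TableChange] =
    columnsToDrop.map(col => TableChange.deleteColumn(col.path.toArray))
}
```

With this shape, the planner needs only one case (`case a: AlterTableCommand => AlterTableExec(catalog, ident, a.changes) :: Nil`) instead of one case per command.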
Kubernetes integration test status failure

Test build #139651 has finished for PR 32854 at commit

Kubernetes integration test starting

Kubernetes integration test status failure

Test build #140217 has finished for PR 32854 at commit
@imback82 how much work do you think is needed to migrate the other ALTER TABLE commands? The framework is built, and if it's not much work, let's migrate all of them in one PR. We can also do the migration one by one in follow-up PRs if it needs many code changes.
@cloud-fan It will be many code changes if we migrate all the remaining ALTER TABLE commands (4 remaining), since we need to refactor
thanks, merging to master!
…f the error message

### What changes were proposed in this pull request?

This is a followup PR for SPARK-34320 (#32854). That PR changed the error message of `ALTER TABLE` but `V2JDBCTest` didn't comply with the change.

### Why are the changes needed?

To fix the `v2.*JDBCSuite` failures:

```
[info] - SPARK-33034: ALTER TABLE ... add new columns (173 milliseconds)
[info] - SPARK-33034: ALTER TABLE ... drop column *** FAILED *** (126 milliseconds)
[info]   "Cannot delete missing field bad_column in postgresql.alt_table schema: root
[info]    |-- C2: string (nullable = true)
[info]   ; line 1 pos 0;
[info]   'AlterTableDropColumns [unresolvedfieldname(bad_column)]
[info]   +- ResolvedTable org.apache.spark.sql.execution.datasources.v2.jdbc.JDBCTableCatalog7f4b7516, alt_table, JDBCTable(alt_table,StructType(StructField(C2,StringType,true)),org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions5842301d), [C2#1879]
[info]   " did not contain "Cannot delete missing field bad_column in alt_table schema" (V2JDBCTest.scala:106)
[info]   org.scalatest.exceptions.TestFailedException:
[info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
[info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
[info]   at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231)
[info]   at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295)
[info]   at org.apache.spark.sql.jdbc.v2.V2JDBCTest.$anonfun$$init$$6(V2JDBCTest.scala:106)
[info]   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1461)
[info]   at org.apache.spark.sql.test.SQLTestUtilsBase.withTable(SQLTestUtils.scala:305)
[info]   at org.apache.spark.sql.jdbc.DockerJDBCIntegrationSuite.withTable(DockerJDBCIntegrationSuite.scala:95)
[info]   at org.apache.spark.sql.jdbc.v2.V2JDBCTest.$anonfun$$init$$5(V2JDBCTest.scala:95)
[info]   ...
[info] - SPARK-33034: ALTER TABLE ... update column type (122 milliseconds)
[info] - SPARK-33034: ALTER TABLE ... rename column (93 milliseconds)
[info] - SPARK-33034: ALTER TABLE ... update column nullability (92 milliseconds)
[info] - CREATE TABLE with table comment (38 milliseconds)
[info] - CREATE TABLE with table property (52 milliseconds)
[info] MySQLIntegrationSuite:
[info] - Basic test (61 milliseconds)
[info] - Numeric types (67 milliseconds)
[info] - Date types (59 milliseconds)
[info] - String types (50 milliseconds)
[info] - Basic write test (216 milliseconds)
[info] - query JDBC option (64 milliseconds)
[info] Run completed in 19 minutes, 43 seconds.
[info] Total number of tests run: 89
[info] Suites: completed 14, aborted 0
[info] Tests: succeeded 84, failed 5, canceled 0, ignored 0, pending 0
[info] *** 5 TESTS FAILED ***
[error] Failed tests:
[error] 	org.apache.spark.sql.jdbc.v2.OracleIntegrationSuite
[error] 	org.apache.spark.sql.jdbc.v2.MsSqlServerIntegrationSuite
[error] 	org.apache.spark.sql.jdbc.v2.DB2IntegrationSuite
[error] 	org.apache.spark.sql.jdbc.v2.MySQLIntegrationSuite
[error] 	org.apache.spark.sql.jdbc.v2.PostgresIntegrationSuite
[error] (docker-integration-tests / Test / test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 1223 s (20:23), completed Jun 25, 2021 1:31:04 AM
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

docker-integration-tests on GA.

Closes #33074 from sarutak/followup-SPARK-34320.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.com>
### What changes were proposed in this pull request?

This PR proposes to migrate the `ALTER TABLE ... DROP COLUMNS` command to use `UnresolvedTable` as a `child` to resolve the table identifier. This allows consistent resolution rules (temp view first, etc.) to be applied for both v1/v2 commands. More info about the consistent resolution rule proposal can be found in the JIRA or the proposal doc.

### Why are the changes needed?

This is part of the effort to make the relation lookup behavior consistent: SPARK-29900.

### Does this PR introduce _any_ user-facing change?

After this PR, `ALTER TABLE ... DROP COLUMNS` commands will have a consistent resolution behavior.

### How was this patch tested?

Updated existing tests.
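The analysis flow described above can be illustrated roughly as follows. The class names follow the PR, but the exact constructor arguments are simplified and should be treated as an assumption, not Spark's precise API.

```scala
// Hedged illustration of before/after analysis for:
//   ALTER TABLE t DROP COLUMN c
//
// Parsed plan: the identifier is wrapped in UnresolvedTable, so the
// analyzer's shared lookup rules (temp view first, then catalog/table)
// decide what "t" refers to, for v1 and v2 alike:
//
//   AlterTableDropColumns(
//     UnresolvedTable(Seq("t"), "ALTER TABLE ... DROP COLUMNS"),
//     Seq(UnresolvedFieldName(Seq("c"))))
//
// Analyzed plan: UnresolvedTable has been replaced by ResolvedTable, and
// the field name has been validated against the table's schema:
//
//   AlterTableDropColumns(
//     ResolvedTable(catalog, identifier, table, output),
//     Seq(resolvedColumn))
```

Previously, each command did its own identifier lookup, which is why v1 and v2 could disagree on, for example, whether a temp view shadows a table of the same name.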