[SPARK-30183][SQL] Disallow to specify reserved properties in CREATE/ALTER NAMESPACE syntax #26806
|
cc @cloud-fan, I think this is a separate issue from |
|
The problem is, we don't have a specific clause to alter COMMENT, but I do agree that it's weird to write both COMMENT and the "comment" property in CREATE TABLE. We should either treat comment specially and allow setting it via properties, or create a special clause for altering the comment. What do Presto and RDBMSs do? |
|
Presto has no comment as a database attribute: https://prestodb.io/docs/current/sql/create-schema.html. MySQL allows comments only for tables and columns. |
|
Test build #115022 has finished for PR 26806 at commit
|
|
presto also has the COMMENT ON command: https://prestosql.io/docs/current/sql/comment.html Can we add it first? Then it's safe to say that these properties are reserved. |
|
OK, I'll add that first |
|
Hi @cloud-fan, do we need an umbrella for the |
|
I think we only need to support TABLE and NAMESPACE |
|
How about COLUMN / VIEW? |
|
We have SQL syntax to set the comment of a column, and views are not supported by v2 yet. For the near future, we only need to support table and namespace. |
|
I got it. |
|
Applied the new resolving framework to |
|
Test build #116092 has finished for PR 26806 at commit
|
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
| case class CreateNamespace( | ||
| catalog: SupportsNamespaces, | ||
| namespace: Seq[String], | ||
| child: LogicalPlan, |
can we delay this change? We may want to create the rule to convert to v1 commands when migrating to the new framework. Let's focus on disallowing reserved properties in this PR.
OK
| Option(ctx.comment).map(string).map { | ||
| properties += SupportsNamespaces.PROP_COMMENT -> _ | ||
|
|
||
| if (properties.keySet.intersect(RESERVED_PROPERTIES.asScala.toSet).nonEmpty) { |
nit:
properties.keySet.exists(RESERVED_PROPERTIES.contains)
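To illustrate the nit, here is a minimal sketch comparing the two forms. It assumes `RESERVED_PROPERTIES` is a Java list of the two keys discussed in this PR; the object name and the stand-in list are hypothetical and are not the Spark code itself.

```scala
import java.util.{Arrays, List => JList}
import scala.collection.JavaConverters._

object ReservedKeyCheck {
  // Hypothetical stand-in for SupportsNamespaces.RESERVED_PROPERTIES (assumed to be a Java list).
  val RESERVED_PROPERTIES: JList[String] = Arrays.asList("comment", "location")

  // Form in the diff: convert to a Scala set, intersect, then test for non-emptiness.
  def hasReservedViaIntersect(properties: Map[String, String]): Boolean =
    properties.keySet.intersect(RESERVED_PROPERTIES.asScala.toSet).nonEmpty

  // Suggested form: stops at the first reserved key and builds no intermediate set.
  def hasReservedViaExists(properties: Map[String, String]): Boolean =
    properties.keySet.exists(RESERVED_PROPERTIES.contains)

  def main(args: Array[String]): Unit = {
    val props = Map("comment" -> "demo", "owner" -> "alice")
    println(hasReservedViaIntersect(props)) // true
    println(hasReservedViaExists(props))    // true
  }
}
```

Both forms give the same answer; the `exists` version also drops the need for the `asScala` conversion at this call site.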
| import org.apache.spark.sql.catalyst.plans.logical._ | ||
| import org.apache.spark.sql.catalyst.rules.Rule | ||
| import org.apache.spark.sql.connector.catalog.{CatalogManager, CatalogPlugin, LookupCatalog, SupportsNamespaces, Table, TableCatalog, TableChange, V1Table} | ||
| import org.apache.spark.sql.connector.catalog.CatalogV2Util.isSessionCatalog |
unnecessary change
|
Test build #116123 has finished for PR 26806 at commit
|
|
Test build #116129 has finished for PR 26806 at commit
|
|
retest this please |
|
Test build #116142 has finished for PR 26806 at commit
|
docs/sql-migration-guide.md
|
|
||
| - Since Spark 3.0, the function `percentile_approx` and its alias `approx_percentile` only accept integral value with range in `[1, 2147483647]` as its 3rd argument `accuracy`, fractional and string types are disallowed, e.g. `percentile_approx(10.0, 0.2, 1.8D)` will cause `AnalysisException`. In Spark version 2.4 and earlier, if `accuracy` is fractional or string value, it will be coerced to an int value, `percentile_approx(10.0, 0.2, 1.8D)` is operated as `percentile_approx(10.0, 0.2, 1)` which results in `10.0`. | ||
|
|
||
| - Since Spark 3.0, the namespace properties `location` and `comment` become reserved, it will fail with `ParseException` if we use them as members of `DBPROPERTIES` in `CREATE NAMESPACE` and `ALTER NAMESPACE ... SET PROPERTIES(...)`. We need their specific clauses to specify them, e.g. `CREATE NAMESPACE a.b.c COMMENT 'any comment' LOCATION 'some path'`. We can set `spark.sql.legacy.property.nonReserved` to `true` to ignore the `ParseException`, but notice that in this case, these properties will produce side effects, e.g. `SET DBPROPERTIES('location'='/tmp')` might change the location of the database. In Spark version 2.4 and earlier, these properties are neither reserved nor have side effects, e.g. `SET DBPROPERTIES('location'='/tmp')` will not change the location of the database but create a headless property just like `'a'='b'`. |
Since it's the migration guide, I think it's better to use the syntax of Spark 2.4
it will fail with `ParseException` if we use them in `CREATE DATABASE ... DBPROPERTIES` and `ALTER DATABASE ... SET DBPROPERTIES` ...
|
Test build #116224 has finished for PR 26806 at commit
|
docs/sql-migration-guide.md
|
|
||
| - Since Spark 3.0, the function `percentile_approx` and its alias `approx_percentile` only accept integral value with range in `[1, 2147483647]` as its 3rd argument `accuracy`, fractional and string types are disallowed, e.g. `percentile_approx(10.0, 0.2, 1.8D)` will cause `AnalysisException`. In Spark version 2.4 and earlier, if `accuracy` is fractional or string value, it will be coerced to an int value, `percentile_approx(10.0, 0.2, 1.8D)` is operated as `percentile_approx(10.0, 0.2, 1)` which results in `10.0`. | ||
|
|
||
| - Since Spark 3.0, the words `location` and `comment` become reserved database properties, it will fail with `ParseException` if we use them as members of `DBPROPERTIES` in `CREATE DATABASE` and `ALTER DATABASE ... SET PROPERTIES(...)`. We need their specific clauses to specify them, e.g. `CREATE DATABASE test COMMENT 'any comment' LOCATION 'some path'`. We can set `spark.sql.legacy.property.nonReserved` to `true` to ignore the `ParseException`, but notice that in this case, these properties will be ignored too, e.g. `SET DBPROPERTIES('location'='/tmp')` will affect nothing. In Spark version 2.4 and earlier, these properties are neither reserved nor have side effects, e.g. `SET DBPROPERTIES('location'='/tmp')` will not change the location of the database but only create a headless property just like `'a'='b'`. |
"notice that in this case, these properties will be ignored too ...": we don't need to highlight this, as it's the same as in 2.4.
Not exactly the same: 2.4 will create a dbproperty, and we will not do that. I'll change this to your suggested wording, as it is in SQLConf.
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
| } | ||
| } | ||
|
|
||
| private def checkNamespaceProperties( |
It's not just a check now; how about `cleanNamespaceProperties`?
| .internal() | ||
| .doc("When true, all database and table properties are not reserved and available for " + | ||
| "create/alter syntaxes. But please be aware that the reserved properties will still be " + | ||
| "used by Spark internally and will ignore their user specified values.") |
the reserved properties will be silently removed.
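The behavior described in this thread (fail on reserved keys by default, silently remove them when the legacy flag is on) can be sketched roughly as follows. This is only an illustration under stated assumptions: `legacyNonReserved` is a hypothetical parameter standing in for `spark.sql.legacy.property.nonReserved`, `IllegalArgumentException` stands in for `ParseException`, and the object name is made up.

```scala
object NamespacePropertyCleaner {
  // Reserved keys as discussed in this PR.
  val ReservedKeys: Set[String] = Set("comment", "location")

  // legacyNonReserved: hypothetical stand-in for spark.sql.legacy.property.nonReserved.
  def cleanNamespaceProperties(
      properties: Map[String, String],
      legacyNonReserved: Boolean): Map[String, String] = {
    val reservedUsed = properties.keySet.intersect(ReservedKeys)
    if (reservedUsed.isEmpty) {
      // Nothing reserved was supplied, so the properties pass through unchanged.
      properties
    } else if (legacyNonReserved) {
      // Legacy mode: no error, but the reserved keys are silently removed.
      properties.filter { case (key, _) => !ReservedKeys.contains(key) }
    } else {
      // Default mode: reject the statement (stand-in for ParseException).
      throw new IllegalArgumentException(
        s"${reservedUsed.mkString(", ")} reserved; use the dedicated clause instead")
    }
  }

  def main(args: Array[String]): Unit = {
    val props = Map("a" -> "b", "location" -> "/tmp")
    println(cleanNamespaceProperties(props, legacyNonReserved = true))
  }
}
```

With the flag on, `Map("a" -> "b", "location" -> "/tmp")` is reduced to `Map("a" -> "b")`; with the flag off, the same input throws.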
|
Test build #116227 has finished for PR 26806 at commit
|
|
Test build #116236 has finished for PR 26806 at commit
|
sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
|
Test build #116249 has finished for PR 26806 at commit
|
|
Test build #116276 has finished for PR 26806 at commit
|
|
Test build #116313 has finished for PR 26806 at commit
|
|
thanks, merging to master! |
What changes were proposed in this pull request?
Currently, COMMENT and LOCATION are reserved properties for Datasource v2 namespaces. They can be set both via their specific clauses and via properties, and the values specified in clauses take precedence over those in properties. Because they are reserved, they should not be accessible directly; they should be set through the COMMENT/LOCATION clauses ONLY.
Why are the changes needed?
To make reserved properties actually reserved.
Does this PR introduce any user-facing change?
Yes, 'location' and 'comment' are no longer allowed in db properties.
How was this patch tested?
Unit tests.