@cloud-fan
Contributor

What changes were proposed in this pull request?

This is a followup of #26006.

In #26006, we merged the v1 and v2 SHOW DATABASES/NAMESPACES commands, but we missed a resulting behavior change: the output schema of SHOW DATABASES became different.

This PR adds a legacy config to restore the old schema, with a migration guide item to mention this behavior change.

Why are the changes needed?

Improve backward compatibility.

Does this PR introduce any user-facing change?

No (the legacy config is false by default)

How was this patch tested?

A new test.

@cloud-fan
Contributor Author

cloud-fan commented Feb 4, 2021

    namespace: LogicalPlan,
-   pattern: Option[String]) extends Command {
+   pattern: Option[String],
+   override val output: Seq[Attribute] = ShowNamespaces.OUTPUT) extends Command {

@cloud-fan cloud-fan Feb 4, 2021

I think it's better to put the output field in the constructor so that it's more stable: copying the node will not regenerate the output attributes. It also helps if we want to support self-join later.
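The stability argument can be illustrated with a small sketch that does not use Spark at all. `ExprId`, `Attr`, `ShowNamespacesBody`, and `ShowNamespacesCtor` are hypothetical stand-ins invented for this illustration, not Catalyst's real classes; the sketch only assumes that each freshly created attribute receives a new unique ID.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for a unique expression ID generator (not Spark's class).
object ExprId {
  private val counter = new AtomicLong(0)
  def next(): Long = counter.getAndIncrement()
}

// Hypothetical attribute: every new instance gets a fresh ID unless one is passed in.
final case class Attr(name: String, id: Long = ExprId.next())

// Variant A: output is defined in the class body, so every new instance,
// including one produced by copy, allocates fresh attributes.
final case class ShowNamespacesBody(pattern: Option[String]) {
  val output: Seq[Attr] = Seq(Attr("namespace"))
}

// Variant B: output is a constructor parameter with a default value, so copy
// carries the existing attributes over unchanged.
final case class ShowNamespacesCtor(
    pattern: Option[String],
    output: Seq[Attr] = Seq(Attr("namespace")))
```

With variant A, `node.copy(pattern = Some("db*")).output` differs from `node.output` because the IDs are regenerated; with variant B the two are equal, which is the behavior the comment above argues for.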


test("SPARK-34359: keep the legacy output schema") {
  withSQLConf(SQLConf.LEGACY_KEEP_COMMAND_OUTPUT_SCHEMA.key -> "true") {
    assert(sql("SHOW NAMESPACES").schema.fieldNames.toSeq == Seq("databaseName"))
Contributor Author

Usually, people don't rely on the output schema of commands, but some BI tools may rely on it and users are not able to update the BI tools. That's why I think it deserves a legacy config.
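To make the concern concrete, here is a minimal model of a client that looks up a command's result column by name. Everything in it is hypothetical: `Result` and `ShowDatabasesModel` are not Spark APIs, `keepLegacySchema` merely stands in for the legacy config, and the new column name `namespace` is an assumption (only the legacy name `databaseName` appears in the test above).

```scala
// Hypothetical model of a command result: column names plus rows (not Spark's API).
final case class Result(columns: Seq[String], rows: Seq[Seq[String]]) {
  // Returns the values of the named column, or None if no such column exists.
  def column(name: String): Option[Seq[String]] = {
    val i = columns.indexOf(name)
    if (i < 0) None else Some(rows.map(_(i)))
  }
}

object ShowDatabasesModel {
  // keepLegacySchema stands in for the legacy config discussed in this PR.
  // "namespace" as the new column name is an assumption of this sketch.
  def run(databases: Seq[String], keepLegacySchema: Boolean): Result = {
    val col = if (keepLegacySchema) "databaseName" else "namespace"
    Result(Seq(col), databases.map(Seq(_)))
  }
}
```

A client that asks for `column("databaseName")` gets nothing back when the flag is off and the full list of databases when it is on, which is the situation a BI tool with a hard-coded column name would be in.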

Contributor

> Usually, people don't rely on the output schema of commands, but some BI tools may rely on it and users are not able to update the BI tools. That's why I think it deserves a legacy config.

That's an important reason. +1 for this.

Contributor

@imback82 imback82 left a comment

LGTM

@SparkQA
SparkQA commented Feb 4, 2021

Test build #134877 has finished for PR 31474 at commit e1cff3f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon HyukjinKwon changed the title [SPARK-34359][SQL] add a legacy config to restore the output schema of SHOW DATABASES [SPARK-34359][SQL] Add a legacy config to restore the output schema of SHOW DATABASES Feb 5, 2021
Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com>
@cloud-fan
Contributor Author

The last commit just updates the migration guide, so I'm merging this to master. Thanks all for the review! I'll backport it to 3.1/3.0 soon.

@cloud-fan cloud-fan closed this in 361d702 Feb 5, 2021
cloud-fan added a commit to cloud-fan/spark that referenced this pull request Feb 5, 2021
…f SHOW DATABASES

Closes apache#31474 from cloud-fan/command-schema.

Lead-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Co-authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan added a commit that referenced this pull request Feb 5, 2021
…ema of SHOW DATABASES

This backports #31474 to 3.1/3.0

Closes #31486 from cloud-fan/command-schema.

Authored-by: Wenchen Fan <cloud0fan@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan added a commit that referenced this pull request Feb 5, 2021
…ema of SHOW DATABASES

This backports #31474 to 3.1/3.0

Closes #31486 from cloud-fan/command-schema.

Authored-by: Wenchen Fan <cloud0fan@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 7c87b48)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>