[KYUUBI #4218] Prefer to use TableCatalog to load tables from session catalog #4219
Conversation
```diff
@@ -141,7 +141,7 @@ class CatalogShim_v2_4 extends SparkCatalogShim {
     catalog.getTablesByName(identifiers).flatMap { t =>
       val tableSchema =
         if (t.provider.getOrElse("").equalsIgnoreCase("delta")) {
-          spark.table(t.identifier.table).schema
+          spark.table(f"${db}.${t.identifier.table}").schema
```
Can we use `t.identifier.toString` here?

Thanks for investigating and making this PR. Could you please verify the case when the db or table name contains dots?
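The concern about dots is real: a plain `f"${db}.${t.identifier.table}"` interpolation is ambiguous when either part contains a dot, whereas `TableIdentifier.toString` backtick-quotes each part. The following is a minimal standalone sketch (not Spark's actual implementation; `TableIdent`, `unquoted`, and `quoted` are hypothetical names used for illustration) of why quoting resolves the ambiguity:

```scala
// Sketch, NOT Spark internals: shows why naive interpolation collides for
// identifiers containing dots, while per-part backtick quoting (analogous to
// what TableIdentifier.toString produces) stays unambiguous.
final case class TableIdent(db: String, table: String) {
  // Naive interpolation: ("a.b", "t") and ("a", "b.t") both yield "a.b.t".
  def unquoted: String = s"$db.$table"
  // Quote each part, escaping embedded backticks by doubling them.
  private def quote(part: String): String = "`" + part.replace("`", "``") + "`"
  def quoted: String = s"${quote(db)}.${quote(table)}"
}

val x = TableIdent("a.b", "t")
val y = TableIdent("a", "b.t")
println(x.unquoted) // a.b.t
println(y.unquoted) // a.b.t -- collision: the parser cannot tell them apart
println(x.quoted)   // `a.b`.`t`
println(y.quoted)   // `a`.`b.t` -- distinct, unambiguous
```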
After second thought, I think we can totally remove the current workaround (revert #1476), and just invert the pattern-match branch order:

```diff
- catalog match {
-   case builtin if builtin.name() == SESSION_CATALOG =>
-     super.getColumnsByCatalog(...)
-   case tc: TableCatalog =>
-     ...
- }
+ catalog match {
+   case tc: TableCatalog =>
+     ...
+   case builtin if builtin.name() == SESSION_CATALOG =>
+     super.getColumnsByCatalog(...)
+ }
```

Would you please try this way? @nousot-cloud-guy
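The reordering works because Scala evaluates `match` cases top to bottom: if the session catalog object also implements `TableCatalog` (as Spark's V2SessionCatalog does), putting the `TableCatalog` case first routes it through that path instead of the builtin branch. A self-contained sketch with stand-in types (these traits and the `route` helper are illustrative, not Kyuubi's real classes):

```scala
// Stand-ins for Spark's catalog plugin interfaces, for illustration only.
trait CatalogPlugin { def name(): String }
trait TableCatalog extends CatalogPlugin

val SESSION_CATALOG = "spark_catalog"

// Hypothetical session catalog that, like Spark's V2SessionCatalog,
// also implements TableCatalog.
object SessionCatalog extends TableCatalog {
  def name(): String = SESSION_CATALOG
}

def route(catalog: CatalogPlugin): String = catalog match {
  // With this case first, even the session catalog takes the TableCatalog
  // path; with the branches swapped, it would hit the builtin branch instead.
  case _: TableCatalog => "tableCatalog"
  case builtin if builtin.name() == SESSION_CATALOG => "builtin"
}

println(route(SessionCatalog)) // tableCatalog
```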
Codecov Report
```
@@             Coverage Diff              @@
##             master    #4219      +/-   ##
============================================
+ Coverage     53.07%   53.51%   +0.43%
  Complexity       13       13
============================================
  Files           548      560      +12
  Lines         30010    30410     +400
  Branches       4035     4082      +47
============================================
+ Hits          15929    16274     +345
- Misses        12599    12634      +35
- Partials       1482     1502      +20
```
Suggested change:

```diff
-          spark.table(f"${db}.${t.identifier.table}").schema
+          spark.table(t.identifier.toString).schema
```
That seems to be working. I've updated the PR with the reverts and new commit.
### _Why are the changes needed?_

To close #4218. This change ensures BI tools can list columns on Delta Lake tables in all schemas.

<img width="312" alt="image" src="https://user-images.githubusercontent.com/89149767/215793967-722eb5f9-ffe4-4ffb-b7f9-1ded06c146d7.png">
<img width="725" alt="image" src="https://user-images.githubusercontent.com/89149767/215794036-871f005f-1494-487d-90aa-1f99891177c2.png">

### _How was this patch tested?_

- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [x] Add screenshots for manual tests if appropriate
- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/develop_tools/testing.html#running-tests) locally before making a pull request

Closes #4219 from nousot-cloud-guy/feature/delta-db-schema.

Closes #4218

5698432 [Alex Wiss-Wolferding] Reversing match order in getColumnsByCatalog.
a6d973a [Alex Wiss-Wolferding] Revert "[KYUUBI #1458] Delta lake table columns won't show up in DBeaver."
20337dc [Alex Wiss-Wolferding] Revert "Using DB and table name when checking Delta table schema."
f7e4675 [Alex Wiss-Wolferding] Using DB and table name when checking Delta table schema.

Authored-by: Alex Wiss-Wolferding <alex@nousot.com>
Signed-off-by: Cheng Pan <chengpan@apache.org>
(cherry picked from commit 2b958c6)
Signed-off-by: Cheng Pan <chengpan@apache.org>
Merged to master/1.6, thanks @nousot-cloud-guy