[Improvement]: Support terminal access hive table when switch to a catalog with HMS metastore . #1266

Closed
1 task done
li36909 opened this issue Mar 23, 2023 · 3 comments · Fixed by #1264

Comments

@li36909
Contributor

li36909 commented Mar 23, 2023

Search before asking

  • I have searched in the issues and found no similar issues.

What would you like to be improved?

Use ArcticSparkSessionCatalog for the terminal when the user selects a Hive-type catalog, so that SQL run in the terminal can also query the original Hive tables.

How should we improve?

There are a few issues to resolve here:

  1. The Spark session catalog must be named 'spark_catalog', but the terminal may need to support multiple Hive catalogs.
  2. In a SQL script, the user may write the SQL with the catalog name.

To resolve these two issues:
1. When setting the Spark conf before starting a terminal session, we can replace the catalog name with spark_catalog.
2. We need to prevent the user from including the name of the current catalog in the SQL script.
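
The two steps above could be sketched roughly as follows. This is only an illustration, not the code from the linked PR: the class `SessionCatalogConfSketch` and both method names are hypothetical, and the only external convention assumed is Spark's `spark.sql.catalog.<name>` key layout for catalog plugins. The check for catalog-qualified names is a naive regular expression that does not understand string literals, comments, or quoted identifiers.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Arctic code base.
class SessionCatalogConfSketch {
  static final String SESSION_CATALOG = "spark_catalog";
  static final String PREFIX = "spark.sql.catalog.";

  // Re-key "spark.sql.catalog.<catalogName>[.option]" entries to
  // "spark.sql.catalog.spark_catalog[.option]"; all other entries are copied unchanged.
  static Map<String, String> renameToSessionCatalog(Map<String, String> conf, String catalogName) {
    String oldKey = PREFIX + catalogName;
    String newKey = PREFIX + SESSION_CATALOG;
    Map<String, String> result = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : conf.entrySet()) {
      String key = e.getKey();
      if (key.equals(oldKey)) {
        result.put(newKey, e.getValue());
      } else if (key.startsWith(oldKey + ".")) {
        result.put(newKey + key.substring(oldKey.length()), e.getValue());
      } else {
        result.put(key, e.getValue());
      }
    }
    return result;
  }

  // Naive check: does the script contain "<catalogName>." as the first part of a name?
  // A match preceded by a word character or a dot is ignored, so "db.<catalogName>.x"
  // is not reported.
  static boolean referencesCatalog(String sql, String catalogName) {
    Pattern p = Pattern.compile("(?i)(?<![\\w.])" + Pattern.quote(catalogName) + "\\s*\\.");
    return p.matcher(sql).find();
  }
}
```

A terminal could call `referencesCatalog` before submitting a script and reject it with a clear message, rather than letting Spark fail on an unknown catalog name.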

Are you willing to submit PR?
Yes I am willing to submit a PR!

Subtasks
No response

Code of Conduct
I agree to follow this project's Code of Conduct

@baiyangtx
Contributor

I suggest changing the title to

Support terminal access hive table when switch to a catalog with HMS metastore .

Because that is the real problem that we want to solve, and using ArcticSparkSessionCatalog is a feasible way to do this.

@li36909 li36909 changed the title [Improvement]: support terminal using ArcticSparkSessionCatalog for sparkSql backend and kyuubi backend [Improvement]: support query with hive table on terminal Mar 27, 2023
@baiyangtx baiyangtx linked a pull request Mar 30, 2023 that will close this issue
3 tasks
@baiyangtx
Contributor

I have thought about this feature; it is not easy to solve in a multi-catalog situation. The AMS creates the Spark context in its own process, and the Spark engine can only create one Spark context per process, so it is hard to create two different catalogs that both use ArcticSparkSessionCatalog.

On the other hand, 90% of users have only one catalog, so I think it is worthwhile to support this feature for the single-catalog case.

I suggest solving this issue in the following way.

  1. Add a property to the terminal configs to force the use of ArcticSparkSessionCatalog for Hive:

ams.terminal.backend=local
ams.terminal.local.using-session-catalog-for-hive=true

or

ams.terminal.backend=kyuubi
ams.terminal.kyuubi.using-session-catalog-for-hive=true

  2. The TerminalManager already passes catalog.type to the TerminalSessionFactory, so we can replace ArcticSparkCatalog in the implementation class. Both KyuubiTerminalSessionFactory and LocalTerminalSessionFactory use SparkContextUtil.getSparkConf to generate the Spark configs; replace the catalog name and implementation class there.

  3. In TerminalSession, ignore use {catalog} if the current catalog is spark_catalog.
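
As a rough sketch of how the three suggested steps fit together (an illustration under stated assumptions, not the actual AMS implementation): the class `TerminalCatalogSketch` and its methods are hypothetical, the boolean parameter stands for the value of the proposed `using-session-catalog-for-hive` property, and the catalog type string `"hive"` is assumed to be the value carried by `catalog.type`.

```java
// Hypothetical helper for illustration only; not part of the Arctic code base.
class TerminalCatalogSketch {
  static final String SESSION_CATALOG = "spark_catalog";

  // Steps 1 and 2: pick the Spark catalog name to register. A Hive-type catalog is
  // registered as "spark_catalog" only when the proposed property is enabled.
  static String sparkCatalogName(String catalogName, String catalogType,
                                 boolean usingSessionCatalogForHive) {
    boolean isHive = "hive".equalsIgnoreCase(catalogType); // assumed type value
    return (isHive && usingSessionCatalogForHive) ? SESSION_CATALOG : catalogName;
  }

  // Step 3: decide whether a "use <catalogName>" statement should be skipped because
  // the session already runs against the session catalog.
  static boolean isIgnoredUse(String statement, String catalogName, String sparkCatalogName) {
    if (!SESSION_CATALOG.equals(sparkCatalogName)) {
      return false;
    }
    String s = statement.trim();
    if (s.endsWith(";")) {
      s = s.substring(0, s.length() - 1).trim();
    }
    String[] parts = s.split("\\s+");
    return parts.length == 2
        && parts[0].equalsIgnoreCase("use")
        && parts[1].equalsIgnoreCase(catalogName);
  }
}
```

Comparing against the AMS catalog name keeps `use <database>` statements working; only the catalog switch itself is dropped.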

@li36909 li36909 changed the title [Improvement]: support query with hive table on terminal [Improvement]: Support terminal access hive table when switch to a catalog with HMS metastore . Mar 31, 2023
@li36909
Contributor Author

li36909 commented Mar 31, 2023

Thanks for your suggestions. I will try to implement it as you described.
