Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[KYUUBI #6003] Allow disabling user impersonation on launching engine #6003

Closed
wants to merge 7 commits into from

Conversation

pan3793
Copy link
Member

@pan3793 pan3793 commented Jan 22, 2024

🔍 Description

Issue References 🔗

HiveServer2 has a configuration hive.server2.enable.doAs to control the execution user between the session user and the server user, Kyuubi's CONNECTION and USER share levels always perform like doAs enabled do. In CDH 5/6, this is disabled by default, users who want to migrate from CDH to Kyuubi may encounter permission issues with the current implementation.

Describe Your Solution 🔧

This pull request introduces a new configuration kyuubi.engine.doAs.enabled to allow enable/disable user impersonation on launching engine. For security purpose, it's not allowed to be overridden by session conf.

The change in this PR has certain limitations:

  • only supports Spark engine
  • only supports interactive mode, specifically, it does not take effect on Spark batch mode now.

Types of changes 🔖

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Test Plan 🧪

The first step is passing all existing UTs when kyuubi.engine.doAs.enabled=true.

Tested on internal Kerberized-environment, when kyuubi.engine.share.level=CONNECTION and kyuubi.engine.doAs.enabled=false, use user 'spark' to launch engine, and the engine submitted without --proxy-user spark, thus engine launched by server user hive, then run select session_user(), current_user() and returns

+-----------------+-----------------+
| session_user()  | current_user()  |
+-----------------+-----------------+
| spark           | hive            |
+-----------------+-----------------+

And I checked the spark.app.name and registered path on Zookeeper also expected.

+-----------------+--------------------------------------------------------------------------+
|       key       |                       value                                              |
+-----------------+--------------------------------------------------------------------------+
| spark.app.name  | kyuubi_USER_SPARK_SQL_spark_default_51a416e5-6023-4bac-a964-cd9605f17c61 |
+-----------------+--------------------------------------------------------------------------+

Checklist 📝

Be nice. Be informative.

@@ -2087,20 +2087,33 @@ object KyuubiConf {
.version("1.5.0")
.fallbackConf(ENGINE_CONNECTION_URL_USE_HOSTNAME)

val ENGINE_DO_AS_ENABLED: ConfigEntry[Boolean] =
buildConf("kyuubi.engine.doAs.enabled")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @yaooqinn @zhouyifan279 WDYT of this configuration name? My another candidate is kyuubi.engine.impersonation.enabled

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for kyuubi.engine.doAs.enabled.

@pan3793 pan3793 changed the title Allow disable user impersonation on launching engine Allow disabling user impersonation on launching engine Jan 22, 2024
@codecov-commenter
Copy link

codecov-commenter commented Jan 22, 2024

Codecov Report

Attention: 34 lines in your changes are missing coverage. Please review.

Comparison is base (d474768) 61.10% compared to head (add20fd) 61.09%.

❗ Current head add20fd differs from pull request most recent head c4002fe. Consider uploading reports for the commit c4002fe to get more accurate results

Files Patch % Lines
...ain/scala/org/apache/kyuubi/engine/EngineRef.scala 34.61% 11 Missing and 6 partials ⚠️
...ache/kyuubi/engine/spark/SparkProcessBuilder.scala 28.57% 5 Missing and 5 partials ⚠️
...apache/kyuubi/engine/chat/ChatProcessBuilder.scala 0.00% 3 Missing ⚠️
...apache/kyuubi/engine/hive/HiveProcessBuilder.scala 60.00% 2 Missing ⚠️
...yuubi/engine/hive/HiveYarnModeProcessBuilder.scala 50.00% 1 Missing ⚠️
.../org/apache/kyuubi/session/KyuubiSessionImpl.scala 75.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #6003      +/-   ##
============================================
- Coverage     61.10%   61.09%   -0.02%     
  Complexity       23       23              
============================================
  Files           623      623              
  Lines         37103    37136      +33     
  Branches       5029     5031       +2     
============================================
+ Hits          22672    22687      +15     
- Misses        11981    12001      +20     
+ Partials       2450     2448       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -97,6 +98,8 @@ private[kyuubi] class EngineRef(
case _ => user
}

private[kyuubi] val effectiveUser: String = if (doAsEnabled) appUser else Utils.currentUser
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are three user variables in this class now,

  • user: session user
  • appUser: for engine routing
  • effectiveUser: for engine launching

those name are not much straightforward, maybe I need to rename them to

  • user => sessionUser
  • appUser => routingUser
  • effectiveUser => appUser

thought?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. It's better to let appUser be consistent with the user on YARN.

s"`kyuubi.engine.share.level`, different users will be used to launch the engine. " +
"Otherwise, Kyuubi Server's user will always be used to launch the engine.")
.version("1.8.1")
.booleanConf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add this to 1.9.0 instead of a bugfix version

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay

@pan3793 pan3793 changed the title Allow disabling user impersonation on launching engine [KYUUBI #6003] Allow disabling user impersonation on launching engine Jan 22, 2024
@@ -2087,20 +2087,33 @@ object KyuubiConf {
.version("1.5.0")
.fallbackConf(ENGINE_CONNECTION_URL_USE_HOSTNAME)

val ENGINE_DO_AS_ENABLED: ConfigEntry[Boolean] =
buildConf("kyuubi.engine.doAs.enabled")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for kyuubi.engine.doAs.enabled.

warn(s"The session proxy user: $proxyUser is not same with " +
s"spark principal: ${ugi.getShortUserName}, so we can't support use keytab. " +
s"Fallback to use proxy user.")
None
} else if (!doAsEnabled && ugi.getShortUserName != Utils.currentUser) {
warn(s"The server's user: ${Utils.currentUser} is not same with " +
s"spark principal: ${ugi.getShortUserName}, skip to use keytab.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change skip to use keytab. to Fallback to use server's user. can avoid misunderstanding.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated. I kept both

Copy link
Member

@wForget wForget left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pan3793 pan3793 self-assigned this Jan 29, 2024
@pan3793 pan3793 added this to the v1.9.0 milestone Jan 29, 2024
warn(s"The session proxy user: $proxyUser is not same with " +
s"spark principal: ${ugi.getShortUserName}, so we can't support use keytab. " +
s"Fallback to use proxy user.")
None
} else if (!doAsEnabled && ugi.getShortUserName != Utils.currentUser) {
warn(s"The server's user: ${Utils.currentUser} is not same with " +
s"spark principal: ${ugi.getShortUserName}, skip to use keytab. " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be skip using keytab

@zhouyifan279
Copy link
Contributor

LGTM

@pan3793 pan3793 closed this in 3f993f4 Jan 29, 2024
@pan3793
Copy link
Member Author

pan3793 commented Jan 29, 2024

Thanks, merged to master

zhaohehuhu pushed a commit to zhaohehuhu/incubator-kyuubi that referenced this pull request Feb 5, 2024
…engine

# 🔍 Description
## Issue References 🔗

HiveServer2 has a configuration `hive.server2.enable.doAs` to control the execution user between the session user and the server user, Kyuubi's CONNECTION and USER share levels always perform like doAs enabled do. In CDH 5/6, this is disabled by default, users who want to migrate from CDH to Kyuubi may encounter permission issues with the current implementation.

## Describe Your Solution 🔧

This pull request introduces a new configuration `kyuubi.engine.doAs.enabled` to allow enable/disable user impersonation on launching engine. For security purpose, it's not allowed to be overridden by session conf.

The change in this PR has certain limitations:

- only supports Spark engine
- only supports interactive mode, specifically, it does not take effect on Spark batch mode now.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

The first step is passing all existing UTs when `kyuubi.engine.doAs.enabled=true`.

Tested on internal Kerberized-environment, when `kyuubi.engine.share.level=CONNECTION` and `kyuubi.engine.doAs.enabled=false`, use user 'spark' to launch engine, and the engine submitted without `--proxy-user spark`, thus engine launched by server user `hive`, then run `select session_user(), current_user()` and returns

```
+-----------------+-----------------+
| session_user()  | current_user()  |
+-----------------+-----------------+
| spark           | hive            |
+-----------------+-----------------+
```

And I checked the `spark.app.name` and registered path on Zookeeper also expected.
```
+-----------------+--------------------------------------------------------------------------+
|       key       |                       value                                              |
+-----------------+--------------------------------------------------------------------------+
| spark.app.name  | kyuubi_USER_SPARK_SQL_spark_default_51a416e5-6023-4bac-a964-cd9605f17c61 |
+-----------------+--------------------------------------------------------------------------+
```

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes apache#6003 from pan3793/doas.

Closes apache#6003

c4002fe [Cheng Pan] grammar
add20fd [Cheng Pan] nit
8711c22 [Cheng Pan] address comment
033a322 [Cheng Pan] 1.9.0
9273b94 [Cheng Pan] fix
a1563e1 [Cheng Pan] HadoopCredentialsManager
e982e23 [Cheng Pan] Allow disable user impersonation on launching engine

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
zhaohehuhu pushed a commit to zhaohehuhu/incubator-kyuubi that referenced this pull request Mar 21, 2024
…engine

# 🔍 Description
## Issue References 🔗

HiveServer2 has a configuration `hive.server2.enable.doAs` to control the execution user between the session user and the server user, Kyuubi's CONNECTION and USER share levels always perform like doAs enabled do. In CDH 5/6, this is disabled by default, users who want to migrate from CDH to Kyuubi may encounter permission issues with the current implementation.

## Describe Your Solution 🔧

This pull request introduces a new configuration `kyuubi.engine.doAs.enabled` to allow enable/disable user impersonation on launching engine. For security purpose, it's not allowed to be overridden by session conf.

The change in this PR has certain limitations:

- only supports Spark engine
- only supports interactive mode, specifically, it does not take effect on Spark batch mode now.

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

The first step is passing all existing UTs when `kyuubi.engine.doAs.enabled=true`.

Tested on internal Kerberized-environment, when `kyuubi.engine.share.level=CONNECTION` and `kyuubi.engine.doAs.enabled=false`, use user 'spark' to launch engine, and the engine submitted without `--proxy-user spark`, thus engine launched by server user `hive`, then run `select session_user(), current_user()` and returns

```
+-----------------+-----------------+
| session_user()  | current_user()  |
+-----------------+-----------------+
| spark           | hive            |
+-----------------+-----------------+
```

And I checked the `spark.app.name` and registered path on Zookeeper also expected.
```
+-----------------+--------------------------------------------------------------------------+
|       key       |                       value                                              |
+-----------------+--------------------------------------------------------------------------+
| spark.app.name  | kyuubi_USER_SPARK_SQL_spark_default_51a416e5-6023-4bac-a964-cd9605f17c61 |
+-----------------+--------------------------------------------------------------------------+
```

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes apache#6003 from pan3793/doas.

Closes apache#6003

c4002fe [Cheng Pan] grammar
add20fd [Cheng Pan] nit
8711c22 [Cheng Pan] address comment
033a322 [Cheng Pan] 1.9.0
9273b94 [Cheng Pan] fix
a1563e1 [Cheng Pan] HadoopCredentialsManager
e982e23 [Cheng Pan] Allow disable user impersonation on launching engine

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants