Skip to content

Release semaphore immediately after startup process exit #6463

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

ic4y
Copy link
Contributor

@ic4y ic4y commented Jun 13, 2024

🔍 Description

Issue References 🔗

The concurrency limit for the engine startup process is mainly used to avoid overload on the machine(or container) of the Kyuubi server, the current implementation holds startupProcessSemaphore until the session is established successfully. While for Spark on YARN cluster mode, some YARN queue resource insufficiency may block the subsequent Spark application submissions to other queues, significantly affecting the Kyuubi server's resource utilization.

Describe Your Solution 🔧

We should immediately release the startupProcessSemaphore after the engine startup process exits (i.e., after the spark-submit process exits) as the load has already disappeared.

Types of changes 🔖

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Test Plan 🧪

I tested it on a cluster of 50 kyuubi Servers, and kyuubi server resource utilization increased by 70%


Checklist 📝

Be nice. Be informative.

@pan3793 pan3793 changed the title [Improve][EngineRef] Optimize Engine Startup Concurrency Limit Release semaphore immediately after startup process exit Jun 13, 2024
@pan3793
Copy link
Member

pan3793 commented Jun 13, 2024

The idea makes sense, and a similar mechanism was applied to batch submission too #6028

@pan3793 pan3793 requested review from wForget and cxzl25 June 13, 2024 03:37
@codecov-commenter
Copy link

codecov-commenter commented Jun 13, 2024

Codecov Report

Attention: Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (7011d90) to head (f7de68c).

Files Patch % Lines
...ain/scala/org/apache/kyuubi/engine/EngineRef.scala 0.00% 3 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff           @@
##           master   #6463   +/-   ##
======================================
  Coverage    0.00%   0.00%           
======================================
  Files         675     675           
  Lines       41641   41643    +2     
  Branches     5685    5685           
======================================
- Misses      41641   41643    +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@cxzl25 cxzl25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@wForget wForget left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM

@pan3793 pan3793 closed this in 95ed748 Jun 13, 2024
pan3793 pushed a commit that referenced this pull request Jun 13, 2024
# 🔍 Description
## Issue References 🔗

The concurrency limit for the engine startup process is mainly used to avoid overload on the machine(or container) of the Kyuubi server, the current implementation holds startupProcessSemaphore until the session is established successfully. While for Spark on YARN cluster mode, some YARN queue resource insufficiency may block the subsequent Spark application submissions to other queues, significantly affecting the Kyuubi server's resource utilization.

## Describe Your Solution 🔧

We should immediately release the `startupProcessSemaphore` after the engine startup process exits (i.e., after the `spark-submit` process exits) as the load has already disappeared.

## Types of changes 🔖

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

I tested it on a cluster of 50 kyuubi Servers, and kyuubi server resource utilization increased by 70%

---

# Checklist 📝

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6463 from ic4y/master-p003.

Closes #6463

f7de68c [ic4y] Improve code quality
d8b0248 [ic4y] [Improve][EngineRef] Optimize Engine Startup Concurrency Limit

Authored-by: ic4y <ic4y@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
(cherry picked from commit 95ed748)
Signed-off-by: Cheng Pan <chengpan@apache.org>
pan3793 pushed a commit that referenced this pull request Jun 13, 2024
The concurrency limit for the engine startup process is mainly used to avoid overload on the machine(or container) of the Kyuubi server, the current implementation holds startupProcessSemaphore until the session is established successfully. While for Spark on YARN cluster mode, some YARN queue resource insufficiency may block the subsequent Spark application submissions to other queues, significantly affecting the Kyuubi server's resource utilization.

We should immediately release the `startupProcessSemaphore` after the engine startup process exits (i.e., after the `spark-submit` process exits) as the load has already disappeared.

- [x] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

I tested it on a cluster of 50 kyuubi Servers, and kyuubi server resource utilization increased by 70%

---

- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6463 from ic4y/master-p003.

Closes #6463

f7de68c [ic4y] Improve code quality
d8b0248 [ic4y] [Improve][EngineRef] Optimize Engine Startup Concurrency Limit

Authored-by: ic4y <ic4y@apache.org>
Signed-off-by: Cheng Pan <chengpan@apache.org>
(cherry picked from commit 95ed748)
Signed-off-by: Cheng Pan <chengpan@apache.org>
@pan3793 pan3793 assigned pan3793 and ic4y and unassigned pan3793 Jun 13, 2024
@pan3793 pan3793 added this to the v1.8.3 milestone Jun 13, 2024
@pan3793
Copy link
Member

pan3793 commented Jun 13, 2024

Thanks, merged to master/1.9.2/1.8.3

@pan3793
Copy link
Member

pan3793 commented Jun 13, 2024

@ic4y you may want to link the email Authored-by: ic4y <ic4y@apache.org> to your GitHub account, otherwise your contributions won't be counted by GitHub

You can check 95ed748 after updating the GitHub profile

@ic4y
Copy link
Contributor Author

ic4y commented Jun 14, 2024

@ic4y you may want to link the email Authored-by: ic4y <ic4y@apache.org> to your GitHub account, otherwise your contributions won't be counted by GitHub

You can check 95ed748 after updating the GitHub profile

I have added my email, thank you for reminding me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants