
Conversation

@KuntaiDu (Collaborator) commented Feb 25, 2025

This PR adds a deployment guide for Kubernetes, covering both native Kubernetes and the Helm chart provided by vllm-project/production-stack.
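For context, the Helm-chart path mentioned above boils down to a few commands. A minimal sketch, assuming the repo URL and chart name (vllm/vllm-stack) from the production-stack README; verify against that repo before use:

```bash
# Hedged sketch of the production-stack Helm path; the repo URL, chart name,
# and values file are assumptions based on that project's README.
helm repo add vllm https://vllm-project.github.io/production-stack
helm repo update
helm install vllm vllm/vllm-stack -f my-values.yaml  # my-values.yaml is a placeholder
```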

@github-actions bot commented

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, which runs a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added the documentation Improvements or additions to documentation label Feb 25, 2025
@KuntaiDu KuntaiDu changed the title from "[Documentation] Add deployment guide for Kubernetes" to "[Documentation] Add more deployment guide for Kubernetes deployment" Feb 26, 2025
@rafvasq (Contributor) left a comment

Just offering a few suggestions and flagging some typos I caught 😄

@KuntaiDu KuntaiDu requested a review from rafvasq February 27, 2025 19:54
@KuntaiDu (Collaborator, Author) commented

Thank you for your suggestions @rafvasq, just fixed.

@rafvasq (Contributor) left a comment

lgtm! but you'll need a maintainer to approve it of course.

@terrytangyuan (Contributor) left a comment

There are many different ways to deploy vLLM on K8s, including vanilla K8s, KServe, AIBrix, production stack, etc. This PR makes production stack a first-class citizen. I think this doc should remain as neutral as possible.

@russellb (Member) left a comment

This needs to be reconciled with the existing helm instructions for using the helm chart included in the vllm repo: https://docs.vllm.ai/en/latest/deployment/frameworks/helm.html

@mgoin (Member) previously requested changes Feb 28, 2025 and left a comment

Nice doc and resources to make this easier for new users! Would you be open to submitting this as a new page for production-stack specifically? I agree vLLM on k8s is a general idea, and considering KServe is listed as a separate page already, I think we should continue to list all options. Keeping this doc "vanilla k8s" seems fair

mergify bot commented Mar 1, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @KuntaiDu.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Mar 1, 2025
@KuntaiDu (Collaborator, Author) commented Mar 1, 2025

> Nice doc and resources to make this easier for new users! Would you be open to submitting this as a new page for production-stack specifically? I agree vLLM on k8s is a general idea, and considering KServe is listed as a separate page already, I think we should continue to list all options. Keeping this doc "vanilla k8s" seems fair

Thanks Michael. Agreed that we should keep this doc vanilla k8s and add hyperlinks to the other projects at the beginning.

@KuntaiDu KuntaiDu requested review from mgoin and russellb March 1, 2025 00:36
@simon-mo simon-mo dismissed mgoin’s stale review March 1, 2025 03:38

Michael's comment addressed.

@simon-mo (Collaborator) commented Mar 1, 2025

Deploying vLLM on Kubernetes is a scalable and efficient way to serve machine learning models. This guide walks you through deploying vLLM using native Kubernetes.

## Prerequisites
NOTE: Please make sure that there is a running Kubernetes cluster with available GPU resources. If you are new to Kubernetes, here is a [guide](https://github.com/vllm-project/production-stack/blob/main/tutorials/00-install-kubernetes-env.md) that helps you prepare the Kubernetes environment.
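To make the quoted guide concrete, here is a minimal sketch of the native-Kubernetes path. The Deployment name, demo model, and image tag are illustrative assumptions, not the PR's exact manifest:

```bash
# Minimal sketch (not the PR's manifest): run the vLLM OpenAI-compatible
# server as a Deployment. All names and the model choice are illustrative.
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-server            # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-server
  template:
    metadata:
      labels:
        app: vllm-server
    spec:
      containers:
      - name: vllm
        image: vllm/vllm-openai:latest
        args: ["--model", "facebook/opt-125m"]   # small, ungated demo model
        ports:
        - containerPort: 8000                    # vLLM's default API port
        resources:
          limits:
            nvidia.com/gpu: 1                    # requires the NVIDIA device plugin
EOF
```

Once the pod is Running, `kubectl port-forward deployment/vllm-server 8000:8000` exposes the OpenAI-compatible API locally.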
Contributor

I don't think we should link to production stack here

@KuntaiDu (Collaborator, Author) Mar 1, 2025

Official k8s docs give people too many choices and can be confusing, so this one is probably better. If you find a better installation guide, feel free to contribute!

Contributor

Maybe link to Kind instead? It's pretty straightforward to set up https://kind.sigs.k8s.io/
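For reference, a throwaway Kind cluster is two commands, though note that a stock Kind cluster does not expose GPUs without extra host-level setup:

```bash
# Sketch: spin up a local test cluster with Kind. The cluster name is
# arbitrary; GPU access would need additional configuration beyond this.
kind create cluster --name vllm-test
kubectl cluster-info --context kind-vllm-test
```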

Contributor

Doesn't this introduce an additional package? I agree with Terry that we could link to another, cleaner page. Maybe we should actually copy one version of the k8s tutorial into the vLLM documentation later.

Comment on lines 13 to 16
Note that these projects are sorted chronologically.

* [vLLM production-stack](https://github.com/vllm-project/production-stack): Originating from UChicago, vLLM production-stack is a project that incorporates the latest research and community efforts while still delivering production-level stability and performance. Check out the [documentation page](https://docs.vllm.ai/en/latest/deployment/integrations/production-stack.html) for more details and examples.
* [AIBrix](https://github.com/vllm-project/aibrix): Originating from ByteDance, AIBrix is a production-level stack that is Kubernetes-friendly and offers rich features (e.g., LoRA management).
Contributor

We should keep this page more neutral as mentioned above

@KuntaiDu (Collaborator, Author)

Yes. But I don't have much expertise in KServe, AIBrix, and others. I will leave those blank here; feel free to create more PRs.

@Hanchenli (Contributor) Mar 1, 2025

Btw, I just heard from a user of production-stack that there is also a KubeAI repository, which also uses K8s. We could potentially add that as well.

Comment on lines +20 to +22
## Prerequisites

Ensure that you have a running Kubernetes environment with GPUs (you can follow [this tutorial](https://github.com/vllm-project/production-stack/blob/main/tutorials/00-install-kubernetes-env.md) to install a Kubernetes environment on a bare-metal GPU machine).
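A quick way to sanity-check this prerequisite, assuming NVIDIA GPUs exposed through the standard device plugin (a generic kubectl pattern, not from the guide itself):

```bash
# Lists each node's allocatable NVIDIA GPUs; "<none>" means no GPU is exposed.
kubectl get nodes -o custom-columns='NODE:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu'
```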
Contributor

Similar comment on this one. We could just point to K8s docs directly instead of production stack docs

@KuntaiDu (Collaborator, Author) Mar 1, 2025

Official k8s docs give people too many choices and can be confusing, so this one is probably better. If you find a better installation guide, feel free to contribute!

@Hanchenli (Contributor) commented

Thanks a lot @KuntaiDu! Production-stack greatly appreciates the effort to add our content here. This looks good to us, though we suggest adding the EKS/GKE tutorials and future Terraform deployment links to this documentation! We will submit more PRs for these in the future!

@KuntaiDu (Collaborator, Author) commented Mar 1, 2025

Given that I have only tried production-stack and have limited experience with other frameworks such as KServe, I am open to PRs from people with more expertise that add other projects. Feel free to propose PRs! @terrytangyuan @mgoin @russellb @simon-mo

@terrytangyuan (Contributor) commented

Maybe we can just link to this page instead of mentioning the variety of frameworks/integrations out there? https://docs.vllm.ai/en/latest/deployment/integrations/index.html

@KuntaiDu KuntaiDu enabled auto-merge (squash) March 1, 2025 06:05
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Mar 1, 2025
@KuntaiDu KuntaiDu merged commit 8994dab into vllm-project:main Mar 1, 2025
25 of 30 checks passed
@KuntaiDu KuntaiDu deleted the kuntai-add-k8s-doc branch March 1, 2025 07:01
@terrytangyuan (Contributor) commented

It seems like some comments were not addressed, so I submitted a follow-up PR: #14084

Akshat-Tripathi pushed a commit to krai/vllm that referenced this pull request Mar 3, 2025
lulmer pushed a commit to lulmer/vllm that referenced this pull request Apr 7, 2025
shreyankg pushed a commit to shreyankg/vllm that referenced this pull request May 3, 2025
