[Documentation] Add more deployment guide for Kubernetes deployment #13841
Signed-off-by: KuntaiDu <kuntai@uchicago.edu>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Just offering a few suggestions and typos I caught 😄
Thank you for your suggestion @rafvasq, just fixed.
lgtm! but you'll need a maintainer to approve it of course.
There are many different ways to deploy vLLM on K8s, including vanilla K8s, KServe, AIBrix, production stack, etc. This PR makes production stack a first-class citizen. I think this doc should remain as neutral as possible.
This needs to be reconciled with the existing helm instructions for using the helm chart included in the vllm repo: https://docs.vllm.ai/en/latest/deployment/frameworks/helm.html
Nice doc and resources to make this easier for new users! Would you be open to submitting this as a new page for production-stack specifically? I agree vLLM on k8s is a general idea, and considering KServe is listed as a separate page already, I think we should continue to list all options. Keeping this doc "vanilla k8s" seems fair
This pull request has merge conflicts that must be resolved before it can be merged.
Thanks Michael. Agree that we should keep this doc vanilla k8s, and then add some hyperlinks at the beginning linking to the other projects.
Doc build failed. https://readthedocs.org/projects/vllm/builds/27354951/
docs/source/deployment/k8s.md
Outdated
> Deploying vLLM on Kubernetes is a scalable and efficient way to serve machine learning models. This guide walks you through deploying vLLM using native Kubernetes.
>
> ## Prerequisites
> NOTE: please make sure that there is a running Kubernetes cluster with available GPU resources. If you are new to Kubernetes, here is a [guide](https://github.com/vllm-project/production-stack/blob/main/tutorials/00-install-kubernetes-env.md) that helps you prepare the Kubernetes environment.
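For readers skimming the thread, a bare-bones sketch of what "native Kubernetes" deployment means here might look like the following. This is not the manifest from the PR; the image tag, model name, and resource sizes are illustrative placeholders.

```yaml
# Minimal illustrative Deployment for vLLM's OpenAI-compatible server.
# Model name, image tag, and resource sizes are placeholders; adjust for your cluster.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-server
  template:
    metadata:
      labels:
        app: vllm-server
    spec:
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest
          args: ["--model", "facebook/opt-125m"]
          ports:
            - containerPort: 8000   # OpenAI-compatible API endpoint
          resources:
            limits:
              nvidia.com/gpu: 1     # requires a GPU device plugin on the cluster
```

A Service (e.g. a ClusterIP in front of port 8000) would normally accompany this; the guide in the PR covers those details.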
I don't think we should link to production stack here
Official k8s docs give people too many choices and can be confusing, so this one is probably better. If you find a better installation guide, feel free to contribute!
Maybe link to Kind instead? It's pretty straightforward to set up https://kind.sigs.k8s.io/
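For context on the Kind suggestion: a local cluster can be created with `kind create cluster`, optionally driven by a small config file like the sketch below (the node layout is just an example). Note that Kind does not provide GPU passthrough out of the box, so it mainly helps with trying the non-GPU parts of the guide.

```yaml
# Illustrative Kind cluster config (kind.sigs.k8s.io).
# Usage: kind create cluster --config kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
```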
This introduces an additional package? I agree with Terry that we could link to another clean page. Maybe we should actually copy one version of a k8s tutorial into the vLLM documentation later.
docs/source/deployment/k8s.md
Outdated
> Note that these projects are sorted chronologically.
>
> * [vLLM production-stack](https://github.com/vllm-project/production-stack): Originated from UChicago, vLLM production stack is a project that contains the latest research and community effort while still delivering production-level stability and performance. Check out the [documentation page](https://docs.vllm.ai/en/latest/deployment/integrations/production-stack.html) for more details and examples.
> * [AIBrix](https://github.com/vllm-project/aibrix): Originated from ByteDance, AIBrix is a production-level stack that is Kubernetes-friendly and contains rich features (e.g. LoRA management).
We should keep this page more neutral as mentioned above
Yes, but I don't have much expertise on KServe, AIBrix, and others. We will leave it blank here; feel free to create more PRs.
Btw I just heard from a user of production-stack that there is also a KubeAI repository which also uses K8s. We could potentially add that as well.
> ## Pre-requisite
>
> Ensure that you have a running Kubernetes environment with GPU (you can follow [this tutorial](https://github.com/vllm-project/production-stack/blob/main/tutorials/00-install-kubernetes-env.md) to install a Kubernetes environment on a bare-metal GPU machine).
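As background for the GPU prerequisite being discussed: whichever installation guide ends up linked, the cluster ultimately needs a device plugin (e.g. NVIDIA's) so that pods can request GPUs through the extended resource name. A generic smoke-test sketch, not taken from the PR's doc (the CUDA image tag is an example):

```yaml
# A throwaway pod requesting one GPU via the nvidia.com/gpu extended resource.
# It only schedules if the NVIDIA device plugin is installed on the cluster.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-check
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]   # prints GPU info if scheduling and drivers work
      resources:
        limits:
          nvidia.com/gpu: 1
```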
Similar comment on this one. We could just point to K8s docs directly instead of production stack docs
Official k8s docs give people too many choices and can be confusing, so this one is probably better. If you find a better installation guide, feel free to contribute!
Thanks a lot @KuntaiDu! The Production-Stack team greatly appreciates the effort to add our content here. It looks good to us, though we suggest adding the EKS/GKE tutorial and future Terraform deployment links to this documentation. We will submit more PRs for these in the future!
Given that I have only tried production stack and have limited experience with other frameworks such as KServe, I am open to PRs that append other projects by people with more expertise. Feel free to propose PRs! @terrytangyuan @mgoin @russellb @simon-mo
Maybe we can just link to this page instead of mentioning the variety of frameworks/integrations out there? https://docs.vllm.ai/en/latest/deployment/integrations/index.html
It seems like some comments were not addressed so I submitted a follow-up PR: #14084
This PR adds a Kubernetes deployment guide covering both native Kubernetes and the Helm chart provided by vllm-project/production-stack.