diff --git a/docs/source/deployment/k8s.md b/docs/source/deployment/k8s.md
index dd3769c47fc5..b31344b19966 100644
--- a/docs/source/deployment/k8s.md
+++ b/docs/source/deployment/k8s.md
@@ -4,17 +4,19 @@
 
 Deploying vLLM on Kubernetes is a scalable and efficient way to serve machine learning models. This guide walks you through deploying vLLM using native Kubernetes.
 
---------
-
-Alternatively, you can also deploy Kubernetes using [helm chart](https://docs.vllm.ai/en/latest/deployment/frameworks/helm.html). There are also open-source projects available to make your deployment even smoother.
-
-* [vLLM production-stack](https://github.com/vllm-project/production-stack): Born out of a Berkeley-UChicago collaboration, vLLM production stack is a project that contains latest research and community effort, while still delivering production-level stability and performance. Checkout the [documentation page](https://docs.vllm.ai/en/latest/deployment/integrations/production-stack.html) for more details and examples.
-
---------
+Alternatively, you can deploy vLLM to Kubernetes using any of the following:
+* [Helm](frameworks/helm.md)
+* [InftyAI/llmaz](integrations/llmaz.md)
+* [KServe](integrations/kserve.md)
+* [kubernetes-sigs/lws](frameworks/lws.md)
+* [meta-llama/llama-stack](integrations/llamastack.md)
+* [substratusai/kubeai](integrations/kubeai.md)
+* [vllm-project/aibrix](https://github.com/vllm-project/aibrix)
+* [vllm-project/production-stack](integrations/production-stack.md)
 
 ## Pre-requisite
 
-Ensure that you have a running Kubernetes environment with GPU (you can follow [this tutorial](https://github.com/vllm-project/production-stack/blob/main/tutorials/00-install-kubernetes-env.md) to install a Kubernetes environment on a bare-metal GPU machine).
+Ensure that you have a running [Kubernetes cluster with GPUs](https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/).
 
 ## Deployment using native K8s
 
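
One way to sanity-check the GPU prerequisite the new text links to is to schedule a minimal pod that requests a single GPU via the standard `nvidia.com/gpu` extended resource. This is a sketch, not part of the diff above; it assumes the NVIDIA device plugin is installed in the cluster, and the pod name and image tag are illustrative:

```yaml
# gpu-check.yaml -- minimal pod that requests one NVIDIA GPU and runs nvidia-smi
apiVersion: v1
kind: Pod
metadata:
  name: gpu-check          # illustrative name
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04   # any CUDA base image works
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1   # requests one GPU from the device plugin
```

If `kubectl apply -f gpu-check.yaml` schedules the pod and `kubectl logs gpu-check` prints an `nvidia-smi` table, the cluster can allocate GPUs to vLLM pods; if the pod stays `Pending`, no node is advertising `nvidia.com/gpu`.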