Description
Describe the feature
- Need multi-node, multi-GPU deployment of one large LLM (DeepSeek 671B) with vLLM on Kubernetes (K8s)
- Use NFS to share the model parameters via a PV and PVC
- Use vLLM distributed inference and serving across multiple nodes and GPUs
- Extend the Prometheus and Grafana setup to support this topology (a scrape-config sketch follows this list)
- In practice it is hard for us to get one strong machine with 8x A100; we have many machines, but each has only one or two GPUs
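
On the monitoring point: vLLM already exposes Prometheus metrics at `/metrics` on its serving port, so a minimal sketch of the scrape side, assuming the Prometheus Operator is installed, could look like the following. The namespace, labels, and port name here are placeholders I made up for illustration, not values from an existing setup.

```yaml
# Hypothetical ServiceMonitor that scrapes every vLLM pod behind a Service
# labeled app: vllm-deepseek; adjust selector/namespace to the real setup.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: vllm-deepseek
  namespace: monitoring            # placeholder namespace
spec:
  namespaceSelector:
    matchNames:
      - vllm                       # placeholder namespace of the serving pods
  selector:
    matchLabels:
      app: vllm-deepseek           # placeholder label on the vLLM Service
  endpoints:
    - port: http                   # named port that serves the OpenAI API
      path: /metrics               # vLLM's built-in Prometheus endpoint
      interval: 15s
```

Grafana can then use that Prometheus as a data source, with per-pod panels keyed on the pod label.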
Why do you need this feature?
I have experience with Kubernetes (K8s) and have gone through the available tutorials. However, most guides focus on deploying models on a single node and do not cover multi-node deployment, inter-node communication, or best practices for scaling in Kubernetes.
Recently, we have been working on deploying DeepSeek 671B using containers and vLLM. I have successfully deployed a 7B LLM on a single node; my approach used Persistent Volumes (PV) and Persistent Volume Claims (PVC) backed by NFS to share the model parameters, which makes it easy to manage nodes and instances.
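
For reference, this is roughly the NFS layout I mean, as a minimal sketch; the server address, export path, and capacity below are placeholders, not values from a real cluster.

```yaml
# Static NFS-backed PV plus a PVC that binds to it; all serving pods mount
# the same weights read-only. Server, path, and size are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: deepseek-weights-pv
spec:
  capacity:
    storage: 1Ti                 # placeholder; size to the actual checkpoint
  accessModes:
    - ReadOnlyMany               # every replica reads the same weights
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.0.0.10            # placeholder NFS server address
    path: /exports/models/deepseek   # placeholder export path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: deepseek-weights
spec:
  accessModes:
    - ReadOnlyMany
  storageClassName: ""           # bind to the static PV above, not a provisioner
  resources:
    requests:
      storage: 1Ti
```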
I noticed that vLLM supports distributed deployment on bare-metal machines, but I could not find clear documentation on how to achieve the same thing in a Kubernetes environment; if such tutorials exist, they are not easy to find. It would be helpful to have guides on using NFS for distributed storage and on managing multi-node deployments efficiently in Kubernetes. One possible shape for such a deployment is sketched below.
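
Purely as an illustration of what I am asking for, here is a rough sketch built on the Kubernetes LeaderWorkerSet API (lws.x-k8s.io), with Ray bootstrapping a multi-node vLLM group; the image, parallelism sizes, paths, and commands are my assumptions, not a tested recipe.

```yaml
# One model instance spread across two 1-GPU pods: a leader that starts a Ray
# head and runs `vllm serve`, and a worker that joins the Ray cluster.
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: vllm-deepseek
spec:
  replicas: 1                      # one serving instance spanning several pods
  leaderWorkerTemplate:
    size: 2                        # leader + 1 worker; scale to the real GPU count
    leaderTemplate:
      spec:
        containers:
          - name: vllm-leader
            image: vllm/vllm-openai:latest   # assumed image
            command: ["sh", "-c"]
            args:
              # Assumed flow: start the Ray head, then serve with TP x PP equal
              # to the total number of GPUs in the group (2 in this sketch).
              - >
                ray start --head --port=6379 &&
                vllm serve /models/deepseek
                --tensor-parallel-size 1
                --pipeline-parallel-size 2
            resources:
              limits:
                nvidia.com/gpu: "1"
            volumeMounts:
              - name: weights
                mountPath: /models
        volumes:
          - name: weights
            persistentVolumeClaim:
              claimName: deepseek-weights   # the NFS-backed PVC sketched above
    workerTemplate:
      spec:
        containers:
          - name: vllm-worker
            image: vllm/vllm-openai:latest
            command: ["sh", "-c"]
            args:
              # LWS_LEADER_ADDRESS is injected by LeaderWorkerSet into workers.
              - ray start --address=$LWS_LEADER_ADDRESS:6379 --block
            resources:
              limits:
                nvidia.com/gpu: "1"
            volumeMounts:
              - name: weights
                mountPath: /models
        volumes:
          - name: weights
            persistentVolumeClaim:
              claimName: deepseek-weights
```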
Additional context
No response