feat: Inference using vLLM with RayServe on Inf2 #591
Labels: gen-ai pattern (Distributed Training and Inference Patterns for Various Generative AI Large Language Models (LLMs)), stale
Community Note
What is the outcome that you are trying to reach?
Describe the solution you would like
Create a pattern that serves inference using Ray Serve with the vLLM backend on AWS Inferentia2 (Inf2) instances.
Create a website doc with step-by-step instructions for deploying and testing the pattern.
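As a rough sketch of what such a pattern could look like, the Ray application might be deployed on EKS through a KubeRay RayService manifest that schedules workers onto Inf2 nodes via the Neuron device plugin. Everything below (names, module path, model ID, replica counts) is an illustrative assumption, not the pattern's actual configuration:

```yaml
# Illustrative RayService manifest (KubeRay CRD) — a sketch, not the pattern itself.
# "serve_vllm:app" and the model ID are hypothetical placeholders.
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: vllm-inf2            # hypothetical service name
spec:
  serveConfigV2: |
    applications:
      - name: vllm
        import_path: serve_vllm:app          # hypothetical Serve app module
        runtime_env:
          env_vars:
            MODEL_ID: "meta-llama/Llama-2-7b-chat-hf"   # example model
  rayClusterConfig:
    workerGroupSpecs:
      - groupName: inf2-workers
        replicas: 1
        template:
          spec:
            containers:
              - name: ray-worker
                image: my-registry/ray-vllm-neuron:latest   # hypothetical image
                resources:
                  limits:
                    aws.amazon.com/neuron: 1   # Neuron device exposed by the device plugin
```

The key idea is that the Serve application (compiled for Neuron via the vLLM backend) is declared in `serveConfigV2`, while the `aws.amazon.com/neuron` resource limit pins the worker pods to Inf2 capacity.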
Describe alternatives you have considered
Additional context
https://awsdocs-neuron.readthedocs-hosted.com/en/latest/index.html