feat: Inference using vLLM with RayServe on Inf2 #591

ratnopamc · 2024-07-17T00:33:39Z

Community Note

- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
- Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request.
- If you are interested in working on this issue or have submitted a pull request, please leave a comment.

What is the outcome that you are trying to reach?

- Create a new blueprint that showcases how to use vLLM with RayServe on AWS Inf2 instances
- Deploy an LLM as an inference example

Describe the solution you would like

Create a pattern that serves inference using Ray with a vLLM backend on Inf2.
Add website documentation with step-by-step instructions for deploying and testing the pattern.

Describe alternatives you have considered

Additional context

https://awsdocs-neuron.readthedocs-hosted.com/en/latest/index.html

github-actions · 2024-08-17T00:06:27Z

This issue has been automatically marked as stale because it has been open for 30 days with no activity. Remove the stale label or add a comment, or this issue will be closed in 10 days.

ratnopamc · 2024-08-21T21:21:39Z

The blueprint for this was merged in PR #607.

ratnopamc added the gen-ai pattern Distributed Training and Inference Patterns for Various Generative AI Large Language Models (LLMs) label Jul 17, 2024

ratnopamc self-assigned this Jul 17, 2024

ratnopamc mentioned this issue Aug 8, 2024

feat: RayServe with vLLM using AWS Neuron on Amazon EKS #607

Merged

3 tasks

github-actions bot added the stale label Aug 17, 2024

ratnopamc closed this as completed Aug 21, 2024
