feat: Inference using vLLM with RayServe on Inf2 #591

Closed
ratnopamc opened this issue Jul 17, 2024 · 2 comments
Labels
gen-ai pattern Distributed Training and Inference Patterns for Various Generative AI Large Language Models (LLMs) stale

Comments

@ratnopamc
Collaborator

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

What is the outcome that you are trying to reach?

- Create a new blueprint that shows how to use vLLM with RayServe on AWS Inf2
- Deploy an LLM as an inference example

Describe the solution you would like

Create a pattern that serves inference using RayServe with a vLLM backend on Inf2.
Create a website doc with step-by-step instructions for deploying and testing the pattern.
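
For the testing step, a minimal client sketch could look like the following. It is only an illustration: the issue does not specify an endpoint, route, or request schema, so `SERVE_URL`, the `/generate` path, and the `prompt`/`max_tokens` fields are all hypothetical and would need to match whatever the blueprint actually exposes.

```python
import json
import urllib.request

# Hypothetical endpoint: the issue names no URL or route.
# Port 8000 is Ray Serve's default HTTP port; "/generate" is an assumed route.
SERVE_URL = "http://localhost:8000/generate"


def build_request(prompt: str, max_tokens: int = 64,
                  url: str = SERVE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a text-generation endpoint."""
    # Assumed payload shape; adjust to the deployed service's schema.
    payload = {"prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `generate("Hello")` requires a running deployment; `build_request` can be inspected on its own to check the payload before anything is sent.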

Describe alternatives you have considered

Additional context

https://awsdocs-neuron.readthedocs-hosted.com/en/latest/index.html

@ratnopamc ratnopamc added the gen-ai pattern Distributed Training and Inference Patterns for Various Generative AI Large Language Models (LLMs) label Jul 17, 2024
@ratnopamc ratnopamc self-assigned this Jul 17, 2024
Contributor

This issue has been automatically marked as stale because it has been open for 30 days with no activity. Remove the stale label or add a comment, or this issue will be closed in 10 days.

@github-actions github-actions bot added the stale label Aug 17, 2024
@ratnopamc
Collaborator Author

The blueprint for this was merged in PR #607.
