This repository was archived by the owner on Mar 13, 2025. It is now read-only.

Description
I have tried to use ray-llm, but it doesn't work out of the box with what vllm is supposed to provide. Does ray-llm make any changes to the API endpoint when running inference on the model?
Here is a comment on an issue in the vllm repo that was resolved a long time ago: vllm-project/vllm#323 (comment)