OpenAI compatible server: tokenizer arg causes issues with pooling resources amongst models #7815
thealmightygrant started this conversation in Ideas
Replies: 0 comments
Hi y'all, I was testing out the OpenAI-compatible server and noticed that the tokenizer can already be specified directly in the vllm_backend via model.json. Could we make it optional rather than a required argument for the OpenAI server?
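For context, here is roughly what I mean by putting the tokenizer in the vllm_backend's model.json. This is a minimal sketch: the field names follow vLLM's engine args, and the model/tokenizer values below are just placeholders, not from my actual setup.

```json
{
  "model": "meta-llama/Meta-Llama-3-8B-Instruct",
  "tokenizer": "meta-llama/Meta-Llama-3-8B-Instruct",
  "disable_log_requests": true,
  "gpu_memory_utilization": 0.9
}
```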
My thought here is that this opens up serving multiple models from the same server, even if those models use different tokenizers.
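Right now, as I understand the README, the frontend is launched with a single --tokenizer flag that applies to the whole server, something like the command below (illustrative only; the script path, repository path, and model name are placeholders on my part):

```sh
# Launch the OpenAI-compatible frontend with one tokenizer for every served model.
python3 openai_frontend/main.py \
  --model-repository /path/to/model_repository \
  --tokenizer meta-llama/Meta-Llama-3-8B-Instruct
```

If the tokenizer instead came from each model's model.json, each model in the repository could carry its own tokenizer and the flag would no longer constrain the server to one.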