Add vLLM support to DocSum Helm chart #649
base: main
Conversation
Setting as draft because the required image is still missing from DockerHub, and this needs retesting after currently pending DocSum changes for the Comps & Examples repos have completed.
While the "docsum, gaudi, ci-gaudi-vllm-values" CI test fails as expected, due to OPEA missing the required image, there also seems to be a bug in a component unrelated to this PR, as the "llm-uservice, xeon, ci-faqgen-values, common" CI test fails due to a package missing from the image.
Rebased.
CI still fails. PR #659 fixes the DocSum issues with updates in other repos, includes the same model ID workaround as this one, and passed CI, so it may be better to merge that first and rebase this on top of it.
- "docsum, gaudi, ci-gaudi-vllm-values" fails because the CI registry is out of date: although the required image has been on DockerHub for 4 days [1], fetching it still fails.
- "docsum, xeon, ci-values" fails due to a connection failure.
- "docsum, gaudi, ci-gaudi-tgi-values" fails due to a similar CI issue.
Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
Rebased.
@daisy-ycguo, the "docsum, gaudi, ci-gaudi-vllm-values" CI test still fails due to the CI registry not being up to date with DockerHub: https://hub.docker.com/r/opea/llm-docsum-vllm/tags
@lianhao, the "docsum, gaudi, ci-gaudi-tgi-values" CI test now fails, apparently due to a test bug?
@eero-t I found that pending PR opea-project/GenAIComps#1101 will make a big change: there will no longer be an llm-docsum-llm image; a single llm-docsum image will be able to talk to both TGI and vLLM. So maybe we should wait until that PR is merged first.
```yaml
tgi:
  enabled: true
  LLM_MODEL_ID: Intel/neural-chat-7b-v3-3
```
The MAX_INPUT_LENGTH and MAX_TOTAL_TOKENS settings for tgi went missing during the rebase. We need to set them, otherwise the CI test will fail.
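For reference, a minimal sketch of what the completed tgi section could look like; the tgi.enabled and LLM_MODEL_ID entries come from the diff above, while the specific token limits are illustrative assumptions rather than values from this PR:

```yaml
tgi:
  enabled: true
  LLM_MODEL_ID: Intel/neural-chat-7b-v3-3
  # Illustrative limits only; match them to the other DocSum values files.
  MAX_INPUT_LENGTH: "2048"
  MAX_TOTAL_TOKENS: "4096"
```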
Description
This continues the Helm vLLM support added in #610 by adding vLLM support to the DocSum Helm chart.
(Similarly to how it is already done for the ChatQnA app and the Agent component, there are tgi.enabled and vllm.enabled flags for selecting which LLM serving backend will be used; see the values sketch below.)
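As a rough illustration (a sketch assuming the chart exposes these keys at the top level and that both backends take the same model setting; the model ID here is only an example), a vLLM-enabled values override could look like this:

```yaml
# Illustrative override: use vLLM instead of TGI as the LLM serving backend.
# Key layout and model ID are assumptions, not authoritative chart values.
tgi:
  enabled: false
vllm:
  enabled: true
  LLM_MODEL_ID: Intel/neural-chat-7b-v3-3
```

Presumably only one of the two backends should be enabled at a time.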
Type of change
Dependencies
The opea/llm-docsum-vllm:latest image is currently missing from the CI & DockerHub registries: opea-project/GenAIComps#961
(Although the corresponding opea/llm-docsum-tgi:latest image for TGI and the opea/llm-vllm:latest vLLM text-generation image already exist.)
Tests
Manual testing with an opea/llm-docsum-vllm:latest image built locally.
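For reference, a locally built image could be pointed to through the chart's image values; the subchart name and key structure below are assumptions about the chart layout for illustration, not taken from this PR:

```yaml
# Illustrative override for using a locally built LLM microservice image.
# Subchart name and key structure are assumptions.
llm-uservice:
  image:
    repository: opea/llm-docsum-vllm
    tag: latest
    pullPolicy: IfNotPresent
```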