Add vLLM support to DocSum Helm chart #649

Open · wants to merge 1 commit into main
Conversation

eero-t (Contributor) commented Dec 18, 2024

Description

This continues the Helm vLLM support added in #610 by adding vLLM support to the DocSum Helm chart.

(As already done for the ChatQnA app and the Agent component, there are tgi.enabled and vllm.enabled flags for selecting which LLM serving backend is used.)
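For illustration, a values override selecting vLLM instead of TGI might look roughly like the sketch below (the exact key layout is an assumption based on the tgi.enabled / vllm.enabled flags described above; the model ID is just the example value used elsewhere in this chart):

tgi:
  enabled: false
vllm:
  enabled: true
  # Model served by the vLLM backend (example value, not a chart default)
  LLM_MODEL_ID: Intel/neural-chat-7b-v3-3

Such an override file would be applied in the usual Helm way, e.g. passed with -f to helm install or helm upgrade.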

Type of change

  • New feature (non-breaking change which adds new functionality)

Dependencies

The opea/llm-docsum-vllm:latest image is currently missing from the CI and DockerHub registries: opea-project/GenAIComps#961

(The corresponding opea/llm-docsum-tgi:latest image for TGI and the opea/llm-vllm:latest vLLM text-generation image already exist.)

Tests

Manually tested with an opea/llm-docsum-vllm:latest image built locally.

eero-t marked this pull request as draft on Dec 18, 2024
eero-t (Contributor, Author) commented Dec 18, 2024

Setting this as a draft because the required image is still missing from DockerHub, and the chart needs retesting after the currently pending DocSum changes in the Comps and Examples repos have been completed.

eero-t (Contributor, Author) commented Dec 20, 2024

While the "docsum, gaudi, ci-gaudi-vllm-values" CI test fails as expected, due to the missing OPEA llm-docsum-vllm image...

There also seems to be a bug in a component unrelated to this PR, as the "llm-uservice, xeon, ci-faqgen-values, common" CI test fails due to a package missing from the image:

[pod/llm-uservice20241218190439-5b9b7b79fd-r65l9/llm-uservice20241218190439]
...
   File "/home/user/comps/llms/faq-generation/tgi/langchain/llm.py", line 77, in stream_generator
     from langserve.serialization import WellKnownLCSerializer
   File "/home/user/.local/lib/python3.11/site-packages/langserve/__init__.py", line 8, in <module>
     from langserve.client import RemoteRunnable
   File "/home/user/.local/lib/python3.11/site-packages/langserve/client.py", line 24, in <module>
     from httpx._types import AuthTypes, CertTypes, CookieTypes, HeaderTypes, VerifyTypes
 ImportError: cannot import name 'VerifyTypes' from 'httpx._types' (/home/user/.local/lib/python3.11/site-packages/httpx/_types.py)

=> Is the requirements.txt used to build the llm-faqgen-tgi:latest image out of date in the Comps repo?

@lianhao?

eero-t (Contributor, Author) commented Dec 30, 2024

Rebased on main and dropped the "draft" status, as the required OPEA image is now available on DockerHub!

eero-t (Contributor, Author) commented Dec 30, 2024

CI still fails.

PR #659 fixes the DocSum issues caused by updates in the other repos, includes the same model ID workaround as this one, and passed CI => better to merge that one first and rebase this?


"docsum, gaudi, ci-gaudi-vllm-values" fails because CI registry is out of date. Although required image has been at DockerHub for 4 days [1], fetching it still fails:
Normal BackOff 4m47s (x18 over 9m42s) kubelet Back-off pulling image "100.83.111.229:5000/opea/llm-docsum-vllm:latest"
[1] https://hub.docker.com/r/opea/llm-docsum-vllm/tags

"docsum, xeon, ci-values" fails to connection failure:

[pod/docsum20241230125518-5599c984c6-dt5fc/docsum20241230125518] aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host 0.0.0.0:7066 ssl:default [Connect call failed ('0.0.0.0', 7066)]
...
testpod: Response check failed, please check the logs in artifacts!

"docsum, gaudi, ci-gaudi-tgi-values" fails to similar CI issue:

[pod/docsum20241230130614-d598c6674-qqhjf/docsum20241230130614] aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host 0.0.0.0:7066 ssl:default [Connect call failed ('0.0.0.0', 7066)]
...
testpod: Response check failed, please check the logs in artifacts!

Signed-off-by: Eero Tamminen <eero.t.tamminen@intel.com>
eero-t (Contributor, Author) commented Jan 2, 2025

Rebased on main to get the CI tests passing, and dropped the already merged fix. However, CI is still broken.

@daisy-ycguo The "docsum, gaudi, ci-gaudi-vllm-values" CI test still fails because the CI registry is not up to date with DockerHub: https://hub.docker.com/r/opea/llm-docsum-vllm/tags
Failed to pull image "100.83.111.229:5000/opea/llm-docsum-vllm:latest": rpc error: code = NotFound desc = failed to pull and unpack image "100.83.111.229:5000/opea/llm-docsum-vllm:latest": failed to resolve reference "100.83.111.229:5000/opea/llm-docsum-vllm:latest": 100.83.111.229:5000/opea/llm-docsum-vllm:latest: not found

@lianhao, does the "docsum, gaudi, ci-gaudi-tgi-values" CI test now fail due to a test bug?
[pod/docsum20250102125737-llm-uservice-7d6f8d968f-v9b4w/docsum20250102125737] | huggingface_hub.errors.ValidationError: Input validation error: 'inputs' tokens + 'max_new_tokens' must be <= 4096. Given: 4095 'inputs' tokens and 17 'max_new_tokens'

lianhao (Collaborator) left a comment


@eero-t I found that the pending PR opea-project/GenAIComps#1101 will make a big change: there will no longer be separate llm-docsum-tgi / llm-docsum-vllm images; a single llm-docsum image will be able to talk to both TGI and vLLM. So maybe we should wait until that PR is merged first.

tgi:
  enabled: true
  LLM_MODEL_ID: Intel/neural-chat-7b-v3-3
lianhao (Collaborator) commented on this diff:

The MAX_INPUT_LENGTH and MAX_TOTAL_TOKENS settings for tgi were dropped during the rebase. We need to set them, otherwise the CI test will fail.
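For reference, a sketch of the tgi values block with those limits restored (the specific numbers are illustrative only, not taken from the chart; they just need to keep input tokens plus max_new_tokens within the model's context window):

tgi:
  enabled: true
  LLM_MODEL_ID: Intel/neural-chat-7b-v3-3
  # Cap prompt length and total (prompt + generated) tokens so requests stay
  # under the serving backend's limit and avoid the validation error quoted above.
  MAX_INPUT_LENGTH: "2048"
  MAX_TOTAL_TOKENS: "4096"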
