Enable vLLM Gaudi Support for LLM Service #126
Closed
Description
This PR enables vLLM Gaudi support for the LLM service by leveraging habana/vllm-fork. It will be converted to a formal pull request once the Habana team officially releases vLLM support.
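As a rough illustration of what the Gaudi-backed path looks like from the service side, the sketch below uses the standard vLLM Python API, which habana/vllm-fork keeps compatible; the model name and sampling settings are placeholders, not values taken from this PR.

```python
# Minimal sketch: offline inference through vLLM built from habana/vllm-fork.
# The fork keeps the upstream Python API, so the same code runs on Gaudi (HPU)
# once the fork is installed; the model name and sampling values are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-7b-hf")          # placeholder model
params = SamplingParams(temperature=0.8, max_tokens=128)

outputs = llm.generate(["What is a Gaudi accelerator?"], params)
for out in outputs:
    print(out.outputs[0].text)
```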
Issues
n/a
Type of change
List the type of change below. Please delete options that are not relevant.
Dependencies
n/a.
Tests
This PR was tested on a Gaudi2 server with:
2-socket Intel(R) Xeon(R) Platinum 8368 CPU @ 2.40GHz
8 Gaudi nodes, HL-SMI Version: hl-1.14.0-fw-48.0.1.0, Driver Version: 1.14.0-9e8ecf8
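For reference, a hypothetical smoke test against the running LLM service could look like the snippet below; the endpoint URL, port, and payload shape (an OpenAI-compatible completions route) are assumptions for illustration, not details taken from this PR.

```python
# Hypothetical smoke test: query the vLLM-backed LLM service over HTTP.
# The URL, port, and payload fields are assumptions for illustration only.
import json
import urllib.request

payload = {
    "model": "meta-llama/Llama-2-7b-hf",   # placeholder model name
    "prompt": "Hello from Gaudi",
    "max_tokens": 32,
}
req = urllib.request.Request(
    "http://localhost:8080/v1/completions",  # assumed OpenAI-compatible route
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))
```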