Enable vLLM Gaudi Support for LLM Service #126
Closed
Description
This PR enables vLLM Gaudi support for the LLM service by leveraging habana/vllm-fork. It will be converted to a formal pull request once the Habana team officially releases vLLM support.
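As a rough illustration of what the Gaudi-backed path looks like from the service side, the sketch below uses the standard vLLM Python API, which habana/vllm-fork keeps compatible; the model name and sampling settings are placeholders, not values taken from this PR.

```python
# Minimal sketch: offline inference through vLLM built from habana/vllm-fork.
# The fork keeps the upstream Python API, so the same code runs on Gaudi (HPU)
# once the fork is installed; the model name and sampling values are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-7b-hf")          # placeholder model
params = SamplingParams(temperature=0.8, max_tokens=128)

outputs = llm.generate(["What is a Gaudi accelerator?"], params)
for out in outputs:
    print(out.outputs[0].text)
```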
Issues
n/a
Type of change
List the type of change below. Please delete options that are not relevant.
Dependencies
n/a.
Tests
This PR was tested on a Gaudi2 server with:
2-socket Intel(R) Xeon(R) Platinum 8368 CPU @ 2.40GHz
8 Gaudi nodes, HL-SMI Version: hl-1.14.0-fw-48.0.1.0, Driver Version: 1.14.0-9e8ecf8
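For reference, a hypothetical smoke test against the running LLM service could look like the snippet below; the endpoint URL, port, and payload shape (an OpenAI-compatible completions route) are assumptions for illustration, not details taken from this PR.

```python
# Hypothetical smoke test: query the vLLM-backed LLM service over HTTP.
# The URL, port, and payload fields are assumptions for illustration only.
import json
import urllib.request

payload = {
    "model": "meta-llama/Llama-2-7b-hf",   # placeholder model name
    "prompt": "Hello from Gaudi",
    "max_tokens": 32,
}
req = urllib.request.Request(
    "http://localhost:8080/v1/completions",  # assumed OpenAI-compatible route
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))
```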