Add batched Llama model definition using vLLM paged attention (#1134)
* Add batched Llama model with vllm paged attention

* update core.py

* doc

* minor

* add e2e test

* mv file

* clean

* Check if TVM has been built with USE_VLLM

* update BuildArgs docstring
masahi authored Oct 30, 2023
1 parent ba67835 commit fee2cb5
Showing 4 changed files with 1,347 additions and 165 deletions.
