feat: switch to using vllm #103
Conversation
quitrk
commented
Oct 3, 2024
- when CUDA is not available, we still use llama.cpp as a fallback (see the sketch below)
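For reference, a minimal sketch of how such a fallback can be wired up, assuming a Python entry point; the function name, model identifier, and GGUF path are placeholders, not the PR's actual code:

```python
# Illustrative sketch only, not the implementation in this PR.
# model_id and gguf_path are placeholder assumptions.
import torch


def load_engine(model_id: str, gguf_path: str):
    """Prefer vLLM when CUDA is available, otherwise fall back to llama.cpp."""
    if torch.cuda.is_available():
        from vllm import LLM
        # vLLM serves the raw (unquantized) Hugging Face model
        return LLM(model=model_id)
    from llama_cpp import Llama
    # llama.cpp fallback runs a quantized GGUF build of the model
    return Llama(model_path=gguf_path)
```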
Left some comments, mostly just questions; excellent stuff! Can you please also update the README with what AWQ model one should run when using vllm?
We're not running AWQ as it is not optimised anyway; we're running the raw model, which still shows a decent performance improvement over llama.cpp with quantized models.