Commit
Merge pull request vllm-project#14 from wenxcs/wenxh/fp8-on-a100-v5-pr
0612 kernel of FP8 on A100
xiaoxiawu-microsoft authored Jun 15, 2024
2 parents 9f42e46 + d0b7fad commit b28848e
Showing 9 changed files with 780 additions and 446 deletions.
9 changes: 8 additions & 1 deletion requirements-cuda.txt
@@ -8,4 +8,11 @@ vllm-nccl-cu12>=2.18,<2.19 # for downloading nccl library
 torch == 2.2.1
 xformers == 0.0.25 # Requires PyTorch 2.2.1
 
-cupy-cuda12x
+# Dependencies for pycublas-moe-group-gemm
+gitpython
+pytest
+loguru
+# In case of an invalid url, please install from this file:
+# pip install gitpython pytest loguru && pip install vllm/model_executor/layers/fused_moe/pycublas.zip
+# or
+# pip install gitpython pytest loguru && pip install git+https://github.com/wenxcs/pycublas.git@moe-group-gemm
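The comments in the diff above give fallback install commands for the new helper packages. As a quick sanity check before falling back, a small hypothetical helper (not part of this PR) can report which of them are missing; note the import names differ from the PyPI names in one case (gitpython is imported as `git`):

```python
import importlib.util

def missing_deps(names):
    """Return the subset of module names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Import names for the packages this commit adds (gitpython imports as "git").
print(missing_deps(["git", "pytest", "loguru"]))
```

If the printed list is non-empty, run one of the `pip install` fallbacks from the comments above.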
