Commit e7e16d8

Upgrade FlashInfer to v0.3.0
Mainly to get the GPT-OSS MXFP4 trtllm-gen MoE autotuning and the bug fix in: flashinfer-ai/flashinfer#1573

Signed-off-by: Po-Han Huang <pohanh@nvidia.com>
1 parent b5ee1e3 commit e7e16d8

2 files changed: +2 -2 lines changed

docker/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -375,7 +375,7 @@ RUN --mount=type=bind,from=build,src=/workspace/dist,target=/vllm-workspace/dist
 # Install FlashInfer from source
 ARG FLASHINFER_GIT_REPO="https://github.com/flashinfer-ai/flashinfer.git"
 # Keep this in sync with "flashinfer" extra in setup.py
-ARG FLASHINFER_GIT_REF="v0.2.14.post1"
+ARG FLASHINFER_GIT_REF="v0.3.0"
 # Flag to control whether to compile FlashInfer AOT kernels
 # Set to "true" to enable AOT compilation:
 # docker build --build-arg FLASHINFER_AOT_COMPILE=true ...

setup.py

Lines changed: 1 addition & 1 deletion
@@ -694,7 +694,7 @@ def _read_requirements(filename: str) -> list[str]:
 "mistral_common[audio]"], # Required for audio processing
 "video": [], # Kept for backwards compatibility
 # FlashInfer should be updated together with the Dockerfile
-"flashinfer": ["flashinfer-python==0.2.14.post1"],
+"flashinfer": ["flashinfer-python==0.3.0"],
 # Optional deps for AMD FP4 quantization support
 "petit-kernel": ["petit-kernel"],
 },
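
Both files carry a comment asking that the FlashInfer pin be updated in lockstep with the other. A minimal sketch of how that invariant could be checked automatically is given below. It is not part of this commit: the helper names are hypothetical, the regular expressions assume the exact line shapes shown in the two hunks, and it assumes the git tag is the PyPI version prefixed with "v".

```python
import re

# Hypothetical helpers, not part of the repository. The patterns assume the
# line shapes shown in this commit's diff.
DOCKER_REF_RE = re.compile(r'^ARG FLASHINFER_GIT_REF="v?([^"]+)"', re.MULTILINE)
SETUP_PIN_RE = re.compile(r'"flashinfer-python==([^"]+)"')


def dockerfile_flashinfer_ref(dockerfile_text: str) -> str:
    """Return the FLASHINFER_GIT_REF value with any leading 'v' removed."""
    match = DOCKER_REF_RE.search(dockerfile_text)
    if match is None:
        raise ValueError("FLASHINFER_GIT_REF not found in Dockerfile text")
    return match.group(1)


def setup_flashinfer_pin(setup_text: str) -> str:
    """Return the version pinned for flashinfer-python in setup.py text."""
    match = SETUP_PIN_RE.search(setup_text)
    if match is None:
        raise ValueError("flashinfer-python pin not found in setup.py text")
    return match.group(1)


def pins_in_sync(dockerfile_text: str, setup_text: str) -> bool:
    """True when the Dockerfile git ref and the setup.py pin name the same version.

    Assumption: the git tag equals the PyPI version prefixed with 'v'.
    """
    return dockerfile_flashinfer_ref(dockerfile_text) == setup_flashinfer_pin(setup_text)


# Sample inputs copied from the "+" lines of the two hunks in this commit.
dockerfile_line = 'ARG FLASHINFER_GIT_REF="v0.3.0"\n'
setup_line = '"flashinfer": ["flashinfer-python==0.3.0"],\n'
in_sync = pins_in_sync(dockerfile_line, setup_line)
```

In a real check the two strings would presumably be read from docker/Dockerfile and setup.py, and a mismatch would fail the run.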