
Commit 269c4db

Varun Sundar Rabindranath (varun-sundar-rabindranath) authored
[Misc][DP] Guard mxfp4 implementation selection (#27484)
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
1 parent 52efc34 commit 269c4db

1 file changed, +7 -2 lines changed
vllm/model_executor/layers/quantization/mxfp4.py

Lines changed: 7 additions & 2 deletions
@@ -794,7 +794,8 @@ def select_gemm_impl(
                 )
             else:
                 raise NotImplementedError(
-                    "Incompatible Mxfp4 backend for EP batched experts format"
+                    f"Incompatible Mxfp4 backend ({self.mxfp4_backend}) for "
+                    "EP batched experts format"
                 )
         else:
             assert self.moe_quant_config is not None
@@ -813,8 +814,12 @@ def select_gemm_impl(
                 return TrtLlmGenExperts(self.moe, self.moe_quant_config, **kwargs)
             elif self.mxfp4_backend == Mxfp4Backend.MARLIN:
                 return MarlinExperts(self.moe_quant_config)
-            else:
+            elif self.mxfp4_backend == Mxfp4Backend.TRITON:
                 return OAITritonExperts(self.moe_quant_config)
+            else:
+                raise NotImplementedError(
+                    f"Incompatible Mxfp4 backend ({self.mxfp4_backend}) for EP"
+                )
 
     def _route_and_experts(
         self,
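
The second hunk replaces a catch-all `else` with an explicit `elif` per backend plus a final `else` that raises. A minimal standalone sketch of that pattern follows. The `Backend` enum, its members, and `select_experts` are hypothetical stand-ins for illustration, not vLLM's actual `Mxfp4Backend` or `select_gemm_impl`, and the function returns class names as strings rather than constructing experts objects.

```python
from enum import Enum


class Backend(Enum):
    # Hypothetical stand-in for Mxfp4Backend; member names are assumptions.
    TRTLLM = "trtllm"
    MARLIN = "marlin"
    TRITON = "triton"
    NONE = "none"


def select_experts(backend: Backend) -> str:
    """Return the name of the experts implementation for `backend`.

    Every supported backend gets its own branch, so an unexpected value
    fails loudly instead of silently falling through to a default.
    """
    if backend == Backend.TRTLLM:
        return "TrtLlmGenExperts"
    elif backend == Backend.MARLIN:
        return "MarlinExperts"
    elif backend == Backend.TRITON:
        return "OAITritonExperts"
    else:
        # Include the offending value in the message to ease debugging.
        raise NotImplementedError(
            f"Incompatible Mxfp4 backend ({backend}) for EP"
        )
```

In this sketch, `select_experts(Backend.NONE)` raises `NotImplementedError` with the backend value in the message. With a bare `else`, the same call would have returned the Triton implementation.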
