Commit b03ac1d

Author: Pradyun Ramadorai
fix: Restore critical USE_CUTLASS_MOE environment variable support
ISSUE:
The USE_CUTLASS_MOE environment variable support (CLAUDE.md entry vllm-project#14) was lost during a previous merge, removing critical debugging/compatibility control.

ROOT CAUSE:
Upstream changes overwrote the Mantle modification that added environment variable control for CUTLASS MoE implementations.

SOLUTION:
Restored the missing environment variable logic:
- Added `import os` to imports
- Restored `default_use_cutlass` calculation with original conditions
- Restored `USE_CUTLASS_MOE` environment variable with smart defaults:
  * USE_CUTLASS_MOE=1 forces CUTLASS MoE on (default when conditions met)
  * USE_CUTLASS_MOE=0 disables CUTLASS MoE, fallback to other implementations
- Maintains backward compatibility with automatic detection

CODE CHANGES:
- File: `vllm/model_executor/layers/quantization/compressed_tensors/compressed_tensors_moe.py`
- Lines: 5 (import), 547-556 (environment variable logic)
- Annotation: Added comprehensive Mantle modification comments for future merge guidance

TESTING:
Verified import functionality and environment variable integration.

This fix enables debugging and compatibility control for CUTLASS MoE implementations as documented in CLAUDE.md registry entry vllm-project#14.

Signed-off-by: Pradyun Ramadorai <pradyunr@amazon.com>
1 parent 6a04908 commit b03ac1d
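
The override semantics described in the commit message can be sketched as a standalone function. This is only an illustration: `resolve_use_cutlass` and its `env` parameter are hypothetical names that do not appear in the commit, and the body simply reuses the expression from the diff so the behavior can be exercised without constructing the vLLM class.

```python
import os
from typing import Mapping, Optional


def resolve_use_cutlass(default_use_cutlass: bool,
                        env: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical wrapper around the expression used in the diff.

    `default_use_cutlass` stands in for the automatic detection result;
    `env` defaults to the real process environment.
    """
    if env is None:
        env = os.environ
    # Same expression as the commit: when the variable is set, it wins outright.
    return bool(int(env.get("USE_CUTLASS_MOE", 1 if default_use_cutlass else 0)))


# Variable unset: automatic detection decides.
print(resolve_use_cutlass(True, {}))    # True
print(resolve_use_cutlass(False, {}))   # False

# Variable set to 0: CUTLASS MoE is disabled even where detection would enable it.
print(resolve_use_cutlass(True, {"USE_CUTLASS_MOE": "0"}))   # False

# Variable set to 1: CUTLASS MoE is enabled even where detection would not.
print(resolve_use_cutlass(False, {"USE_CUTLASS_MOE": "1"}))  # True
```

The last case is worth noting when reading the diff: because the environment value replaces the detected default rather than being combined with it, `USE_CUTLASS_MOE=1` also bypasses the `block_quant` and SM90/SM100 checks.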

1 file changed: +9 -1 lines changed

vllm/model_executor/layers/quantization/compressed_tensors/compressed_tensors_moe.py

Lines changed: 9 additions & 1 deletion
@@ -2,6 +2,7 @@
 # SPDX-FileCopyrightText: Copyright contributors to the vLLM project

 import enum
+import os
 from enum import Enum
 from typing import Callable, Optional, Union

@@ -544,9 +545,16 @@ def __init__(
         # cutlass path
         self.is_fp8_w8a8_sm100 = quant_config._is_fp8_w8a8_sm100(
             self.weight_quant, self.input_quant)
-        self.use_cutlass = not self.block_quant and (
+        # ===== MANTLE CUSTOM MODIFICATION: USE_CUTLASS_MOE Environment Variable Support =====
+        # ISSUE: Need ability to disable CUTLASS MoE via environment variable for debugging/compatibility
+        # SOLUTION: Add USE_CUTLASS_MOE environment variable support with smart defaults
+        # MERGE CONFLICT: Preserve environment variable logic when merging CUTLASS changes
+        # CONTEXT: Enables debugging and compatibility control for CUTLASS MoE implementations
+        # =======================================
+        default_use_cutlass = not self.block_quant and (
             quant_config._is_fp8_w8a8_sm90(self.weight_quant, self.input_quant)
             or self.is_fp8_w8a8_sm100)
+        self.use_cutlass = bool(int(os.environ.get("USE_CUTLASS_MOE", 1 if default_use_cutlass else 0)))
         self.disable_expert_map = False

     def create_weights(self, layer: torch.nn.Module, num_experts: int,
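
One property of the added line is that it passes the raw environment string through `int()`, so a value such as `USE_CUTLASS_MOE=true` or an empty string raises `ValueError` during `__init__`, and any nonzero integer counts as "on". A more tolerant parse might look like the sketch below. It is not part of this commit: `parse_bool_env` and the accepted spellings are assumptions chosen for illustration.

```python
import os
from typing import Mapping, Optional

# Accepted spellings are an assumption for this sketch, not a vLLM convention.
_TRUE_VALUES = {"1", "true", "yes", "on"}
_FALSE_VALUES = {"0", "false", "no", "off"}


def parse_bool_env(name: str, default: bool,
                   env: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical helper: read a boolean flag from the environment.

    Returns `default` when the variable is unset or blank, and raises a
    descriptive ValueError for values it does not recognize.
    """
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = raw.strip().lower()
    if value in _TRUE_VALUES:
        return True
    if value in _FALSE_VALUES:
        return False
    raise ValueError(
        f"{name} must be one of {sorted(_TRUE_VALUES | _FALSE_VALUES)}, got {raw!r}")


# Usage mirroring the diff, with a stand-in for the detected default.
default_use_cutlass = True
use_cutlass = parse_bool_env("USE_CUTLASS_MOE", default_use_cutlass,
                             env={"USE_CUTLASS_MOE": "off"})
print(use_cutlass)  # False
```

Whether to treat a blank value as "unset" or to reject it is a design choice. The sketch falls back to the default so that an exported but empty variable does not change behavior.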