Commit aa2fc04

tlrmchlsmth authored and Mu Huai committed
Revert "Fix non-contiguous input passed to Marlin kernel (vllm-project#15319)" (vllm-project#15398)
Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
1 parent b17ce95 commit aa2fc04

1 file changed: +0 -4 lines changed

vllm/model_executor/layers/quantization/kernels/mixed_precision/marlin.py

Lines changed: 0 additions & 4 deletions
@@ -115,10 +115,6 @@ def apply_weights(self,
                       layer: torch.nn.Module,
                       x: torch.Tensor,
                       bias: Optional[torch.Tensor] = None) -> torch.Tensor:
-        # marlin requires contiguous memory layout
-        # prefix caching may cause x to be non-contiguous
-        x = x.contiguous()  # no-op if already contiguous
-
         c = self.config
         w_q, w_s, w_zp, w_gidx = self._get_weight_params(layer)
 
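
For context on the guard this revert removes: the deleted comments say the Marlin kernel requires a contiguous memory layout. The sketch below shows what "contiguous" means in terms of shape and strides. It is a pure-Python illustration, not vLLM's or PyTorch's implementation. The helper names `row_major_strides` and `is_contiguous` are hypothetical, and the shapes are made-up examples rather than values from this commit.

```python
from typing import Sequence, Tuple


def row_major_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """Strides (in elements) of a densely packed row-major array of `shape`."""
    strides = []
    step = 1
    # Walk from the innermost dimension outward, accumulating the step size.
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def is_contiguous(shape: Sequence[int], strides: Sequence[int]) -> bool:
    """True if `strides` describe a dense row-major layout for `shape`.

    Dimensions of size 1 are skipped because their stride never
    affects which element is addressed.
    """
    expected = row_major_strides(shape)
    return all(dim == 1 or s == e
               for dim, s, e in zip(shape, strides, expected))


# Hypothetical (4, 8) buffer: densely packed, so it is contiguous.
full = is_contiguous((4, 8), (8, 1))

# A view of its first 4 columns keeps the parent's row stride of 8,
# leaving gaps between rows, so the view is not contiguous.
column_view = is_contiguous((4, 4), (8, 1))
```

In PyTorch, `x.contiguous()` copies a view like the second one into dense storage and returns the tensor itself when it is already contiguous, which is why the removed line was commented as a no-op in the common case.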
