Commit e1be425

Incorporate the new kernel changes by no longer specifying vmem_limit_bytes
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
1 parent f824e43 commit e1be425

2 files changed, 5 additions and 6 deletions

requirements/tpu.txt

Lines changed: 5 additions & 5 deletions
@@ -18,9 +18,9 @@ setuptools==78.1.0
 --find-links https://storage.googleapis.com/libtpu-releases/index.html
 --find-links https://storage.googleapis.com/jax-releases/jax_nightly_releases.html
 --find-links https://storage.googleapis.com/jax-releases/jaxlib_nightly_releases.html
-torch==2.9.0.dev20250710
-torchvision==0.24.0.dev20250710
-torch_xla[tpu, pallas] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.8.0.dev20250710-cp39-cp39-linux_x86_64.whl ; python_version == "3.9"
-torch_xla[tpu, pallas] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.8.0.dev20250710-cp310-cp310-linux_x86_64.whl ; python_version == "3.10"
-torch_xla[tpu, pallas] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.8.0.dev20250710-cp311-cp311-linux_x86_64.whl ; python_version == "3.11"
+torch==2.9.0.dev20250711
+torchvision==0.24.0.dev20250711
+torch_xla[tpu, pallas] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.9.0.dev20250711-cp39-cp39-linux_x86_64.whl ; python_version == "3.9"
+torch_xla[tpu, pallas] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.9.0.dev20250711-cp310-cp310-linux_x86_64.whl ; python_version == "3.10"
+torch_xla[tpu, pallas] @ https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.9.0.dev20250711-cp311-cp311-linux_x86_64.whl ; python_version == "3.11"
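In this hunk torch, torchvision, and torch_xla all move to the 20250711 nightly, and the torch_xla wheel also moves from 2.8.0 to 2.9.0. A reviewer who wants to confirm that the three pins carry the same date stamp could use something like the sketch below. The helper names `nightly_dates` and `pins_are_aligned` are hypothetical and are not part of vLLM; the pin strings are copied from the added lines, with the wheel URL shortened to its filename.

```python
import re

# Pins copied from the "+" lines of the requirements/tpu.txt hunk
# (the torch_xla wheel URL is shortened to its filename).
NEW_PINS = [
    "torch==2.9.0.dev20250711",
    "torchvision==0.24.0.dev20250711",
    "torch_xla-2.9.0.dev20250711-cp310-cp310-linux_x86_64.whl",
]


def nightly_dates(pins):
    """Return the set of devYYYYMMDD date stamps found in the pin strings."""
    dates = set()
    for pin in pins:
        match = re.search(r"\.dev(\d{8})", pin)
        if match is None:
            raise ValueError(f"no nightly date stamp in {pin!r}")
        dates.add(match.group(1))
    return dates


def pins_are_aligned(pins):
    """Return True when every pin carries the same nightly date."""
    return len(nightly_dates(pins)) == 1


print(pins_are_aligned(NEW_PINS))
```

The check only compares date stamps; it does not verify that the wheels exist or that the versions are compatible with one another.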

vllm/model_executor/layers/quantization/kernels/scaled_mm/xla.py

Lines changed: 0 additions & 1 deletion
@@ -97,7 +97,6 @@ def apply_weights(self,
 w_q,
 w_s,
 quantize_activation=True,
-vmem_limit_bytes=96 * 1024 * 1024,
 )

 # Explicitly capture control flow to make dynamo happy.
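The removed argument capped VMEM at 96 * 1024 * 1024 bytes, which is 96 MiB. The sketch below illustrates the before and after call shapes with a stand-in function. `kernel_stub` is hypothetical and is not the torch_xla kernel API; that the real kernel falls back to its own limit when the argument is omitted is an assumption drawn from the commit message, not something this diff shows.

```python
from typing import Optional

# The cap removed by this hunk, spelled out in bytes (96 MiB).
OLD_VMEM_LIMIT_BYTES = 96 * 1024 * 1024


def kernel_stub(quantize_activation: bool = True,
                vmem_limit_bytes: Optional[int] = None) -> str:
    """Hypothetical stand-in for the kernel entry point; NOT the torch_xla API.

    It only reports whether the caller pinned a VMEM limit.
    """
    if vmem_limit_bytes is None:
        # Assumption: the kernel picks its own limit when none is passed.
        return "kernel default"
    return f"caller cap: {vmem_limit_bytes // (1024 * 1024)} MiB"


# Call shape before the commit: the caller pins the limit.
before = kernel_stub(quantize_activation=True,
                     vmem_limit_bytes=OLD_VMEM_LIMIT_BYTES)

# Call shape after the commit: the argument is simply omitted.
after = kernel_stub(quantize_activation=True)

print(before)
print(after)
```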
