1 parent 2de12be commit aa20d10
vllm/v1/attention/backends/triton_attn.py
@@ -376,7 +376,7 @@ def forward(
query.reshape(
(num_tokens, num_heads * head_size)).contiguous(),
layer._q_scale)
- query = query.reshape((num_tokens, num_heads, head_size))
+ query = query.reshape((num_tokens, num_heads, head_size))

use_local_attn = \
(self.use_irope and attn_metadata.local_attn_metadata is not None)