Remove superfluous slicing of state_indices_tensor

nopperl · tdoublep · nopperl · commit ad8a800aa4d6 · 2025-09-03T07:29:18.000Z
Co-authored-by: Thomas Parnell <tom.parnell@gmail.com>
Signed-off-by: nopperl <54780682+nopperl@users.noreply.github.com>
diff --git a/vllm/model_executor/models/plamo2.py b/vllm/model_executor/models/plamo2.py
@@ -332,7 +332,7 @@ def forward_cuda(
                                          dim=0)
             # Split along batch dimension
             state_indices_tensor_d, state_indices_tensor_p = torch.split(
-                state_indices_tensor[:num_actual_tokens],
+                state_indices_tensor,
                 [num_decodes, num_prefills],
                 dim=0,
             )
