Commit dc02098
Avoid unnecessarily disabling CUDA graphs (#7302)
As discussed in PR #6766, CUDA graphs were being disabled in the presence of long prompts.
This fixes the issue by preventing the consecutive update counter from incrementing unnecessarily
for tokens in which CUDA graphs are disabled due to batch size > 1.
1 parent 344f912 commit dc02098
1 file changed, +1 -1 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 2558 | 2558 | |
| 2559 | 2559 | |
| 2560 | 2560 | |
| 2561 | | - |
| | 2561 | + |
| 2562 | 2562 | |
| 2563 | 2563 | |
| 2564 | 2564 | |
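The counter behaviour described in the commit message can be illustrated with a small, self-contained sketch. This is not the patched code: the names `GraphState`, `record_evaluation`, `graphs_survive_prompt`, the flag `count_only_when_enabled`, and the threshold of 4 are all assumptions made for illustration. The sketch assumes that a run of consecutive graph updates permanently disables CUDA graphs, and contrasts counting every update (the behaviour before the fix) with counting only updates on evaluations where graphs are actually in use (the behaviour after the fix).

```cpp
#include <cassert>

// Hypothetical bookkeeping; field names are illustrative, not taken from the patched file.
struct GraphState {
    int  consecutive_updates = 0;            // graph updates seen in a row
    bool disabled_too_many_updates = false;  // sticky once set
};

// Assumed threshold; the real value is not shown in this commit page.
constexpr int kMaxConsecutiveUpdates = 4;

// Record one evaluation.
// count_only_when_enabled == false models the behaviour before the fix,
// count_only_when_enabled == true models the behaviour after the fix.
void record_evaluation(GraphState &state, int batch_size, bool update_required,
                       bool count_only_when_enabled) {
    // Graphs are skipped for this evaluation when the batch size exceeds 1.
    const bool use_graph = !state.disabled_too_many_updates && batch_size <= 1;

    // After the fix, an update only counts if graphs are in use for this evaluation.
    const bool counts = update_required && (use_graph || !count_only_when_enabled);

    state.consecutive_updates = counts ? state.consecutive_updates + 1 : 0;
    if (state.consecutive_updates >= kMaxConsecutiveUpdates) {
        state.disabled_too_many_updates = true;
    }
}

// Simulate a prompt processed as a series of batch-size > 1 evaluations, each of
// which would require a graph update, and report whether graphs are still
// available afterwards.
bool graphs_survive_prompt(int prompt_evaluations, bool count_only_when_enabled) {
    assert(prompt_evaluations >= 0);
    GraphState state;
    for (int i = 0; i < prompt_evaluations; ++i) {
        record_evaluation(state, /*batch_size=*/8, /*update_required=*/true,
                          count_only_when_enabled);
    }
    return !state.disabled_too_many_updates;
}
```

Under these assumptions, `graphs_survive_prompt(8, false)` reports that graphs end up disabled after a long prompt, while `graphs_survive_prompt(8, true)` leaves them available for the single-token evaluations that follow.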