Commit c9c88ac
Avoid unnecessarily disabling CUDA graphs
As discussed in PR ggml-org#6766, CUDA graphs were being disabled in the presence of long prompts.
This fixes the issue by preventing the consecutive update counter from incrementing unnecessarily
for tokens in which CUDA graphs are disabled due to batch size > 1.

1 parent 583fd6b commit c9c88ac
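
The counter behaviour described in the commit message can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the code touched by this commit: the struct, the function names, the threshold of 4, and the assumption that every prompt chunk requests a graph update are all hypothetical. The `guard_counter` flag switches between counting every requested update and counting only updates on steps where a graph is actually in use.

```cpp
#include <cassert>

// Hypothetical, simplified model of the counter logic. All names and the
// threshold are assumptions for illustration, not the project's identifiers.
struct GraphState {
    int  consecutive_updates  = 0;     // graph updates seen in a row
    bool disabled_permanently = false; // set once the counter reaches the threshold
};

constexpr int kMaxConsecutiveUpdates = 4; // assumed threshold

// Evaluate one step. Returns true if a CUDA graph would be used for it.
// guard_counter == false: every requested update increments the counter.
// guard_counter == true:  only updates on graph-enabled steps increment it.
bool eval_step(GraphState & st, int batch_size, bool update_required, bool guard_counter) {
    assert(batch_size >= 1);
    if (st.disabled_permanently) {
        return false;
    }
    // Graphs are not used for this step when the batch size is greater than 1.
    const bool use_graph = (batch_size == 1);
    const bool count_update =
        guard_counter ? (use_graph && update_required) : update_required;
    if (count_update) {
        ++st.consecutive_updates;
    } else {
        st.consecutive_updates = 0;
    }
    if (st.consecutive_updates >= kMaxConsecutiveUpdates) {
        st.disabled_permanently = true;
    }
    return use_graph && !st.disabled_permanently;
}

// Simulate a long prompt (several chunks with batch size > 1, each assumed to
// request an update) followed by single-token generation. Returns whether the
// last generated token still used a graph.
bool graphs_survive_long_prompt(bool guard_counter) {
    GraphState st;
    for (int chunk = 0; chunk < 8; ++chunk) {
        eval_step(st, /*batch_size=*/512, /*update_required=*/true, guard_counter);
    }
    bool used = false;
    for (int tok = 0; tok < 16; ++tok) {
        used = eval_step(st, /*batch_size=*/1, /*update_required=*/tok == 0, guard_counter);
    }
    return used;
}
```

In this model, calling `graphs_survive_long_prompt(false)` leaves graphs disabled after the prompt because the prompt chunks alone push the counter to the threshold, while `graphs_survive_long_prompt(true)` keeps them available for generation.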
1 file changed, +1 -1 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 2558 | 2558 | |
| 2559 | 2559 | |
| 2560 | 2560 | |
| 2561 | | - |
| | 2561 | + |
| 2562 | 2562 | |
| 2563 | 2563 | |
| 2564 | 2564 | |