Commit 7cd5a1f
server : fix cache_tokens not getting correctly resized

Otherwise, when the "we have to evaluate at least 1 token" special case
was triggered, an extra token was kept in cache_tokens even if it was
removed from the KV cache.

For Mamba, this caused useless prompt reprocessing when the previous
request triggered the above case.

1 parent 916b586 commit 7cd5a1f
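The ordering problem described in the message can be illustrated with a small standalone sketch. This is not the server code: `Slot`, `common_prefix`, `reuse_prefix`, and the `resize_late` flag are hypothetical names introduced here, and the KV cache itself is only represented by a comment. It assumes the fix amounts to resizing `cache_tokens` after the "at least 1 token" decrement of `n_past` rather than before it.

```cpp
#include <cstddef>
#include <vector>

using token = int;

// Length of the shared prefix of two token sequences.
static std::size_t common_prefix(const std::vector<token>& a,
                                 const std::vector<token>& b) {
    std::size_t i = 0;
    while (i < a.size() && i < b.size() && a[i] == b[i]) {
        ++i;
    }
    return i;
}

// Hypothetical, stripped-down stand-in for a server slot.
struct Slot {
    std::vector<token> cache_tokens; // tokens believed to be in the KV cache
    std::size_t n_past = 0;          // number of cached tokens that get reused
};

// Returns how many prompt tokens still have to be evaluated.
// resize_late == false models the old order (resize before the special case),
// resize_late == true models the assumed fixed order (resize after it).
static std::size_t reuse_prefix(Slot& slot, const std::vector<token>& prompt,
                                bool resize_late) {
    slot.n_past = common_prefix(slot.cache_tokens, prompt);

    if (!resize_late) {
        slot.cache_tokens.resize(slot.n_past); // too early: n_past may still shrink
    }

    // Special case: at least 1 token must be evaluated to produce logits.
    if (slot.n_past == prompt.size() && slot.n_past > 0) {
        slot.n_past--;
    }

    // (The real server would drop KV cache entries from position n_past onward here.)

    if (resize_late) {
        slot.cache_tokens.resize(slot.n_past); // matches what the KV cache keeps
    }

    return prompt.size() - slot.n_past;
}
```

With a cached sequence identical to the new prompt, the early resize leaves `cache_tokens` one element longer than `n_past`, which is the stale extra token the message refers to. The late resize keeps the two in agreement.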
1 file changed, +3 -4 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 1797 | 1797 | |
| 1798 | 1798 | |
| 1799 | 1799 | |
| 1800 | | - |
| 1801 | | - |
| 1802 | | - |
| 1803 | 1800 | |
| 1804 | 1801 | |
| 1805 | 1802 | |
| | | |
| 1846 | 1843 | |
| 1847 | 1844 | |
| 1848 | 1845 | |
| 1849 | | - |
| 1850 | 1846 | |
| 1851 | 1847 | |
| 1852 | 1848 | |
| 1853 | 1849 | |
| | 1850 | + |
| | 1851 | + |
| | 1852 | + |
| 1854 | 1853 | |
| 1855 | 1854 | |
| 1856 | 1855 | |