Reverting generated output/user input! #604
Comments
Adding a retry through Ctrl+R could be great for interactive mode, especially chat mode, when the AI messes up. It would also be neat to expose this as part of the API.
You might want to check out https://github.com/LostRuins/llamacpp-for-kobold, which has this feature. It also caches the tokens from the previous prompt, avoiding the need to reprocess the whole prompt if you only want to retry a single sentence.
Is restoring the old token array enough to return to the same state?
I tested; it is not, unfortunately.
>>> i.append("he then shouted:")
>>> i.tokens
[1, 354, 769, 21272, 287, 29901]
>>> tk = i.tokens
>>> i.run('\n')
' `O Lord the Great King, protect our troops in battle\' [i:thanks: ] 2 Samuel. Chapter XX, vs 59:4, is the only mention found so. The word kashrat occurs, also without further elpbr,in. 35 of. 55, is. the passage of Eph. II., "Huseths the head, forasmuch a the sore is spread all about his bed" (i. E, Hag.. gon).. This word "bedd " appears without elabornt 7t twice ove:; r, "Jews shall dwell at JerUSAH.J 5 5 , the king and governrment oi New Mexico will be in the midst o? this great. nationality, which was in its nature of the kind most conducentive to a high tone a political organization on behon, a free country to adopt as an article " of rhe faith of its\'people.\' If our nation would take its stand, for\'ever,\' to\'protecr this, which. wii,is \'JiJJj. a n- " j Jjj \' -, I j f j n 27 "'
>>> i.tokens = tk
>>> i.run('\n')
'1'
>>> i.run('\n')
'Taub 5.05a-d.'
>>> i.tokens = tk
>>> i.run(' ')
'[url=\\U[/u]. In my heart he knows. he always did[br / [/l], the guarani in Parque San Martín was always my most cheried destination of those I was allowed access. Homepage of Dr P Rathin Trivedy\nPradeenam'
>>>
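The transcript above shows that restoring the token list alone does not reproduce the old output, because the model's KV cache and evaluation position also carry state. A minimal sketch of what a full "snapshot" would need to bundle is below; `Ctx`, `take_snapshot`, and `restore` are hypothetical names of mine, and `kv_state` stands in for an opaque copy of the cache (real bindings would use something like llama.cpp's `llama_copy_state_data`/`llama_set_state_data`).

```python
import copy
from dataclasses import dataclass

@dataclass
class Snapshot:
    tokens: list    # token IDs evaluated so far
    n_past: int     # how many of them are already in the KV cache
    kv_state: bytes # opaque copy of the model's KV cache (assumption)

def take_snapshot(ctx):
    # ctx is a hypothetical wrapper exposing these three attributes.
    return Snapshot(list(ctx.tokens), ctx.n_past, copy.deepcopy(ctx.kv_state))

def restore(ctx, snap):
    # Put back all three pieces of state, not just the token list.
    ctx.tokens = list(snap.tokens)
    ctx.n_past = snap.n_past
    ctx.kv_state = copy.deepcopy(snap.kv_state)
```

Restoring only `tokens` (as in the transcript) leaves the cache and position out of sync, which is why the continuation diverges.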
@LostRuins, are you interested in elaborating on how you achieved this in llamacpp-for-kobold? Or could you point towards the relevant code in your repo? Thanks for your work, btw. The kobold UI looks pretty clean and I'm definitely keeping an eye on it!
I'm trying to wrap my head around how this feature would be implemented for the interactive mode. It seems like you'd need to keep track of the last message, then catch the Ctrl+R signal, which should interrupt any ongoing generation/input and remove the last message from the context. Is it actually that simple? I'm about to go on vacation for a week, but I'll try to experiment with it when I can.
@horenbergerb Sure, so basically what I do is reuse the old KV state from the context. The important thing to note is the `n_past` value, which tracks how many tokens that state already covers. So once you've reused the old state, simply tokenize the old prompt together with the new one, and compare the IDs in both tokenized arrays. Start from the beginning, n_past = position 0. For each position that matches in both arrays, increment n_past by 1 until you reach a divergence. Those are the common tokens you can preserve from the old state. Then truncate your token history to that point and only evaluate the tokens after it.
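The comparison step described above can be sketched as a longest-common-prefix count over the two token arrays; the function name `common_prefix_len` is mine, not from either repo.

```python
def common_prefix_len(old_tokens, new_tokens):
    """Count the leading token IDs shared by both sequences.

    The result is the n_past value: tokens before this position are
    already covered by the old KV state, so only new_tokens[n_past:]
    need to be evaluated.
    """
    n_past = 0
    for a, b in zip(old_tokens, new_tokens):
        if a != b:
            break  # first divergence ends the reusable prefix
        n_past += 1
    return n_past
```

For example, if the old prompt tokenized to `[1, 354, 769, 21272]` and the retried prompt to `[1, 354, 900]`, only the two leading tokens are shared, so evaluation restarts from position 2.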
Thanks for the hint! Definitely going to mess with that.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Hey!
This is a feature request for reverting input/output. One example use case is retrying generation when the response wasn't as desired.
One way of implementing this could be to add the ability to create "snapshots", perhaps triggered via signals(?).
Thanks a lot
niansa