[Feature Request] --prompt-cache-all + user input #1398
Comments
Yeah, we punted on interactive mode support for now.
@ejones I'm confused about this as well. Could you please provide an example of how to use prompt-cache-all without using interactive mode? Can I use it like this:
Or is this different from what I thought? Thanks
so what you're saying is that I can quickly hack up a bash script and have a pseudo-persistent bot?
At a basic level, the way to leverage this is to feed back the output of one call to main as the prompt of the next call.
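To make that concrete, here is a minimal sketch of the feedback pattern (model path, prompt text, and file names are placeholders; it assumes main echoes the prompt plus its completion to stdout and writes logs to stderr):

```sh
#!/usr/bin/env bash
# Turn 1: evaluate the prompt, generate a reply, and save the evaluated
# tokens to a prompt cache file.
./main -m ./models/7B/ggml-model-q4_0.bin \
    --prompt-cache chat.cache --prompt-cache-all \
    -p "User: My name is Bob. What is the capital of France?
Assistant:" -n 64 2>/dev/null > turn1.txt

# Turn 2: the new prompt is the previous output (original prompt plus the
# model's reply) with the next user message appended. Tokens already in
# chat.cache are reloaded instead of being re-evaluated.
./main -m ./models/7B/ggml-model-q4_0.bin \
    --prompt-cache chat.cache --prompt-cache-all \
    -p "$(cat turn1.txt)
User: And what is my name?
Assistant:" -n 64 2>/dev/null > turn2.txt
```

The full transcript is re-submitted on every call; the cache only avoids recomputing the prefix that was already evaluated.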
@ejones Thanks for the link, I'll test it with my own prompt, so it's useful for generating stories.
@ejones I wondered why --prompt-cache-all does not save the last message generated by the LLM, so that we have to supply that message again. Wouldn't it be better if it also saved the LLM's message, so that in a back-and-forth chat session we could just add another question instead of copying the last message generated by the LLM? Sorry if I'm wrong about this 😄
Yeah, I tried a version where it restored and appended to the saved prompt, but I didn't want to have to rely on the contents of the prompt cache. There's no way to inspect prompt caches (yet) and there may be cases where they don't get saved or get corrupted. So for now, the prompt argument is the source of truth and the prompt cache is just a cache. The use case I envision for this is for a script / app to manage the chat session etc. rather than repeatedly invoking main on the command line. The example I'm preparing now will illustrate this.
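As a rough illustration of that idea, a wrapper script could keep the transcript in a file and treat it as the source of truth, with the prompt cache only speeding up re-evaluation. Everything below (file names, chat template, prefix-stripping) is a sketch under those assumptions, not the example being prepared:

```sh
#!/usr/bin/env bash
MODEL=./models/7B/ggml-model-q4_0.bin
TRANSCRIPT=transcript.txt
CACHE=chat.cache

: > "$TRANSCRIPT"
while read -r -p "You: " user_msg; do
    printf 'User: %s\nAssistant:' "$user_msg" >> "$TRANSCRIPT"

    # Re-submit the whole transcript each turn; tokens already in the cache
    # are not re-evaluated, so cost stays roughly proportional to the new text.
    reply=$(./main -m "$MODEL" --prompt-cache "$CACHE" --prompt-cache-all \
        -f "$TRANSCRIPT" -n 128 2>/dev/null)

    # main echoes the prompt before the completion, so strip the transcript
    # prefix to recover just the new text (may need tweaking for whitespace).
    new_text=${reply#"$(cat "$TRANSCRIPT")"}
    echo "Assistant:$new_text"
    printf '%s\n' "$new_text" >> "$TRANSCRIPT"
done
```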
ok @ejones thanks
@ejones Looks like the PRs have been merged. Could you explain how to use this new feature?
Good one, I'm interested in it too
For the persistent chat script, I have a PR up at #1568 with docs on its usage. The prompt-cache workflow discussed above is demonstrated in that script.
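For reference, the script in that PR is driven by environment variables; a usage sketch along these lines (variable names are taken from the PR docs and may differ in the merged version):

```sh
# Start or resume a chat whose prompt cache and transcript live on disk.
PROMPT_CACHE_FILE=chat.prompt.bin CHAT_SAVE_DIR=./chat/default \
    ./examples/chat-persistent.sh

# A second, independent chat with its own cache and save directory.
PROMPT_CACHE_FILE=bob.prompt.bin CHAT_SAVE_DIR=./chat/bob \
    ./examples/chat-persistent.sh
```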
@ejones So is there now no need to add the last output from llama to the new request when using --prompt-cache-all?
@divinity76, I do not believe it is as simple as your pseudo-example would suggest. For example, when I run:
I would expect to be able to append a prompt to the cached prompt.
The cached prompt is not loaded, and there is no previous context from which the response could correctly be "Bob.". The script outputs:
oh ok sorry, i may be wrong and i don't have time to investigate, nevermind
This issue was closed because it has been inactive for 14 days since being marked as stale.
I noticed '--prompt-cache-all' and '--prompt-cache' as the replacement for '--session', but '--prompt-cache-all' does not support user input. Why not? And why not store only the tokens in the context window? I would like to resume input/output with the model from a file; this would be sort of like persistent memory, and it would be awesome!