Closed
Labels
enhancement (New feature or request)
Description
Opening this up to track the development of the new caching behaviour I'm planning to implement. This will leverage two significant improvements:
- Reduced llama state size, which is now a function of the number of evaluated tokens
- Improved efficiency of Llama.generate, which now only evaluates prompt tokens that are not already in the context window
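To make the two points above concrete, here is a minimal sketch of prefix reuse. It is not the library's implementation: `ToyContext`, `feed_prompt`, `save_state` and `common_prefix_len` are hypothetical names invented for illustration, and tokens are plain integers rather than real model tokens.

```python
from typing import List, Sequence


def common_prefix_len(cached: Sequence[int], prompt: Sequence[int]) -> int:
    """Return how many leading tokens of `prompt` already match `cached`."""
    n = 0
    for a, b in zip(cached, prompt):
        if a != b:
            break
        n += 1
    return n


class ToyContext:
    """Hypothetical stand-in for a model context (assumption, not a real API)."""

    def __init__(self) -> None:
        self.tokens: List[int] = []  # tokens currently held in the context
        self.evaluated = 0           # total tokens evaluated so far

    def eval(self, new_tokens: Sequence[int]) -> None:
        # A real model would run a forward pass here; we only count the work.
        self.tokens.extend(new_tokens)
        self.evaluated += len(new_tokens)

    def feed_prompt(self, prompt: Sequence[int]) -> int:
        """Evaluate only the part of `prompt` not already in the context."""
        keep = common_prefix_len(self.tokens, prompt)
        del self.tokens[keep:]          # drop stale tokens after the shared prefix
        pending = list(prompt[keep:])   # tokens that still need evaluating
        self.eval(pending)
        return len(pending)

    def save_state(self) -> List[int]:
        """Snapshot whose size grows with the number of tokens in the context."""
        return list(self.tokens)
```

With this sketch, feeding `[1, 2, 3, 4]` and then `[1, 2, 3, 4, 5, 6]` evaluates four tokens and then only two, and a saved state holds only the tokens in the context rather than a fixed-size buffer. A real cache would also need to handle a prompt that is already fully evaluated, which this toy version simply treats as zero pending tokens.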