With the llama-cpp model, after a few chat interactions we may hit a ValueError('Requested tokens (...) exceed context window of 4096') error. Any messages after this are then met with an AssertionError() (this is because llama-index's messages_to_prompt function expects alternating user and assistant chat messages and has assert statements to check this).
Note that we can avoid this by clearing the chat history via a Slack shortcut (see #97), but a better approach might be to start dropping old chat history so there is always enough space and we never error out. Essentially, some automatic forgetting.
Maybe this change belongs in llama-index rather than here, but it's something to consider.
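As a rough sketch of what that automatic forgetting could look like (not tied to llama-index internals; the `ChatTurn` class, `count_tokens` placeholder, and `forget_old_turns` helper below are all hypothetical), we could trim the oldest user/assistant pairs before building the prompt:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChatTurn:
    role: str  # "user" or "assistant"
    content: str


def count_tokens(text: str) -> int:
    # Placeholder: crude whitespace count. In practice you'd use the model's
    # own tokenizer (e.g. llama-cpp's tokenize()) so the count matches what
    # the context-window check actually sees.
    return len(text.split())


def forget_old_turns(history: List[ChatTurn],
                     new_message: str,
                     context_window: int = 4096,
                     reserve_for_reply: int = 512) -> List[ChatTurn]:
    """Drop the oldest user/assistant pairs until everything fits.

    Dropping whole pairs keeps the history alternating, so the
    messages_to_prompt assertions still hold.
    """
    budget = context_window - reserve_for_reply - count_tokens(new_message)
    trimmed = list(history)
    while trimmed and sum(count_tokens(t.content) for t in trimmed) > budget:
        # Remove the oldest pair (user + assistant) in one go.
        trimmed = trimmed[2:]
    return trimmed
```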