Describe the bug
The interface lags once a conversation gets a little long (around 10-20 messages with full code text, say about 5000 tokens per message). When this happens, the chat box only slowly catches up to my typing, and typed text can take several seconds to start appearing. The problem goes away if I start a new chat, so I suspect the chat history is to blame (see the sketch below).
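I don't know Twinny's webview internals, so take this purely as a hedged guess at the mechanism: if the input box and the rendered message history live in the same component, every keystroke would re-render (and re-highlight) the entire history, which would match lag that grows with conversation length. A minimal React/TypeScript sketch of that pattern and the kind of memoization that would decouple typing from history length (the component and type names here are made up, not Twinny's actual code):

```tsx
// Hypothetical sketch, NOT Twinny's actual code.
import React, { memo, useState } from "react"

interface Message {
  role: "user" | "assistant"
  content: string
}

// Memoized history: only re-renders when the messages array changes,
// not on every keystroke in the input box.
const MessageList = memo(function MessageList({ messages }: { messages: Message[] }) {
  return (
    <div>
      {messages.map((m, i) => (
        <pre key={i}>{m.content}</pre>
      ))}
    </div>
  )
})

export function ChatView({ messages }: { messages: Message[] }) {
  const [draft, setDraft] = useState("")
  return (
    <div>
      <MessageList messages={messages} />
      {/* Typing only updates `draft`; the memoized list above is untouched. */}
      <textarea value={draft} onChange={(e) => setDraft(e.target.value)} />
    </div>
  )
}
```

Again, just a guess at the mechanism, not a claim about the actual cause.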
To Reproduce
Use the chat with a local LLM running on oobabooga and let the conversation grow to around 10-20 long messages.
Expected behavior
No change in the interface no matter how long the history gets. I understand the LLM's response may lag or even break if token limits are reached, but the response itself is not my issue: as soon as the lag clears and my text is in the box, I can hit enter and get the same token rate as when nothing is lagging.
Screenshots
N/A
API Provider
oobabooga
Chat or Auto Complete?
Chat
Model Name
any
Desktop (please complete the following information):
OS: macOS
I can confirm this on Mac as well, using Twinny chat against a locally networked Ubuntu system with Ollama serving llama3.
Text input starts lagging once the conversation reaches roughly 400 tokens / 2000 characters.
The "Code Helper (Renderer)" process spikes to 80%+ CPU until the typed text has caught up, then drops back to idle at around 40%.
The lag doesn't recover unless a new chat is started, and it instantly kicks in again if the old chat is loaded from history.
No local Ollama is installed, FWIW.
A temporary workaround is to instruct the bot to state a warning when the token/character count reaches the threshold (probably different for each system), e.g. "warn me once this conversation passes roughly 2000 characters of history", and then switch to a new chat.
Thank you for all of your hard work in creating this amazing extension!
I can also confirm this is an issue I have experienced for months. Unfortunately, it is the single usability issue preventing me from using Twinny in my daily flow, as the sluggishness greatly hinders continuous interaction with the plugin. By that point I'm generally deep in a conversation with rich context, and having to start a new conversation and re-establish that context is quite unpleasant. See details below, and thank you again!
API Provider
Ollama
Chat or Auto Complete?
Chat
Model Name
any
OS
Ubuntu 22.04
Twinny Version
Last Updated: 2024-09-17, 22:24:26
VS Code Info
Version: 1.93.1
Commit: 38c31bc77e0dd6ae88a4e9cc93428cc27a56ba40
Date: 2024-09-11T17:20:05.685Z
Electron: 30.4.0
ElectronBuildId: 10073054
Chromium: 124.0.6367.243
Node.js: 20.15.1
V8: 12.4.254.20-electron.0
OS: Linux x64 6.8.12-1-pve
Yeah, I love this extension, but the chat interface is terribly slow once a conversation gets going. Is there any workaround for this other than starting a new chat?