Closed as not planned
Labels
Issue - Unassigned / Actionable: Clear and approved. Available for contributors to pick up.
bug: Something isn't working
Description
App Version
3.16.6
API Provider
LM Studio
Model Used
Qwen3-32B
🔁 Steps to Reproduce
I am trying to use a local Qwen3-32B model via llama.cpp. To do so, I point the LM Studio integration at the local server. Everything works, but after 10 minutes (600 seconds) the connection is dropped and I get an "API Request Failed" message. Inference runs on CPU and is quite slow, but I would be happy to let it crunch while I do something else. If I use the tiny Qwen3-0.6B model, inference is fast enough and everything works as expected (although with very mediocre results).
When the request fails, llama.cpp finishes processing the prompt anyway, so a retry succeeds because the prompt is already cached.
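To illustrate why the large model hits the drop point while the tiny one does not, here is a minimal back-of-the-envelope sketch. All throughput and token counts are illustrative assumptions, not measurements from this report; the only number taken from the issue is the 600-second cutoff.

```python
# Back-of-the-envelope check: does a response finish before a fixed
# request timeout? The throughput numbers below are assumptions for
# illustration, not measurements from this report.

TIMEOUT_S = 600  # the observed drop point (10 minutes)

def finishes_in_time(prompt_tokens, output_tokens,
                     prefill_tps, decode_tps, timeout_s=TIMEOUT_S):
    """Estimate total generation time and compare it to the timeout."""
    total_s = prompt_tokens / prefill_tps + output_tokens / decode_tps
    return total_s <= timeout_s, total_s

# A 32B-class model on CPU: assume ~20 tok/s prefill, ~2 tok/s decode.
ok_32b, t_32b = finishes_in_time(4000, 1000, prefill_tps=20, decode_tps=2)

# A 0.6B-class model: assume ~400 tok/s prefill, ~50 tok/s decode.
ok_06b, t_06b = finishes_in_time(4000, 1000, prefill_tps=400, decode_tps=50)

print(ok_32b, round(t_32b))  # the slow run overshoots the 600 s budget
print(ok_06b, round(t_06b))  # the fast run finishes comfortably
```

Under these assumed rates the 32B run needs roughly 700 s, past the cutoff, while the 0.6B run needs about 30 s, which matches the reported behavior of the integration's fixed request timeout.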
💥 Outcome Summary (Optional)
No response
📄 Relevant Logs or Errors
Metadata
Assignees
Status: Done