Support for Kokoro-FastAPI TTS Model #8685
Replies: 9 comments 25 replies
-
I'd like to add that the ONNX version of Kokoro is fantastic: the models are pretty small, sound really good, and run fast even on CPU. I would personally love to see this built into Open WebUI.
-
Seems to be working fine in v0.5.6. Settings: Text-to-Speech Engine: OpenAI
-
Is there any way to set the speech speed? It is a parameter like "voice", so it should probably be on the settings page as well.
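For reference, OpenAI's speech endpoint accepts an optional `speed` field in the request body (documented range 0.25 to 4.0, default 1.0), so a client talking to an OpenAI-compatible server could in principle send it alongside `voice`. A minimal sketch of such a payload, assuming Kokoro-FastAPI honors the same field; the model name `kokoro` and voice `af_bella` are placeholder assumptions, and `build_speech_payload` is a hypothetical helper, not something Open WebUI provides:

```python
import json

# Range documented for the `speed` field of OpenAI's /v1/audio/speech endpoint.
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def build_speech_payload(text, voice="af_bella", speed=1.0, model="kokoro"):
    """Build the JSON body for an OpenAI-style speech request.

    The `model` and `voice` defaults are placeholders (assumptions); use
    whatever names your TTS server actually exposes.
    """
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    return {"model": model, "input": text, "voice": voice, "speed": speed}


if __name__ == "__main__":
    # Show the body a client would POST for slightly faster speech.
    print(json.dumps(build_speech_payload("Hello there.", speed=1.25), indent=2))
```

Whether a settings page exposes this field is a separate question; the sketch only shows where the value would travel in the request.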
-
Can you be more specific about when you expect audio output, please?
Attached are screen grabs of what's working here. All I do is click on the
speaker icon to get an LLM response spoken.
Cheers,
Rick Emerson
On Wed, Feb 5, 2025 at 3:51 PM, Omni-NexusAI wrote:
> Did any of you guys who had problems getting it to work manage to find a
> way to get audio output?
-
FWIW, I moved to Kokoro FastAPI v0.1.5-pre (current release is v0.1.4), and it seems to be working well. There are two points in Open WebUI Audio which need attention:
-
I've been using https://github.com/remsky/Kokoro-FastAPI since before the integration was even done, and it was already working. Both the CPU and GPU versions work like a charm.
-
When using the included Kokoro integration, it seems to use only the CPU: my VRAM usage stays very low and RAM sits at about 20 GB (no LLM loaded in Ollama yet; I'm just trying to listen to old messages).
-
At this point, anything touching TTS, from Open WebUI to Win 11, has been updated, with no change.
-
Feature Request
Issue: Kokoro-FastAPI is designed to be compatible with OpenAI's speech endpoints, so in theory it should work out of the box. However, when I enter its values in the corresponding fields of Open WebUI's OpenAI TTS settings, there is no audio output after the LLM generates its response. This suggests that a backend adjustment is needed before Kokoro-FastAPI can be used in Open WebUI's advanced voice mode.
Benefit: Kokoro-FastAPI provides high-quality, fast, and locally runnable text-to-speech. This feature would expand Open WebUI's functionality, enhancing user experience.
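One way to narrow down where the audio goes missing is to call the speech endpoint directly, outside Open WebUI. A rough sketch using only the Python standard library; the base URL `http://localhost:8880/v1`, the model name `kokoro`, and the voice `af_bella` are assumptions about a typical local Kokoro-FastAPI setup, and `make_speech_request` and `fetch_speech` are hypothetical helper names:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8880/v1"  # assumption: adjust to your server


def make_speech_request(text, base_url=BASE_URL, model="kokoro", voice="af_bella"):
    """Prepare (but do not send) a POST to the OpenAI-style speech endpoint."""
    body = json.dumps(
        {"model": model, "input": text, "voice": voice, "response_format": "mp3"}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/audio/speech",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_speech(text, out_path="test.mp3"):
    """Send the request and write the returned audio bytes to a file."""
    with urllib.request.urlopen(make_speech_request(text), timeout=60) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)


if __name__ == "__main__":
    # Requires a running server at BASE_URL; raises URLError otherwise.
    print(fetch_speech("Testing Kokoro text to speech."), "bytes written")
```

If this writes a playable file, the server side is probably fine and the remaining suspects are the URL, model, and voice values entered in Open WebUI's audio settings.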