
Streaming #9

Open
braco opened this issue Jul 26, 2024 · 1 comment
braco commented Jul 26, 2024

Wouldn't it be better to use streaming interfaces in both the LLM and speech systems?

For example:

elevenlabs/elevenlabs-js#4 (comment)

Vercel should support this:

https://vercel.com/docs/functions/streaming


athrael-soju commented Jul 28, 2024

To stream via Groq you'd want to set `stream: true` in the completions call. But then you face the challenge of how TTS will handle the stream. You can't begin synthesis immediately, because that means you'd attempt to generate speech from just a few tokens that may not form a complete sentence. This will sound very bad to the user.

If you choose to tokenize the stream chunks into sentences, you'll have to add logic to queue/dequeue sentences as they arrive before sending them for synthesis. This would work, but it adds computational overhead and complexity. I added something like this in an older project.
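A minimal sketch of that sentence-buffering idea, assuming the stream arrives as plain text deltas (as it would with `stream: true`); the sentence-boundary regex here is deliberately naive and a real project would want a proper tokenizer:

```python
import re

# Naive sentence boundary: ., !, or ? followed by whitespace.
SENTENCE_END = re.compile(r"([.!?])\s")

def sentences_from_stream(chunks):
    """Accumulate streamed text chunks and yield complete sentences.

    `chunks` stands in for the content deltas of a streaming
    completion; any leftover text is flushed at the end.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            match = SENTENCE_END.search(buffer)
            if not match:
                break
            # Emit everything up to and including the punctuation mark.
            yield buffer[: match.end(1)].strip()
            buffer = buffer[match.end():]
    if buffer.strip():
        yield buffer.strip()
```

Each yielded sentence could then be queued for synthesis while the rest of the completion is still streaming in, which is exactly the queue/dequeue bookkeeping described above.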

The best-case scenario here: since Groq is really fast, send the full text response to the speech API and just stream the speech itself back to the front end.
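A sketch of that relay, assuming the speech API has already returned the audio as bytes (the chunk size is an arbitrary choice); a chunking generator like this is what a framework's streaming response type would typically wrap:

```python
def stream_audio(audio: bytes, chunk_size: int = 4096):
    """Yield fixed-size chunks of synthesized audio so a web
    framework can stream them to the front end as they are read."""
    for offset in range(0, len(audio), chunk_size):
        yield audio[offset : offset + chunk_size]
```

On the front end, the browser can start playback as soon as the first chunks arrive, so the user hears speech before the full audio has been transferred.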
