Streaming support #16

matbee-eth · 2023-08-15T16:40:46Z

Have you thought about / planned a way to support streaming audio instead of sending the entire audio clip? If it's not currently supported, how would you solve it? I would appreciate some guidance so I can send a proper PR to support streaming, if possible.

xenova · 2023-08-15T16:46:09Z

At the moment (w/ WASM backend), the latency for the encoder is just too much to do real-time streaming. Fortunately, the onnxruntime-web team have been busy improving their WebGPU backend, and it's at a stage where we can do testing with it now.

So, we hope to add support for it soon! If you're up for the challenge, you can fork transformers.js, build onnxruntime-web from source w/ webgpu support, and replace the import with the custom onnxruntime-web build.

matbee-eth · 2023-08-15T16:51:52Z

By real-time I actually just mean streaming of an audio source (mic-in, generic audio out, file data, etc.) -> whisper, making it as real-time as the tech allows, really. So basically, chunk up the audio into ~5-30 second pieces and simply queue up the chunks to transcribe.

I'll look into the webgpu, I was planning on seeing if this project would work with webgpu, so I'll take a look
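
For illustration, the chunking step described above could look roughly like the following. This is only a sketch, not code from this project: `chunkAudio` is a hypothetical helper, and it assumes the audio is already available as mono PCM samples in a `Float32Array` (for example, 16 kHz audio, which is what Whisper models usually expect).

```typescript
// Hypothetical helper (not part of this project): split mono PCM samples
// into fixed-length chunks. The final chunk may be shorter than the rest.
function chunkAudio(
  samples: Float32Array,
  sampleRate: number,
  chunkSeconds: number,
): Float32Array[] {
  const chunkSize = Math.floor(sampleRate * chunkSeconds);
  if (!Number.isFinite(chunkSize) || chunkSize < 1) {
    throw new RangeError("sampleRate * chunkSeconds must be at least 1 sample");
  }

  const chunks: Float32Array[] = [];
  for (let start = 0; start < samples.length; start += chunkSize) {
    const end = Math.min(start + chunkSize, samples.length);
    // subarray() returns a view on the same buffer, so no samples are copied.
    chunks.push(samples.subarray(start, end));
  }
  return chunks;
}
```

With 16 kHz audio, `chunkAudio(samples, 16000, 30)` would yield 30-second chunks of 480,000 samples each, plus a shorter remainder. Because the chunks are views, they should be copied (e.g. with `slice()`) before being transferred to a worker.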

xenova · 2023-08-15T17:02:50Z

Well yes, you can just send 30-second chunks of audio to whisper, but as stated above, you won't get a response for at least 1-2 seconds due to the encoder latency. Then on top of that, depending on how much you are decoding, you'd have to wait for the full chunk to be decoded before merging it with the current predicted text.

That said, I do think this will be feasible with webgpu, at which point I'll probably take a look at this (unless you'd be interested in starting, working with the wasm backend for now).
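
A minimal sketch of the queue-and-merge flow discussed in this thread, under stated assumptions: `TranscriptionQueue` and the `transcribe` callback are hypothetical names, and `transcribe` stands in for whatever model call is actually used (it is not an API of this project). Chunks are processed strictly one at a time, so each chunk's text is only merged after that chunk has been fully decoded, which is where the latency mentioned above shows up.

```typescript
// Hypothetical stand-in for the real model call; returns the text of one chunk.
type Transcribe = (chunk: Float32Array) => Promise<string>;

class TranscriptionQueue {
  private transcribe: Transcribe;
  private onUpdate: (text: string) => void;
  private parts: string[] = [];
  // Promise chain that serializes work: each job starts after the previous one ends.
  private tail: Promise<void> = Promise.resolve();

  constructor(transcribe: Transcribe, onUpdate: (text: string) => void = () => {}) {
    this.transcribe = transcribe;
    this.onUpdate = onUpdate;
  }

  // Queue a chunk; the returned promise settles once this chunk has been merged.
  enqueue(chunk: Float32Array): Promise<void> {
    const job = this.tail.then(async () => {
      const text = (await this.transcribe(chunk)).trim();
      if (text.length > 0) this.parts.push(text);
      this.onUpdate(this.text);
    });
    // Keep the chain alive even if one chunk fails; the caller still sees the error via `job`.
    this.tail = job.catch(() => {});
    return job;
  }

  // Transcript merged so far, in the order the chunks were enqueued.
  get text(): string {
    return this.parts.join(" ");
  }
}
```

This naive merge simply joins chunk texts with spaces. Words cut at a chunk boundary would need overlapping chunks and a smarter merge, which this sketch does not attempt.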
