
Faster whisper serverless template #10

Open
aaa3334 opened this issue Nov 12, 2024 · 4 comments

aaa3334 commented Nov 12, 2024

Hi!

I happily stumbled into your video on faster-whisper and learnt runpod is a thing and that they have serverless. I am wondering if you have a guide or template on how to set up faster whisper serverless? Or if it is the same as eg the Ministral one you set up?

@RonanKMcGovern (Contributor)

Yeah, so you can run serverless on RunPod like this:

[screenshot: RunPod serverless setup]

However, this doesn't do dynamic batching; it spins up a new worker for each request. I haven't found an open-source library yet that easily allows dynamic batching.

But if you're only sending small batches anyway, this will do what you need.
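For reference, a minimal sketch of what such a serverless worker can look like. The handler wiring below is plain Python; the `runpod` and `faster-whisper` calls (shown in comments) are assumptions about how you'd wire it up, not the template's actual code, and `make_handler`/`transcribe_fn` are illustrative names.

```python
# Hypothetical sketch of a RunPod serverless handler wrapping faster-whisper.
# `make_handler` and `transcribe_fn` are illustrative names, not template APIs.

def make_handler(transcribe_fn):
    """Build a handler that transcribes the audio referenced in the job input."""
    def handler(job):
        audio = job["input"]["audio"]      # e.g. a URL or mounted file path
        segments = transcribe_fn(audio)    # yields text chunks
        return {"transcription": " ".join(s.strip() for s in segments)}
    return handler

# In a real worker this would roughly be:
#   import runpod
#   from faster_whisper import WhisperModel
#   model = WhisperModel("large-v3-turbo")
#   def transcribe(audio):
#       segments, _info = model.transcribe(audio)
#       return (seg.text for seg in segments)
#   runpod.serverless.start({"handler": make_handler(transcribe)})

# Quick local check with a stub transcriber:
print(make_handler(lambda a: ["Hello", "world"])({"input": {"audio": "x.mp3"}}))
# {'transcription': 'Hello world'}
```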

The other issue is that there's a default model set, and it's not the turbo model :( so you also have to figure out how to swap that (which may require either adding env variables OR possibly rebuilding the Docker image).
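If the template does read the model name from the environment, swapping to turbo could be as simple as overriding one variable. A sketch of that pattern follows; note that `WHISPER_MODEL` and the fallback default are assumptions, not the template's actual configuration keys:

```python
import os

# Hypothetical: pick the model from an env var, falling back to a default.
# "WHISPER_MODEL" and the "large-v2" fallback are assumed names, not the
# template's real settings.
def resolve_model_name(env=None):
    env = os.environ if env is None else env
    return env.get("WHISPER_MODEL", "large-v2")

print(resolve_model_name({"WHISPER_MODEL": "large-v3-turbo"}))  # large-v3-turbo
print(resolve_model_name({}))                                   # large-v2
```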

I'm going to do a video soon on setting up endpoints. I'll see if I can do something on serverless; it depends how much work/digging it takes.


aaa3334 commented Nov 13, 2024

Thanks for your reply!
I tried the default FasterWhisper template, but as you mentioned, it doesn't have the turbo model. I was looking at rebuilding it using the Hugging Face checkpoint at https://huggingface.co/mobiuslabsgmbh/faster-whisper-large-v3-turbo/tree/main
but I'm unsure what the settings for that would look like. The RunPod team says they make it easy, but their documentation seems aimed at people already used to setting up VMs. I'm familiar with Hugging Face, DigitalOcean, etc., but I'm no expert and only got those set up by following guides. (I am very familiar with Docker, though it doesn't always seem like the best solution for endpoints: e.g. on Hugging Face you have Gradio, which is much easier and more lightweight than setting up a full Docker container, which feels like overkill for one endpoint.)
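For what it's worth, faster-whisper can load a model from a Hugging Face repo id directly, so a rebuilt worker's model setup might look roughly like the sketch below. The `WhisperModel` calls are shown as comments since they need the package and a GPU; the segment-joining helper underneath is plain, purely illustrative Python:

```python
# Hypothetical sketch (assumes the faster-whisper package and a CUDA GPU):
#
#   from faster_whisper import WhisperModel
#   model = WhisperModel(
#       "mobiuslabsgmbh/faster-whisper-large-v3-turbo",
#       device="cuda",
#       compute_type="float16",
#   )
#   segments, info = model.transcribe("audio.mp3")

def join_segments(segments):
    """Collapse (start, end, text) tuples into a single transcript string."""
    return " ".join(text.strip() for _start, _end, text in segments)

print(join_segments([(0.0, 1.2, " Hello"), (1.2, 2.4, "world ")]))  # Hello world
```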

That would be really cool! For me, right now the serverless options seem like the way to go (otherwise I feel I could just use Hugging Face's interface, which I already know how to set up).
It's really cool to see all these different ways to do things more easily, and I'm so happy I ran into RunPod on your channel :)

@RonanKMcGovern (Contributor)

RonanKMcGovern commented Nov 13, 2024 via email

@RonanKMcGovern changed the title from "Faster whisper template" to "Faster whisper serverless template" on Jan 11, 2025
@RonanKMcGovern (Contributor)

Just a note that I likely won't have time to get around to this, but I'll leave it open and have renamed it in case someone else finds this and can spend the time to build a faster-whisper RunPod worker that supports turbo.

FWIW, Fireworks provides a very fast transcription service that, AFAIK, can take files up to 1 GB. So that may be an interim option.
