support batch processing #57

Merged: 1 commit merged into m-bain:main on Feb 1, 2023
Conversation

TengdaHan
Contributor

What it is: Parallelize the transcribe function. It gives at least a 2x speedup on transcription.
TODO: support temperature sweeping under batch processing.

@m-bain
Owner

m-bain commented Feb 1, 2023

epic thank you tengda

m-bain merged commit 29e95b7 into m-bain:main on Feb 1, 2023
@Barabazs
Contributor

Barabazs commented Feb 2, 2023

Hi @TengdaHan, can you clarify how we should choose an optimal batch size?

It seems that a batch_size of 16 would divide the audio into 16 equally sized chunks. Does that mean those 16 chunks are processed in parallel?

@TengdaHan
Contributor Author

TengdaHan commented Feb 2, 2023

@Barabazs It's the other way around: a batch_size of 16 splits the audio into X batches, each containing 16 audio segments, and each batch is transcribed together in a single forward pass: https://github.com/m-bain/whisperX/blob/main/whisperx/transcribe.py#L415
A general rule is to pick the largest batch_size that still fits in your GPU memory. I suggest starting with 16 or 32.
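
For illustration, here is a minimal sketch of the batching scheme described above. The function name and the `transcribe_batch` callable are hypothetical stand-ins, not whisperX's actual API; the real logic lives in `whisperx/transcribe.py`.

```python
# Minimal sketch (assumed names, not whisperX's actual code): group audio
# segments into consecutive batches of `batch_size` and transcribe each
# batch in one call, instead of one segment at a time.
from typing import Callable, List


def batched_transcribe(
    segments: List[str],
    transcribe_batch: Callable[[List[str]], List[str]],
    batch_size: int = 16,
) -> List[str]:
    """Split `segments` into batches of up to `batch_size` items and
    run each batch through `transcribe_batch` together."""
    results: List[str] = []
    for i in range(0, len(segments), batch_size):
        batch = segments[i:i + batch_size]       # up to `batch_size` segments
        results.extend(transcribe_batch(batch))  # one batched forward pass
    return results


# Toy usage: 40 segments with batch_size=16 -> 3 batches (16, 16, 8).
if __name__ == "__main__":
    dummy = lambda batch: [f"text for {s}" for s in batch]
    out = batched_transcribe([f"seg{i}" for i in range(40)], dummy, batch_size=16)
    print(len(out))  # 40
```

So increasing batch_size does not change how the audio is cut; it only changes how many segments share each forward pass, which is why GPU memory is the limiting factor.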
