Inconsistent Transcription with Whisper Turbo on Kubernetes Using NVIDIA T4 GPU #2503
abhijith-zupaloop asked this question in Q&A (Unanswered)
I am experiencing inconsistent transcription results when running the Whisper Turbo model on a Kubernetes node equipped with an NVIDIA T4 (16 GB) GPU. The same model, run on my laptop with an RTX 2000 Ada (8 GB) GPU, produces accurate transcripts without any issues.
On the Kubernetes setup, the transcripts often contain incorrectly mapped text, repeated segments, or are sometimes entirely blank, whereas the laptop transcriptions come out accurate and as expected.
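For context, the workload boils down to an invocation along the following lines. This is a simplified sketch that assumes the openai-whisper Python API; the audio path and decoding options are illustrative placeholders rather than the exact production call.

```python
# Simplified sketch of the transcription call (assumed openai-whisper API;
# "audio.wav" is a placeholder, not the actual input).
import torch
import whisper

device = "cuda" if torch.cuda.is_available() else "cpu"
model = whisper.load_model("turbo", device=device)

# temperature=0 uses greedy decoding and fp16=False forces FP32 math,
# which removes two common sources of run-to-run variation when
# comparing the T4 against the RTX 2000 Ada.
result = model.transcribe(
    "audio.wav",
    temperature=0.0,
    fp16=False,
    condition_on_previous_text=False,
)
print(result["text"])
```

Whisper defaults to FP16 on GPU, so passing fp16=False is also a cheap way to check whether half-precision behaviour on the T4 is involved in the blank or repeated segments.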
Steps Taken
Environment Details
Laptop: NVIDIA RTX 2000 Ada (8 GB) GPU
Kubernetes Node: NVIDIA T4 (16 GB) GPU
Expected Behavior
The transcription results from the Kubernetes node should match the accuracy and quality of the results produced on the laptop GPU.
Actual Behavior
The transcripts on the Kubernetes node are inconsistent and often incorrect.
Request
I would appreciate any guidance on what could cause this discrepancy between the two GPUs, and on how to get the Kubernetes node to produce the same quality of transcripts as the laptop.
Thank you for your help!
Replies: 1 comment

That's interesting that you observed better results on your laptop. I've also seen non-deterministic output across identical runs on a T4 (in AWS Batch) with the large-v3 model. Just out of curiosity, are your Python environments and CUDA versions equivalent on your laptop and the T4 machine?
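A quick way to answer that is to print the relevant versions on both machines and diff the output. A minimal sketch, assuming PyTorch and openai-whisper are installed (adapt as needed for your stack):

```python
# Environment fingerprint: run on both the laptop and the T4 node and
# compare the output line by line. Assumes PyTorch and openai-whisper.
import platform

import torch
import whisper

print("python     :", platform.python_version())
print("whisper    :", getattr(whisper, "__version__", "unknown"))
print("torch      :", torch.__version__)
print("torch CUDA :", torch.version.cuda)
print("cuDNN      :", torch.backends.cudnn.version())
if torch.cuda.is_available():
    print("GPU        :", torch.cuda.get_device_name(0))
    print("capability :", torch.cuda.get_device_capability(0))
```

Differences in the CUDA toolkit, cuDNN, or GPU compute capability (7.5 on the T4 vs 8.9 on the RTX 2000 Ada) can change which kernels get selected and are a common source of small numerical differences between machines.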