Running the whisper large-v3 model via whisper.cpp is significantly more performant than running it through Python, both in VRAM usage and in runtime. On a large file, the Python implementation was taking 40 GB of VRAM (on a Mac Studio).
However, I find that running large-v3 through whisper.cpp can cause weird anomalies and repetitions that I just don't see when running it through Python. The Python implementation gives almost perfect accuracy with no weird hallucinations.
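One thing worth checking is whether both runs use the same decoder settings: openai-whisper's `transcribe` applies a temperature-fallback schedule that retries decoding when it detects compression-ratio or log-probability anomalies, which tends to suppress exactly this kind of repetition loop. Pinning the settings explicitly on both sides makes the comparison apples-to-apples. The paths and flag values below are illustrative, not my exact commands:

```shell
# whisper.cpp: enable beam search and best-of sampling explicitly;
# the entropy threshold controls when a decoded segment is considered
# degenerate and retried (model path is illustrative)
./main -m models/ggml-large-v3.bin -f audio.wav \
  --beam-size 5 --best-of 5 --entropy-thold 2.4

# openai-whisper: the temperature tuple is the fallback schedule --
# beam search is used at temperature 0, sampling with best_of above it
python3 -c "
import whisper
model = whisper.load_model('large-v3')
result = model.transcribe('audio.wav', beam_size=5, best_of=5,
                          temperature=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0))
print(result['text'])
"
```

If the anomalies persist even with matched decoding parameters, the remaining difference would point at the quantized GGML weights or the inference path itself rather than the sampling strategy.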
What am I missing? How are they so different?
Incidentally, the medium model on whisper.cpp seems to be more accurate and hallucinates less than large-v3.