Error loading MP3 files from CommonVoice #5488
Comments
Hi @kradonneoh, thanks for reporting. Please note that to work with audio datasets (and specifically with MP3 files) we have detailed installation instructions in our docs: https://huggingface.co/docs/datasets/installation#audio
Let us know if the problem persists after having followed them.
I saw that and have followed it (hence the Expected Behavior section of the bug report). Is there no intention of updating to the latest version? It does limit the version of
@kradonneoh hey! actually with
But according to your traceback it seems that it tries to use fyi - we are aiming at getting rid of |
We now decode MP3 with
Describe the bug
When loading a CommonVoice dataset with datasets==2.9.0 and torchaudio>=0.12.0, I get an error reading the audio arrays:
I assume this is because there's some issue with the MP3 decoding process. I've verified that I have ffmpeg>=4 (on a Linux distro), which appears to be the fallback backend for torchaudio (at least according to #4889).
Steps to reproduce the bug
Expected behavior
Similar behavior to torchaudio<0.12.0, which doesn't result in a LibsndfileError.
Environment info
datasets version: 2.9.0
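For anyone comparing their setup against this report, here is a minimal sketch that prints the installed versions of the two packages mentioned in the issue and checks whether an ffmpeg binary is on PATH. The helper name report_versions is hypothetical (it is not part of datasets or torchaudio), and the script only reports what is installed; it does not confirm which backend torchaudio actually selects at decode time.

```python
import shutil
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package metadata is absent, so treat it as not installed.
            found[name] = None
    return found


if __name__ == "__main__":
    # The package list mirrors the ones named in the report; extend as needed.
    for name, ver in report_versions(["datasets", "torchaudio"]).items():
        print(f"{name}: {ver or 'not installed'}")
    # shutil.which only checks that an executable named ffmpeg is on PATH;
    # it says nothing about the ffmpeg version.
    print(f"ffmpeg on PATH: {shutil.which('ffmpeg') or 'not found'}")
```

Pasting this output into the Environment info section makes it easier to tell whether a failure matches the datasets==2.9.0 / torchaudio>=0.12.0 combination described above.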