Error loading MP3 files from CommonVoice #5488

kradonneoh · 2023-01-31T21:25:33Z

Describe the bug

When loading a CommonVoice dataset with datasets==2.9.0 and torchaudio>=0.12.0, I get an error reading the audio arrays:

---------------------------------------------------------------------------
LibsndfileError                           Traceback (most recent call last)
~/.local/lib/python3.8/site-packages/datasets/features/audio.py in _decode_mp3(self, path_or_file)
    310             try:  # try torchaudio anyway because sometimes it works (depending on the os and os packages installed)
--> 311                 array, sampling_rate = self._decode_mp3_torchaudio(path_or_file)
    312             except RuntimeError:

~/.local/lib/python3.8/site-packages/datasets/features/audio.py in _decode_mp3_torchaudio(self, path_or_file)
    351 
--> 352         array, sampling_rate = torchaudio.load(path_or_file, format="mp3")
    353         if self.sampling_rate and self.sampling_rate != sampling_rate:

~/.local/lib/python3.8/site-packages/torchaudio/backend/soundfile_backend.py in load(filepath, frame_offset, num_frames, normalize, channels_first, format)
    204     """
--> 205     with soundfile.SoundFile(filepath, "r") as file_:
    206         if file_.format != "WAV" or normalize:

~/.local/lib/python3.8/site-packages/soundfile.py in __init__(self, file, mode, samplerate, channels, subtype, endian, format, closefd)
    654                                          format, subtype, endian)
--> 655         self._file = self._open(file, mode_int, closefd)
    656         if set(mode).issuperset('r+') and self.seekable():

~/.local/lib/python3.8/site-packages/soundfile.py in _open(self, file, mode_int, closefd)
   1212             err = _snd.sf_error(file_ptr)
-> 1213             raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
   1214         if mode_int == _snd.SFM_WRITE:

LibsndfileError: Error opening <_io.BytesIO object at 0x7fa539462090>: File contains data in an unknown format.

I assume this is an issue with the MP3 decoding process. I've verified that I have ffmpeg>=4 installed (on a Linux distro), which appears to be the fallback backend for torchaudio (at least according to #4889).

Steps to reproduce the bug

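from datasets import load_dataset

dataset = load_dataset("mozilla-foundation/common_voice_11_0", "be", split="train")
dataset[0]  # accessing a row decodes its audio, which raises the error above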

Expected behavior

Similar behavior to torchaudio<0.12.0, which doesn't result in a LibsndfileError

Environment info

datasets version: 2.9.0
Platform: Linux-5.15.0-52-generic-x86_64-with-glibc2.29
Python version: 3.8.10
PyArrow version: 10.0.1
Pandas version: 1.5.1

albertvillanova · 2023-02-01T08:07:08Z

Hi @kradonneoh, thanks for reporting.

Please note that to work with audio datasets (and specifically with MP3 files) we have detailed installation instructions in our docs: https://huggingface.co/docs/datasets/installation#audio

One of the requirements is torchaudio<0.12.0.

Let us know if the problem persists after having followed them.

kradonneoh · 2023-02-01T14:52:32Z

I saw that and have followed it (hence the Expected Behavior section of the bug report).

Is there no intention of updating to the latest version? It does limit the version of torch I can use, which isn’t ideal.

polinaeterna · 2023-02-01T15:28:55Z

@kradonneoh hey! Actually, loading MP3 files should work with ffmpeg 4, so this is unexpected behavior and we need to investigate it. It works on my side with torchaudio==0.13 and ffmpeg==4.2.7. Which torchaudio version do you use?

datasets should support decoding MP3 files with torchaudio>=0.12, but as you noted, that requires ffmpeg>=4. We need to fix this in the documentation, thank you for pointing it out!

However, according to your traceback, torchaudio is trying to use the libsndfile backend for MP3 decoding. libsndfile supports MP3 decoding only from version 1.1.0, which on Linux has to be compiled from source for now, afaik.
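To see whether the installed libsndfile is new enough for MP3, something like the following rough sketch may help. The helper name libsndfile_supports_mp3 is hypothetical (it is not part of datasets or soundfile), and the 1.1.0 threshold is taken from the comment above. The soundfile import is guarded because the package or the underlying shared library may be missing.

```python
import re


def libsndfile_supports_mp3(version_string: str) -> bool:
    """Hypothetical helper: True if the libsndfile version is >= 1.1.0.

    The 1.1.0 threshold comes from the discussion in this thread.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        # Unparseable version string: assume no MP3 support.
        return False
    return tuple(int(part) for part in match.groups()) >= (1, 1, 0)


try:
    import soundfile  # raises OSError if the libsndfile library cannot be loaded
except (ImportError, OSError):
    soundfile = None

if soundfile is not None:
    version = soundfile.__libsndfile_version__
    print("libsndfile version:", version)
    print("new enough for MP3:", libsndfile_supports_mp3(version))
    # available_formats() lists the formats this libsndfile build reports.
    print("MP3 listed as a format:", "MP3" in soundfile.available_formats())
```

If the version is below 1.1.0, or MP3 is not listed, the "File contains data in an unknown format" error from the traceback is the expected outcome when soundfile is handed MP3 data.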

FYI: we are also aiming to drop the torchaudio dependency entirely in favor of libsndfile by the next major library release.

mariosasko · 2023-03-02T16:25:13Z

We now decode MP3 with soundfile, so I'm closing this issue.

polinaeterna mentioned this issue Feb 27, 2023

Use soundfile for mp3 decoding instead of torchaudio #5573

Merged

mariosasko closed this as completed Mar 2, 2023
