Description
🐛 Describe the bug
By documentation of torchaudio.load()
, the expected behaviour for handling 24-bit WAV files is the following:
Since torch does not support int24 dtype, 24-bit signed PCM are converted to int32 tensors.
When calling torchaudio.load()
in Windows, using PySoundFile, on a 24-bit WAV file, the rows 222 - 228:
with soundfile.SoundFile(filepath, "r") as file_:
if file_.format != "WAV" or normalize:
dtype = "float32"
elif file_.subtype not in _SUBTYPE2DTYPE:
raise ValueError(f"Unsupported subtype: {file_.subtype}")
else:
dtype = _SUBTYPE2DTYPE[file_.subtype]
and _SUBTYPE2DTYPE
is:
_SUBTYPE2DTYPE = { "PCM_S8": "int8", "PCM_U8": "uint8", "PCM_16": "int16", "PCM_32": "int32", "FLOAT": "float32", "DOUBLE": "float64", }
are raising the following error:
ValueError: Unsupported subtype: PCM_24
Adding a simple "PCM_24:" "int32"
to _SUBTYPE2DTYPE
solves the problem.
Versions
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Microsoft Windows 10 Pro
GCC version: (MinGW.org GCC-6.3.0-1) 6.3.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A
Python version: 3.12.0 (tags/v3.12.0:0fb18b0, Oct 2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.19045-SP0
Is CUDA available: True
CUDA runtime version: 12.1.105
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1050
Nvidia driver version: 555.99
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture=9
CurrentClockSpeed=2801
DeviceID=CPU0
Family=198
L2CacheSize=1024
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=2801
Name=Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
ProcessorType=3
Revision=
Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.3.1+cu121
[pip3] torchaudio==2.3.1+cu121
[pip3] torchvision==0.18.1+cu121
[conda] Could not collect