Unsupported subtype: PCM_24 #3806

Open
nicobrb opened this issue Jul 2, 2024 · 1 comment

nicobrb commented Jul 2, 2024

🐛 Describe the bug

According to the documentation of torchaudio.load(), the expected behaviour for 24-bit WAV files is the following:

Since torch does not support int24 dtype, 24-bit signed PCM are converted to int32 tensors.
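To make the documented conversion concrete, here is a minimal sketch of widening 24-bit signed PCM to int32 in plain Python. The helper name `pcm24_to_int32` is hypothetical, and the left shift by 8 bits (left-justifying the sample in the 32-bit range) is an assumption about the scaling, not something confirmed from the torchaudio source.

```python
def pcm24_to_int32(raw: bytes) -> list[int]:
    """Decode little-endian 24-bit signed PCM into int32-range integers.

    Assumption: each sample is left-justified, i.e. shifted into the
    upper 24 bits of the 32-bit value.
    """
    if len(raw) % 3 != 0:
        raise ValueError("24-bit PCM data must be a multiple of 3 bytes")
    samples = []
    for i in range(0, len(raw), 3):
        # Interpret 3 bytes as one signed little-endian integer.
        value = int.from_bytes(raw[i:i + 3], "little", signed=True)
        # Widen to the int32 range by shifting left 8 bits.
        samples.append(value << 8)
    return samples


# Largest, smallest, and -1 as 24-bit samples.
print(pcm24_to_int32(b"\xff\xff\x7f\x00\x00\x80\xff\xff\xff"))
```

Every 24-bit value fits in int32 after the shift, which is why int32 is the natural container for this subtype.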

When calling torchaudio.load() on Windows with the PySoundFile backend on a 24-bit WAV file, lines 222-228:


    with soundfile.SoundFile(filepath, "r") as file_:
        if file_.format != "WAV" or normalize:
            dtype = "float32"
        elif file_.subtype not in _SUBTYPE2DTYPE:
            raise ValueError(f"Unsupported subtype: {file_.subtype}")
        else:
            dtype = _SUBTYPE2DTYPE[file_.subtype]

and _SUBTYPE2DTYPE is:

    _SUBTYPE2DTYPE = {
        "PCM_S8": "int8",
        "PCM_U8": "uint8",
        "PCM_16": "int16",
        "PCM_32": "int32",
        "FLOAT": "float32",
        "DOUBLE": "float64",
    }

raise the following error:
ValueError: Unsupported subtype: PCM_24

Adding a simple "PCM_24": "int32" entry to _SUBTYPE2DTYPE solves the problem.
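As a standalone illustration of the proposed change, the sketch below reproduces the dtype selection from the quoted snippet with the extra entry added. The function name `select_dtype` is hypothetical and exists only so the logic can run without torchaudio or soundfile; it is not the library's actual code.

```python
# Copy of the mapping quoted above, plus the proposed PCM_24 entry.
_SUBTYPE2DTYPE = {
    "PCM_S8": "int8",
    "PCM_U8": "uint8",
    "PCM_16": "int16",
    "PCM_24": "int32",  # proposed addition: int24 does not exist in torch
    "PCM_32": "int32",
    "FLOAT": "float32",
    "DOUBLE": "float64",
}


def select_dtype(file_format: str, subtype: str, normalize: bool) -> str:
    """Mirror the branch structure of the quoted snippet (hypothetical helper)."""
    if file_format != "WAV" or normalize:
        return "float32"
    if subtype not in _SUBTYPE2DTYPE:
        raise ValueError(f"Unsupported subtype: {subtype}")
    return _SUBTYPE2DTYPE[subtype]


# With the extra entry, a 24-bit WAV no longer hits the ValueError branch.
print(select_dtype("WAV", "PCM_24", normalize=False))
```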

Versions

PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 10 Pro
GCC version: (MinGW.org GCC-6.3.0-1) 6.3.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.12.0 (tags/v3.12.0:0fb18b0, Oct 2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.19045-SP0
Is CUDA available: True
CUDA runtime version: 12.1.105
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1050
Nvidia driver version: 555.99
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture=9
CurrentClockSpeed=2801
DeviceID=CPU0
Family=198
L2CacheSize=1024
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=2801
Name=Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
ProcessorType=3
Revision=

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.3.1+cu121
[pip3] torchaudio==2.3.1+cu121
[pip3] torchvision==0.18.1+cu121
[conda] Could not collect


Isuxiz commented Dec 2, 2024

I think this is by design, since 24 is not a power of 2.
I encountered the same error on Linux and solved it by passing normalize=True to torchaudio.load(). This converts the audio to float32; it is the default value, but I had manually set it to False.
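For anyone who wants integer tensors when possible and float32 only as a fallback, the workaround could be wrapped roughly like this. It is a sketch, not tested against torchaudio itself: `load_with_fallback` is a hypothetical helper, and the loader is passed in as an argument (in practice that would be torchaudio.load) so the logic can be exercised without the library installed.

```python
def load_with_fallback(path, loader):
    """Try an un-normalized load first; fall back to normalize=True.

    `loader` is expected to accept (path, normalize=...) like
    torchaudio.load and to raise ValueError for unsupported subtypes.
    """
    try:
        return loader(path, normalize=False)
    except ValueError as err:
        # Only swallow the specific error reported in this issue.
        if "Unsupported subtype" not in str(err):
            raise
        return loader(path, normalize=True)
```

Usage would look like `waveform, sample_rate = load_with_fallback("audio.wav", torchaudio.load)`, where the file name is a placeholder. Note that the fallback path returns float32 data, so callers that need a consistent dtype should check it.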
