
Implement resample transform #37

Open
iver56 opened this issue Nov 10, 2020 · 9 comments

Comments

@iver56
Collaborator

iver56 commented Nov 10, 2020

https://github.com/adefossez/julius

@mpariente
Contributor

There's also torchaudio's resample. Should we choose between the two?

@iver56
Collaborator Author

iver56 commented Nov 10, 2020

I'm not so fond of torchaudio's resample function, because it seems to be much slower than julius. Here's the result of a crude benchmark that resamples some audio from 44100 Hz to 48000 Hz on CPU:

librosa/resampy kaiser_fast: 4.23 s
librosa/resampy kaiser_best: 15.12 s
torchaudio kaldi-compliant LPF width=2: 22.97 s
torchaudio kaldi-compliant LPF width=6: 23.56 s
torchaudio kaldi-compliant LPF width=10: 23.99 s
julius cpu 64 zeros: 0.195 s
julius cpu 16 zeros: 0.176 s

@mpariente
Contributor

Ok, it's pretty clear that Julius is better, let's stick with it!

@mogwai
Contributor

mogwai commented Nov 11, 2020

benchmark that resamples some audio from 44100 hz to 48000 hz on CPU:

What other sample rate conversions did you try? Did you compile the Resample transform with torch.jit.script?

@iver56
Collaborator Author

iver56 commented Nov 11, 2020

In my crude benchmark, I ran it simply like this:

```python
import torch
from torchaudio.compliance.kaldi import resample_waveform

# `samples`, `sample_rate`, `HIGH_SAMPLE_RATE` and the `timer` context
# manager are defined earlier in the benchmark script.
for lowpass_filter_width in (2, 6, 10):
    with timer("pytorch-audio kaldi-compliant LPF width={}".format(lowpass_filter_width)):
        pytorch_kaldi_compliant = (
            resample_waveform(
                torch.from_numpy(samples).unsqueeze(0),
                orig_freq=sample_rate,
                new_freq=HIGH_SAMPLE_RATE,
                lowpass_filter_width=lowpass_filter_width,
            )
            .squeeze()
            .numpy()
        )
```

I didn't try other sample rate conversions.
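The `timer` context manager used in the snippet above isn't shown in the thread; a minimal stdlib sketch of what it presumably does (the name and output format are assumptions) could be:

```python
import time
from contextlib import contextmanager

@contextmanager
def timer(description):
    # Measure and print the wall-clock time spent inside the `with` block.
    start = time.perf_counter()
    yield
    elapsed = time.perf_counter() - start
    print(f"{description}: {elapsed:.3f} s")

# Example usage:
with timer("sum of a range"):
    total = sum(range(1_000_000))
```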

@mogwai
Contributor

mogwai commented Nov 11, 2020

I've got a notebook to benchmark different methods of resampling. Some conversions take longer, I think because of the gcd between the input and output sample rates. It would be good to add julius to that list and compare results when resampling is done in batches.

https://gist.github.com/mogwai/a5df03e89ab33bc0a5648965280d5445
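The gcd point is easy to see with the standard library: polyphase resamplers reduce the conversion to the ratio `new/gcd : old/gcd`, so a small gcd means large up/down factors and more filtering work. The rate pairs below are just illustrative:

```python
import math

# A large gcd gives a small, cheap conversion ratio;
# a small gcd forces large up/down factors.
for old, new in [(44100, 48000), (44100, 32000), (16000, 48000)]:
    g = math.gcd(old, new)
    print(f"{old} -> {new}: ratio {new // g}/{old // g} (gcd={g})")
```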

In your benchmark you, for example, convert in and out of numpy, which can take time.

@iver56
Collaborator Author

iver56 commented Nov 11, 2020

Yes, that would be interesting.

Re numpy: Yes, but I did the numpy conversion in the julius benchmark as well. PyTorch tensors share memory with numpy arrays when running on CPU, so the "conversion" should be quite fast.
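A quick sketch of that shared-memory behavior on CPU (just demonstrating the claim, not part of any benchmark):

```python
import numpy as np
import torch

a = np.zeros(3, dtype=np.float32)
t = torch.from_numpy(a)  # wraps the same buffer on CPU; no copy is made

a[0] = 1.0               # mutate the numpy array...
print(t[0].item())       # ...and the tensor sees the change: 1.0
```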

@mogwai
Contributor

mogwai commented Nov 30, 2020

I've added julius to the benchmark notebook. It seems that it produces higher quality and does so faster most of the time. I did notice that it didn't output the same number of samples as was input to it, so I had to add a minor hack to work around that.

https://gist.github.com/mogwai/a5df03e89ab33bc0a5648965280d5445

@iver56
Collaborator Author

iver56 commented Nov 30, 2020

Yes, I've been using fix_length from librosa to solve the length issue (from librosa.util import fix_length).
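For 1-D data, what `fix_length` does boils down to pad-or-trim to a target size. A minimal pure-Python sketch of that idea (not librosa's actual implementation, which also handles n-dimensional arrays and a configurable axis):

```python
def pad_or_trim(samples, size):
    # Trim if too long, zero-pad at the end if too short,
    # so the result always has exactly `size` samples.
    if len(samples) >= size:
        return samples[:size]
    return samples + [0.0] * (size - len(samples))

# Example usage:
print(pad_or_trim([1.0, 2.0, 3.0], 5))  # [1.0, 2.0, 3.0, 0.0, 0.0]
print(pad_or_trim([1.0, 2.0, 3.0], 2))  # [1.0, 2.0]
```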
