
Implement resample transform #37

Open
iver56 opened this issue Nov 10, 2020 · 9 comments

Comments

@iver56
Collaborator

iver56 commented Nov 10, 2020

https://github.com/adefossez/julius

@mpariente
Contributor

There's also torchaudio's resample. Should we choose between the two?

@iver56
Collaborator Author

iver56 commented Nov 10, 2020

I'm not so fond of torchaudio's resample function, because it seems to be much slower than julius. Here's the result of a crude benchmark that resamples some audio from 44100 Hz to 48000 Hz on CPU:

librosa/resampy kaiser_fast: 4.23 s
librosa/resampy kaiser_best: 15.12 s
torchaudio kaldi-compliant LPF width=2: 22.97 s
torchaudio kaldi-compliant LPF width=6: 23.56 s
torchaudio kaldi-compliant LPF width=10: 23.99 s
julius cpu 64 zeros: 0.195 s
julius cpu 16 zeros: 0.176 s

@mpariente
Contributor

Ok, it's pretty clear that Julius is better, let's stick with it!

@mogwai
Contributor

mogwai commented Nov 11, 2020

benchmark that resamples some audio from 44100 hz to 48000 hz on CPU:

What other sample rate conversions did you try? Did you compile the Resample transform with torch.jit.script?

@iver56
Collaborator Author

iver56 commented Nov 11, 2020

In my crude benchmark, I ran it simply like this:

```python
import torch
from torchaudio.compliance.kaldi import resample_waveform

# `samples`, `sample_rate`, `HIGH_SAMPLE_RATE` and the `timer` context
# manager are defined earlier in the benchmark script.
for lowpass_filter_width in (2, 6, 10):
    with timer("pytorch-audio kaldi-compliant LPF width={}".format(lowpass_filter_width)):
        pytorch_kaldi_compliant = (
            resample_waveform(
                torch.from_numpy(samples).unsqueeze(0),
                orig_freq=sample_rate,
                new_freq=HIGH_SAMPLE_RATE,
                lowpass_filter_width=lowpass_filter_width,
            )
            .squeeze()
            .numpy()
        )
```

I didn't try other sample rate conversions.
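The `timer` context manager used in the snippet above isn't shown in the thread; a minimal stdlib sketch of what it presumably does (the name and output format are assumptions) could be:

```python
import time
from contextlib import contextmanager

@contextmanager
def timer(description):
    # Measure and print the wall-clock time spent inside the `with` block.
    start = time.perf_counter()
    yield
    elapsed = time.perf_counter() - start
    print(f"{description}: {elapsed:.3f} s")

# Example usage:
with timer("sum of a range"):
    total = sum(range(1_000_000))
```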

@mogwai
Contributor

mogwai commented Nov 11, 2020

I've got a notebook to benchmark different methods of resampling. Some conversions take longer, I think because of the gcd between the input and output sample rates. It would be good to add julius to that list and compare results when resampling is done in batches.

https://gist.github.com/mogwai/a5df03e89ab33bc0a5648965280d5445
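The gcd point is easy to see with the standard library: polyphase resamplers reduce the conversion to the ratio `new/gcd : old/gcd`, so a small gcd means large up/down factors and more filtering work. The rate pairs below are just illustrative:

```python
import math

# A large gcd gives a small, cheap conversion ratio;
# a small gcd forces large up/down factors.
for old, new in [(44100, 48000), (44100, 32000), (16000, 48000)]:
    g = math.gcd(old, new)
    print(f"{old} -> {new}: ratio {new // g}/{old // g} (gcd={g})")
```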

In your benchmark you, for example, convert in and out of numpy, which can take time.

@iver56
Collaborator Author

iver56 commented Nov 11, 2020

Yes, that would be interesting.

Re numpy: Yes, but I did the numpy conversion in the julius benchmark as well. PyTorch tensors share memory with numpy arrays when running on CPU, so the "conversion" should be quite fast.
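A quick sketch of that shared-memory behavior on CPU (just demonstrating the claim, not part of any benchmark):

```python
import numpy as np
import torch

a = np.zeros(3, dtype=np.float32)
t = torch.from_numpy(a)  # wraps the same buffer on CPU; no copy is made

a[0] = 1.0               # mutate the numpy array...
print(t[0].item())       # ...and the tensor sees the change: 1.0
```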

@mogwai
Contributor

mogwai commented Nov 30, 2020

I've added julius to the benchmark notebook. It seems that it produces higher quality and does so faster most of the time. I did notice that it didn't output the same number of samples as was input to it, so I had to add a minor hack to work around that.

https://gist.github.com/mogwai/a5df03e89ab33bc0a5648965280d5445

@iver56
Collaborator Author

iver56 commented Nov 30, 2020

Yes, I've been using fix_length from librosa to solve the length issue (from librosa.util import fix_length).
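For 1-D data, what `fix_length` does boils down to pad-or-trim to a target size. A minimal pure-Python sketch of that idea (not librosa's actual implementation, which also handles n-dimensional arrays and a configurable axis):

```python
def pad_or_trim(samples, size):
    # Trim if too long, zero-pad at the end if too short,
    # so the result always has exactly `size` samples.
    if len(samples) >= size:
        return samples[:size]
    return samples + [0.0] * (size - len(samples))

# Example usage:
print(pad_or_trim([1.0, 2.0, 3.0], 5))  # [1.0, 2.0, 3.0, 0.0, 0.0]
print(pad_or_trim([1.0, 2.0, 3.0], 2))  # [1.0, 2.0]
```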
