
feat: differentiable PyTorch backend #31

Open · wants to merge 19 commits into master
Conversation

yoyololicon

Feature

An alternative, fully differentiable backend of norbert for PyTorch users. Input tensors require one extra leading dimension so that they can be processed in batches, which comes in handy for deep learning training. This implementation has been used to train Danna-Sep [1][2]. The change is backward compatible, so it won't affect any existing projects that depend on norbert.

Usage

To use the PyTorch backend, simply replace each call to `norbert.*` with `norbert.torch.*`.
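For example, a minimal sketch with illustrative shapes (the extra leading dimension is the batch):

```python
import torch
import norbert.torch as norbert

# one extra leading batch dimension compared to the numpy API
x = torch.randn(1, 100, 513, 2, dtype=torch.complex64)  # mixture STFT: (batch, frames, bins, channels)
v = torch.rand(1, 100, 513, 2, 4)                        # source estimates: (batch, frames, bins, channels, sources)

y = norbert.wiener(v, x)  # same call as norbert.wiener, now batched and differentiable
```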

Available Functions

  • expectation_maximization
  • wiener
  • softmask
  • wiener_gain
  • apply_filter
  • get_mix_model
  • get_local_gaussian_model
  • residual_model
  • reduce_interferences

Note

To make the new backend differentiable, some in-place operations in the original code had to be rewritten, so it is not a simple one-to-one translation.
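For illustration, a sketch of the general pattern (not the actual diff; `v` is just a placeholder tensor):

```python
import torch

v = torch.rand(8, 4, requires_grad=True)

# an in-place update like this raises an error on a leaf tensor that requires grad:
# v /= v.sum(dim=-1, keepdim=True)

# the out-of-place equivalent keeps the autograd graph intact:
v_norm = v / v.sum(dim=-1, keepdim=True)
v_norm.sum().backward()  # gradients flow back to v
```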
By the way, I have some problems building the docs, so I'm not sure what the documentation will look like after this PR. If anyone could help me check the docs, I would appreciate it.

Footnotes

  1. Yu, Chin-Yun, and Kin-Wai Cheuk. "Danna-Sep: Unite to separate them all." https://arxiv.org/abs/2112.03752

  2. Mitsufuji, Yuki, et al. "Music Demixing Challenge 2021." https://doi.org/10.3389/frsip.2021.808395

@faroit
Member

faroit commented Feb 14, 2022

@yoyololicon that's great 🙌 Have you seen our implementation in https://github.com/sigsep/open-unmix-pytorch/blob/master/openunmix/filtering.py ?

  • Is yours using complex or real dtypes?
  • Can you benchmark against the one in openunmix?
  • Did you make sure the numpy and torch outputs are identical (within a margin of error)? I don't see such a regression test in this PR.

@yoyololicon
Author

@faroit

Have you seen our implementation in https://github.com/sigsep/open-unmix-pytorch/blob/master/openunmix/filtering.py ?

Thanks for the info, I hadn't come across this one before.

  • Is yours using complex or real dtypes?

Do you mean PyTorch's native complex type? Yes.

  • Can you benchmark against the one in openunmix?

Sure. I think we can profile the memory usage as well.

  • Did you make sure the numpy and torch outputs are identical (within a margin of error)? I don't see such a regression test in this PR.

As far as I can remember, we haven't directly compared them before. I agree we should add a regression test and report the numbers.
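Something along these lines, for instance (a hypothetical sketch; shapes and tolerances are placeholders, and the torch call adds the batch dimension):

```python
import numpy as np
import torch

import norbert
import norbert.torch as norbert_torch


def test_wiener_matches_numpy():
    rng = np.random.default_rng(0)
    x = rng.standard_normal((100, 513, 2)) + 1j * rng.standard_normal((100, 513, 2))
    v = rng.random((100, 513, 2, 4))

    y_np = norbert.wiener(v, x, 1)
    y_pt = norbert_torch.wiener(torch.from_numpy(v)[None], torch.from_numpy(x)[None], 1)

    np.testing.assert_allclose(y_pt.squeeze(0).numpy(), y_np, rtol=1e-5, atol=1e-5)
```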

yoyololicon marked this pull request as draft February 14, 2022 08:18
@aliutkus
Member

Thanks a lot, that's great!! I remember talking to you about doing a PR for norbert; it's fantastic that you took the time to do it.

If you could indeed run the few tests @faroit mentions, that would be perfect.
Does your implementation allow backprop even with a number of iterations > 0?

best

@yoyololicon
Author

Thanks a lot, that's great!! I remember talking to you about doing a PR for norbert; it's fantastic that you took the time to do it.

@aliutkus Yeah, I remember it, too. 😄

If you could indeed run the few tests @faroit mentions, that would be perfect. Does your implementation allow backprop even with a number of iterations > 0?

Theoretically, it can backprop through any number of iterations, but we always set it to one iteration in our training.
I'll add the tests later.
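For reference, a minimal sketch of that setup (illustrative shapes; the loss is a placeholder):

```python
import torch
import norbert.torch as norbert

x = torch.randn(1, 100, 513, 2, dtype=torch.complex64)  # mixture STFT
v = torch.rand(1, 100, 513, 2, 4, requires_grad=True)   # estimated source spectrograms

y = norbert.wiener(v, x, 1)  # one EM iteration, as in our training
y.abs().sum().backward()     # placeholder loss; gradients reach v through the EM step
assert v.grad is not None
```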

@yoyololicon
Author

yoyololicon commented Feb 17, 2022

@faroit @aliutkus I added the regression tests, and this should be ready for review.
See the benchmarks below.

Profile

Norbert Torch

-----------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  
                         Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg       CPU Mem  Self CPU Mem    # of Calls  
-----------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  
                  aten::empty         0.42%      67.000us         0.42%      67.000us       4.467us      17.28 Mb      17.28 Mb            15  
                    aten::mul        33.89%       5.363ms        38.71%       6.126ms     510.500us      21.13 Mb      13.31 Mb            12  
          aten::empty_strided         0.49%      77.000us         0.49%      77.000us       5.133us      10.97 Mb      10.97 Mb            15  
                aten::resize_         0.21%      33.000us         0.21%      33.000us       8.250us      10.96 Mb      10.96 Mb             4  
                    aten::div         4.28%     678.000us         4.50%     713.000us     178.250us       5.54 Mb       5.53 Mb             4  
-----------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  
Self CPU time total: 15.827ms

Open-Unmix

-----------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  
                         Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg       CPU Mem  Self CPU Mem    # of Calls  
-----------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  
                    aten::mul        19.84%      29.532ms        20.01%      29.783ms     111.131us      76.34 Mb      73.60 Mb           268  
                    aten::add         4.59%       6.826ms         4.59%       6.826ms      36.116us      47.23 Mb      47.23 Mb           189  
                  aten::empty         0.23%     343.000us         0.24%     353.000us       2.942us      25.90 Mb      25.90 Mb           120  
          aten::empty_strided         0.31%     468.000us         0.31%     468.000us       5.032us      17.42 Mb      17.42 Mb            93  
                    aten::sub         1.87%       2.782ms         1.88%       2.798ms      43.046us      14.87 Mb      14.87 Mb            65  
-----------------------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  ------------  
Self CPU time total: 148.867ms

Benchmark

Testing different combinations of sources and iterations across thread counts, with the other parameters fixed to reasonable values; each row label below is [nb_sources, nb_iterations].
It's consistently 2 to 3 times faster than openunmix.

[----------------- wiener -----------------]
              |  norbert torch  |  openunmix
1 threads: ---------------------------------
      [4, 0]  |        4.2      |      6.7  
      [4, 1]  |       46.8      |     80.6  
      [4, 3]  |      130.4      |    233.0  
      [8, 0]  |        4.0      |      8.2  
      [8, 1]  |       84.0      |    167.4  
      [8, 3]  |      232.6      |    470.5  
2 threads: ---------------------------------
      [4, 0]  |        2.2      |      3.7  
      [4, 1]  |       24.9      |     48.8  
      [4, 3]  |       70.5      |    138.0  
      [8, 0]  |        2.3      |      4.7  
      [8, 1]  |       46.9      |     97.8  
      [8, 3]  |      126.4      |    294.0  
4 threads: ---------------------------------
      [4, 0]  |        1.2      |      2.3  
      [4, 1]  |       16.0      |     38.3  
      [4, 3]  |       41.6      |    106.2  
      [8, 0]  |        1.9      |      3.5  
      [8, 1]  |       29.6      |     82.3  
      [8, 3]  |       80.6      |    226.7  

Times are in milliseconds (ms).
The script I used to produce these experiments:
import torch
from torch.profiler import profile, ProfilerActivity
import torch.utils.benchmark as benchmark
from itertools import product

import norbert.torch as norbert
from openunmix import filtering


nb_frames = 100
nb_bins = 513
nb_channels = 2
nb_sources = [4, 8]
nb_iterations = [0, 1, 3]

# complex mixture STFT and non-negative source spectrogram estimates
x = torch.randn(nb_frames, nb_bins, nb_channels) + 1j * \
    torch.randn(nb_frames, nb_bins, nb_channels)
x_as_real = torch.view_as_real(x)  # openunmix expects a real view of the complex tensor
v = torch.rand(nb_frames, nb_bins, nb_channels, 4)

# profile memory and CPU time of a single wiener call for both implementations
with profile(activities=[ProfilerActivity.CPU], profile_memory=True) as prof:
    norbert.wiener(v[None, ...], x[None, ...])  # note the extra batch dimension
print("Norbert Torch")
print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=5))

with profile(activities=[ProfilerActivity.CPU], profile_memory=True) as prof:
    filtering.wiener(v, x_as_real)
print("Open-Unmix")
print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=5))


# benchmark every (nb_sources, nb_iterations) combination at 1, 2, and 4 threads
results = []
for ns, ni in product(nb_sources, nb_iterations):
    label = 'wiener'
    sub_label = f'[{ns}, {ni}]'
    v = torch.rand(nb_frames, nb_bins, nb_channels, ns)
    x = torch.randn(nb_frames, nb_bins, nb_channels) + 1j * \
        torch.randn(nb_frames, nb_bins, nb_channels)
    x_as_real = torch.view_as_real(x)

    for num_threads in [1, 2, 4]:
        results.append(benchmark.Timer(
            stmt='y = wiener(v, x, i)',
            setup='from norbert.torch import wiener',
            globals={'x': x.unsqueeze(0), 'v': v.unsqueeze(0), 'i': ni},
            label=label,
            num_threads=num_threads,
            sub_label=sub_label,
            description='norbert torch',
        ).blocked_autorange(min_run_time=1))

        results.append(benchmark.Timer(
            stmt='y = wiener(v, x, i)',
            setup='from openunmix.filtering import wiener',
            globals={'x': x_as_real, 'v': v, 'i': ni},
            label=label,
            num_threads=num_threads,
            sub_label=sub_label,
            description='openunmix',
        ).blocked_autorange(min_run_time=1))

compare = benchmark.Compare(results)
compare.print()

yoyololicon marked this pull request as ready for review February 17, 2022 04:40
@faroit
Member

faroit commented Mar 4, 2022

@yoyololicon sorry for the slow response. I will have a look next week 🙏

@turian

turian commented Dec 4, 2022

This looks great. @faroit, it would be amazing to be able to pip install this.

@faroit
Member

faroit commented Dec 4, 2022

@yoyololicon Totally forgot about this one. I will have a look again.

@yoyololicon
Author

@faroit No problem. Take your time. Let's first enjoy the ISMIR conference 😆
