Add psd and mvdr methods to functional & Refactor PSD and MVDR module in transforms #2181
Conversation
@Emrys365 @popcornell @boeddeker Would you like to review the algorithms in the methods? Thanks!
I added some comments to the core code. The code feels somewhat familiar; it is not too far away from ours.
Some small comments
torch.random.manual_seed(2434)
channel = 4
psd_speech = torch.rand(64, channel, channel, dtype=torch.cfloat)
psd_noise = torch.rand(64, channel, channel, dtype=torch.cfloat)
Autograd tests are among the most time-consuming ones. I think a batch size of 3 suffices.
Or is this the frame length?
64 is the frequency dimension. We can reduce it to 5 or 10.
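For reference, a reduced setup along these lines could look as follows (a sketch; the smaller frequency dimension follows the suggestion above, and the exact numbers are illustrative):

torch.random.manual_seed(2434)
channel = 4
n_freq = 10  # reduced from 64 to keep the autograd test fast
psd_speech = torch.rand(n_freq, channel, channel, dtype=torch.cfloat)
psd_noise = torch.rand(n_freq, channel, channel, dtype=torch.cfloat)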
self.assert_grad(F.compute_mvdr_weights_rtf, (rtf, psd_noise, 0))

# The eigenvector can be different in different runs, expected to fail
@expectedFailure
Reading https://pytorch.org/docs/stable/generated/torch.linalg.eigh.html, there seems to be no way to control the randomness. Since this is expected to fail, what value does this test provide?
This serves as a flag showing that the eigenvalue decomposition itself can produce different results on the same matrix.
You can normalize the rtf vector (rtf / rtf[0].conj()); then it should work.
If it still does not work, the eigenvalue gap should be larger, i.e. the difference between the largest and second-largest eigenvalues, e.g.:
import numpy as np
channel = 4
samples = 100 # samples >> channel
data = np.random.randn(channel, samples)
cov = data @ data.T.conj()  # Hermitian matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov)
eigenvalues[...] = np.random.uniform(0, 1, size=eigenvalues.shape) # Change the eigenvalues to be between 0 and 1
eigenvalues[0] = 2 # Set first eigenvalue to be the largest
cov = np.einsum('ij,j,kj->ik', eigenvectors, eigenvalues, eigenvectors.conj()) # Revert the eigh decomposition
print(np.linalg.eigh(cov))
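A possible torch adaptation of this construction for the autograd test could look like the following (a sketch with illustrative names; the point is simply to enforce a clear gap between the largest and second-largest eigenvalue so that the leading eigenvector is well determined):

import torch

channel = 4
samples = 100  # samples >> channel
data = torch.randn(channel, samples, dtype=torch.cdouble)
cov = data @ data.conj().T  # Hermitian matrix
eigenvalues, eigenvectors = torch.linalg.eigh(cov)
eigenvalues = torch.rand(channel, dtype=torch.double)  # eigenvalues between 0 and 1
eigenvalues[-1] = 2.0  # eigh sorts ascending, so make the last eigenvalue clearly the largest
psd_speech = torch.einsum('ij,j,kj->ik', eigenvectors, eigenvalues.to(eigenvectors.dtype), eigenvectors.conj())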
Btw., the gradient for eig and eigh was probably derived with the constraint that there is no gradient in the direction of a length or phase change, i.e. the objective does not change when you replace an eigenvector with itself multiplied by any nonzero complex number.
IMO the returned eigenvector can be different even when the eigenvalues are unique. One simple example: -eigenvector is another valid solution for the same eigenvalue.
On the PyTorch side, they seem to have a hard-coded check for this and throw a runtime error in
https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/FunctionsManual.cpp#L2907-L2910
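For illustration (a minimal sketch, not code from this PR): if v is an eigenvector of a Hermitian matrix, then -v (or e^{i*phi} * v for any phase) is an eigenvector for the same eigenvalue, so two runs of eigh may legitimately return different eigenvectors:

import torch

torch.manual_seed(0)
a = torch.rand(4, 4, dtype=torch.cfloat)
a = 0.5 * (a + a.conj().transpose(-1, -2))  # make the matrix Hermitian
w, v = torch.linalg.eigh(a)
v0 = v[:, 0]
# -v0 satisfies the same eigenvalue equation as v0
print(torch.allclose(a @ (-v0), w[0] * (-v0), atol=1e-5))  # True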
@boeddeker I verified that using eigenvector / eigenvector[..., -1:].conj() passes the gradcheck. Thanks! In my case the eigenvalues are in ascending order, so should the normalization rely on the last channel instead of the 0th?
Sorry, my mistake: I had converted the psd matrix to a real Tensor. It still fails the gradcheck now...
I verified that using eigenvector / eigenvector[..., -1:].conj() passed the gradcheck.
Sorry, I copied my code example from above and renamed the variable. The correct code is:
...
assert rtf.ndim == 1, rtf.shape
reference_channel = 0
rtf = rtf / rtf[reference_channel].conj()
...
So I assumed a single eigenvector and normalized it by a reference channel. In this way, the eigenvector is unique.
When I implemented the gradient check, I had issues with inputs like psd_speech. It was necessary to include something like psd_speech = 0.5 * (psd_speech + hermitian(psd_speech)) in the checked function, otherwise changes in the input were not correctly reflected in the output, because eigh ignores part of the matrix (e.g. all values above the diagonal aren't used). In user code this is not necessary, because F.compute_power_spectral_density_matrix takes care of this.
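Putting both points together, a gradient-check wrapper of the kind described above might look roughly like this (a sketch with assumed names, not the PR's test code):

import torch

def hermitian(mat):
    return mat.conj().transpose(-1, -2)

def rtf_evd_for_gradcheck(psd_speech, reference_channel: int = 0):
    # symmetrize so that perturbations above the diagonal reach the output
    # (eigh only reads one triangle of the matrix)
    psd_speech = 0.5 * (psd_speech + hermitian(psd_speech))
    _, v = torch.linalg.eigh(psd_speech)
    rtf = v[..., -1]  # eigenvector of the largest eigenvalue (ascending order)
    # normalize by a reference channel so that the returned eigenvector is unique
    return rtf / rtf[..., reference_channel : reference_channel + 1].conj()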
Wouldn't it be better to handle RTF normalization in F.compute_mvdr_weights_rtf with an additional argument? Aside from testing, it is handy to be able to phase-align the beamforming solution to a particular reference channel (e.g. to compute losses which require sample-level alignment).
Never mind, the rtf normalization seems to be there already and is True by default.
@@ -644,6 +644,59 @@ def func(tensor):
tensor = torch.view_as_complex(torch.randn(2, 1025, 400, 2))
self._assert_consistency_complex(func, tensor)

def test_compute_power_spectral_density_matrix(self):
    def func(tensor):
        return F.compute_power_spectral_density_matrix(tensor)
There seems to be no benefit in defining this func.
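That is, assuming the consistency helper accepts the functional directly, something like the following could work (a sketch, not verified against the test harness):

def test_compute_power_spectral_density_matrix(self):
    tensor = torch.view_as_complex(torch.randn(2, 1025, 400, 2))
    self._assert_consistency_complex(F.compute_power_spectral_density_matrix, tensor)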
torchaudio/functional/functional.py
numerator = torch.linalg.solve(psd_n, psd_s)  # psd_n.inv() @ psd_s
# ws: (..., C, C) / (...,) -> (..., C, C)
ws = numerator / (_compute_mat_trace(numerator)[..., None, None] + eps)
if isinstance(reference_channel, int):
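For context, the surrounding computation (the Souden formulation) could be sketched as follows; the trace helper and the reference-channel selection here are assumptions based on the snippet above, not the exact torchaudio implementation:

import torch

def _compute_mat_trace(mat):
    # sum over the diagonal of the last two dimensions: (..., C, C) -> (...,)
    return mat.diagonal(dim1=-2, dim2=-1).sum(-1)

def mvdr_weights_souden(psd_s, psd_n, reference_channel: int, eps: float = 1e-8):
    numerator = torch.linalg.solve(psd_n, psd_s)  # psd_n.inv() @ psd_s
    # ws: (..., C, C) / (...,) -> (..., C, C)
    ws = numerator / (_compute_mat_trace(numerator)[..., None, None] + eps)
    # picking an integer reference channel amounts to selecting the corresponding column
    return ws[..., reference_channel]  # (..., C, C) -> (..., C)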
nit: Would swapping isinstance with torch.jit.isinstance help support TorchScript?
@nateanl Once the technical discussion with the experts is settled, can you split this PR into smaller PRs, each containing the addition of one new function, plus another PR that changes the transforms? It's hard to see which function supports which features.
torchaudio.functional:
compute_power_spectral_density_matrix
compute_mvdr_weights_souden
compute_mvdr_weights_rtf
compute_rtf_evd
compute_rtf_power
apply_beamforming
torchaudio.transforms.PSD and torchaudio.transforms.MVDR, refactored by using the added methods.
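To make the feature split easier to see, here is a rough end-to-end sketch of how these functions might compose; apart from the calls quoted earlier in this thread, the signatures and shapes are assumptions:

import torch
import torchaudio.functional as F

# multi-channel complex spectrogram: (channel, freq, time)
specgram = torch.rand(4, 201, 100, dtype=torch.cfloat)

# PSD matrices per frequency bin: (freq, channel, channel)
psd_speech = F.compute_power_spectral_density_matrix(specgram)
psd_noise = F.compute_power_spectral_density_matrix(specgram)  # in practice, computed with a noise mask

# relative transfer function via eigenvalue decomposition (assumed signature)
rtf = F.compute_rtf_evd(psd_speech)

# MVDR weights with reference channel 0, as used in the tests above
ws = F.compute_mvdr_weights_rtf(rtf, psd_noise, 0)

# apply the beamforming weights to the spectrogram (assumed signature)
enhanced = F.apply_beamforming(ws, specgram)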