Add mvdr_weights_rtf to torchaudio.functional #2229

nateanl · 2022-02-15T15:28:28Z

This PR adds the `mvdr_weights_rtf` function to `torchaudio.functional`.
It computes the MVDR weight matrix using the solution based on the relative transfer function (RTF). See [the paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.725.673&rep=rep1&type=pdf) for reference.
The input arguments are, in order, the complex-valued RTF Tensor of the target speech, the power spectral density (PSD) matrix of the noise, and an int or one-hot Tensor that indicates the reference channel.
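
To make the description concrete, here is a minimal single-frequency-bin sketch in plain Python. It is not the torchaudio implementation: `solve_linear` and `mvdr_weights_rtf_single_bin` are hypothetical names, nested lists stand in for Tensors, the `eps` term is an assumed guard against division by zero, and any regularization of the PSD matrix is left out. It assumes the RTF-based solution takes the form w = Φ_NN⁻¹ v / (vᴴ Φ_NN⁻¹ v), scaled by the conjugate of the RTF at the reference channel.

```python
def solve_linear(a, b):
    """Solve a @ x = b for a small complex system by Gauss-Jordan elimination."""
    n = len(b)
    # Work on an augmented copy so the caller's lists are not modified.
    aug = [list(map(complex, a[i])) + [complex(b[i])] for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        # Eliminate this column from every other row.
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] for i in range(n)]


def mvdr_weights_rtf_single_bin(rtf, psd_n, reference_channel=None, eps=1e-8):
    """Illustrative MVDR weights for one frequency bin (hypothetical helper).

    rtf:   list of complex values, one per channel (relative transfer function).
    psd_n: Hermitian noise PSD matrix as a nested list (channel x channel).
    """
    # numerator = psd_n^{-1} @ rtf
    numerator = solve_linear(psd_n, rtf)
    # denominator = rtf^H @ psd_n^{-1} @ rtf (real for a Hermitian psd_n)
    denominator = sum(complex(v).conjugate() * x for v, x in zip(rtf, numerator))
    weights = [x / (denominator.real + eps) for x in numerator]
    if reference_channel is not None:
        # Scale by the conjugate of the RTF at the reference channel.
        scale = complex(rtf[reference_channel]).conjugate()
        weights = [w * scale for w in weights]
    return weights
```

Under these assumptions, the weights without a reference channel give a response wᴴv of approximately 1 toward the target, and with a reference channel the response is approximately the RTF value at that channel.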

test/torchaudio_unittest/functional/autograd_impl.py

torchaudio/functional/functional.py

mthrok · 2022-02-25T04:10:46Z

test/torchaudio_unittest/functional/functional_impl.py

@@ -582,6 +583,45 @@ def test_rnnt_loss_costs_and_gradients_random_data_with_numpy_fp32(self):
            ref_costs, ref_gradients = rnnt_utils.compute_with_numpy_transducer(data=data)
            self._test_costs_and_gradients(data=data, ref_costs=ref_costs, ref_gradients=ref_gradients)

+    def test_mvdr_weights_rtf(self):


Please add a docstring.

mthrok · 2022-02-25T04:14:55Z

torchaudio/functional/functional.py

+            reference_channel = reference_channel.to(psd_n.dtype)
+            scale = torch.einsum("...c,...c->...", [rtf.conj(), reference_channel[..., None, :]])
+        else:
+            raise TypeError(f"Unsupported dtype for reference_channel. Found: {type(reference_channel)}.")


As in https://github.com/pytorch/audio/pull/2231/files#r814462176, please state in the error message what is expected.
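
One way to read this request, sketched in plain Python for a single frequency bin: the branch on `reference_channel` either indexes the RTF directly (int) or contracts it with a one-hot vector, and the error names the accepted types before reporting what was found. `reference_scale` is a hypothetical helper, a list or tuple stands in for the one-hot Tensor, and the message wording is only an illustration, not the text that was merged.

```python
def reference_scale(rtf, reference_channel):
    """Return the conjugated RTF value at the reference channel (one bin)."""
    if isinstance(reference_channel, int):
        # Direct indexing when the reference channel is given as an int.
        return complex(rtf[reference_channel]).conjugate()
    if isinstance(reference_channel, (list, tuple)):
        # One-hot selection: sum over channels of conj(rtf[c]) * one_hot[c],
        # the list-based analogue of the einsum in the diff above.
        return sum(complex(v).conjugate() * u for v, u in zip(rtf, reference_channel))
    # State what is expected first, then what was actually received.
    raise TypeError(
        "Expected reference_channel to be an int or a one-hot vector. "
        f"Found: {type(reference_channel)}."
    )
```

Both input forms select the same value, so `reference_scale(rtf, 1)` and `reference_scale(rtf, [0, 1])` agree for a two-channel `rtf`.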

facebook-github-bot · 2022-02-25T10:28:05Z

@nateanl has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Summary: This PR adds ``mvdr_weights_rtf`` method to ``torchaudio.functional``. It computes the MVDR weight matrix based on the solution that applies relative transfer function (RTF). See [the paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.725.673&rep=rep1&type=pdf) for the reference. The input arguments are the complex-valued RTF Tensor of the target speech, power spectral density (PSD) matrix of noise, int or one-hot Tensor to indicate the reference channel, respectively.

Pull Request resolved: pytorch#2229

Reviewed By: mthrok

Differential Revision: D34474119

Pulled By: nateanl

fbshipit-source-id: ca20eca4d071ebb99f0b6827613338796f3ec2a2

facebook-github-bot · 2022-02-25T15:56:48Z

This pull request was exported from Phabricator. Differential Revision: D34474119

Summary: This PR adds ``mvdr_weights_rtf`` method to ``torchaudio.functional``. It computes the MVDR weight matrix based on the solution that applies relative transfer function (RTF). See [the paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.725.673&rep=rep1&type=pdf) for the reference. The input arguments are the complex-valued RTF Tensor of the target speech, power spectral density (PSD) matrix of noise, int or one-hot Tensor to indicate the reference channel, respectively.

Pull Request resolved: pytorch#2229

Reviewed By: mthrok

Differential Revision: D34474119

Pulled By: nateanl

fbshipit-source-id: 2d6f62cd0858f29ed6e4e03c23dcc11c816204e2

nateanl added new feature module: ops labels Feb 15, 2022

nateanl requested review from mthrok, hwangjeff and carolineechen February 15, 2022 15:28

pytorch-bot bot added the ciflow/default label Feb 15, 2022

facebook-github-bot added the CLA Signed label Feb 15, 2022

nateanl mentioned this pull request Feb 15, 2022

Add psd and mvdr methods to functional & Refactor PSD and MVDR module in transforms #2181

Closed

mthrok reviewed Feb 16, 2022

test/torchaudio_unittest/functional/autograd_impl.py (outdated)

nateanl added this to the v0.11 milestone Feb 16, 2022

nateanl force-pushed the refactor_mvdr_2 branch from 9fa3b44 to 6500986 Compare February 17, 2022 17:22

hwangjeff reviewed Feb 17, 2022

torchaudio/functional/functional.py (outdated)

torchaudio/functional/functional.py (outdated)

nateanl changed the title ~~Add compute_mvdr_weights_rtf to torchaudio.functional~~ Add mvdr_weights_rtf to torchaudio.functional Feb 17, 2022

nateanl force-pushed the refactor_mvdr_2 branch from cc53427 to e47e54c Compare February 17, 2022 20:49

mthrok approved these changes Feb 25, 2022

nateanl force-pushed the refactor_mvdr_2 branch from 14937a8 to dc19b0d Compare February 25, 2022 15:56

facebook-github-bot closed this in 3566ffc Feb 25, 2022

nateanl deleted the refactor_mvdr_2 branch March 1, 2022 20:53

nateanl mentioned this pull request Mar 16, 2022

[Migration] TorchAudio Beamforming Module Migration #2280

Closed

11 tasks
