Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add diagonal_loading optional to rtf_power #2369

Closed
wants to merge 3 commits into from

Conversation

nateanl
Copy link
Member

@nateanl nateanl commented May 6, 2022

When computing the MVDR beamforming weights using the power iteration method, the PSD matrix of noise can be applied with diagonal loading to improve the robustness. This is also applicable to computing the RTF matrix (See https://github.com/espnet/espnet/blob/master/espnet2/enh/layers/beamformer.py#L614 as an example). This also aligns with current torchaudio.transforms.MVDR module to keep the consistency.

This PR adds the diagonal_loading argument with True as default value to torchaudio.functional.rtf_power.

@facebook-github-bot
Copy link
Contributor

@nateanl has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@nateanl has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@carolineechen carolineechen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was rtf_power included in the last release? the implementation looks good but will change the default behavior of the function to produce different values, so may need to be labeled as BC-breaking

@nateanl
Copy link
Member Author

nateanl commented May 6, 2022

was rtf_power included in the last release? the implementation looks good but will change the default behavior of the function to produce different values, so may need to be labeled as BC-breaking

No, it was not in the previous release, I think we can regard the changes as BC. what do you think?

(Default: ``True``)
diag_eps (float, optional): The coefficient multiplied to the identity matrix for diagonal loading
(Default: ``1e-7``)
eps (float, optional): a value to avoid the correlation matrix is all-zero (Default: ``1e-8``)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this docstring be more helpful? does a value added to the denominator in the beamforming weight computation. from #2368 make sense here, and is it worth adding that this is only used for the case when diagonal_loading=True?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I will align the eps docstring in the functions and modules.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The eps here is for diagonal loading, which is confusing with eps in computing beamforming weight. I decided to exclude it from the API and use the default value in _tik_reg.

Copy link
Contributor

@carolineechen carolineechen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops think I was looking at the wrong branch and thought it was part of last release, that sounds good to me! just the docstring change then

@facebook-github-bot
Copy link
Contributor

@nateanl has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants