Add diagonal_loading optional to rtf_power #2369
@nateanl has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
was rtf_power included in the last release? the implementation looks good but will change the default behavior of the function to produce different values, so may need to be labeled as BC-breaking
No, it was not in the previous release, so I think we can regard the changes as backward compatible (not BC-breaking). What do you think?
torchaudio/functional/functional.py
Outdated
(Default: ``True``)
diag_eps (float, optional): The coefficient multiplied to the identity matrix for diagonal loading
(Default: ``1e-7``)
eps (float, optional): a value to avoid the correlation matrix is all-zero (Default: ``1e-8``)
can this docstring be more helpful? does "a value added to the denominator in the beamforming weight computation." from #2368 make sense here, and is it worth adding that this is only used for the case when diagonal_loading=True?
Makes sense. I will align the eps docstring in the functions and modules.
The eps here is for diagonal loading, which is easily confused with the eps used in computing the beamforming weights. I decided to exclude it from the API and use the default value in _tik_reg.
oops think I was looking at the wrong branch and thought it was part of last release, that sounds good to me! just the docstring change then
When computing the MVDR beamforming weights using the power iteration method, diagonal loading can be applied to the PSD matrix of noise to improve robustness. This is also applicable to computing the RTF matrix (see https://github.com/espnet/espnet/blob/master/espnet2/enh/layers/beamformer.py#L614 as an example). It also keeps the function consistent with the current torchaudio.transforms.MVDR module. This PR adds the diagonal_loading argument, with True as the default value, to torchaudio.functional.rtf_power.
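
For readers who want to see the idea in isolation, here is a minimal NumPy sketch of diagonal loading followed by a power-iteration RTF estimate. It is an illustration under stated assumptions, not the torchaudio implementation: the helper names tik_reg and rtf_power_sketch are hypothetical, the loading term is assumed to be the matrix trace scaled by diag_eps plus a small eps, and the inputs are assumed to be complex PSD matrices of shape (..., channel, channel).

```python
import numpy as np


def tik_reg(mat, reg=1e-7, eps=1e-8):
    """Hypothetical helper: add a trace-scaled multiple of the identity.

    Assumes mat has shape (..., channel, channel).
    """
    channels = mat.shape[-1]
    trace = np.asarray(np.trace(mat, axis1=-2, axis2=-1)).real
    # eps keeps the loading non-zero even when the matrix is all-zero.
    loading = trace[..., None, None] * reg + eps
    return mat + loading * np.eye(channels, dtype=mat.dtype)


def rtf_power_sketch(psd_s, psd_n, reference_channel, n_iter=3,
                     diagonal_loading=True, diag_eps=1e-7):
    """Sketch of an RTF estimate by power iteration (not the library code)."""
    if diagonal_loading:
        psd_n = tik_reg(psd_n, reg=diag_eps)
    # Start from the reference-channel column of psd_n^{-1} psd_s.
    phi = np.linalg.solve(psd_n, psd_s)
    rtf = phi[..., reference_channel][..., None]  # shape (..., channel, 1)
    for _ in range(n_iter - 2):
        rtf = psd_s @ rtf
        rtf = np.linalg.solve(psd_n, rtf)
    rtf = psd_s @ rtf
    return rtf[..., 0]


# Toy input: rank-one speech PSD built from a made-up steering vector h,
# and an all-zero noise PSD, which is singular without diagonal loading.
h = np.array([1 + 1j, 2 - 1j, 0.5 + 0.5j])
psd_s = np.outer(h, h.conj())[None]            # shape (1, 3, 3)
psd_n = np.zeros((1, 3, 3), dtype=complex)
rtf = rtf_power_sketch(psd_s, psd_n, reference_channel=0)
```

With diagonal_loading=False, the same all-zero noise PSD makes np.linalg.solve raise LinAlgError, which is the kind of failure the loading term is meant to avoid.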