specaug speedup #6347
Conversation
Very cool PR! The speedup pushes this on par with the Numba method (Numba is maybe a few milliseconds faster, but that doesn't matter for CPU use, and the speedup is substantial for GPU).
PR looks good, but I'll ask @VahidooX for final review.
@@ -44,15 +45,15 @@ def input_types(self):
        """Returns definitions of module input types
        """
        return {
            "input_spec": NeuralType(('B', 'D', 'T'), SpectrogramType()),
Could you revert these changes?
Done... this was black's default (string normalization) that was done automatically.
@@ -120,13 +126,13 @@ class SpecCutout(nn.Module, Typing):
    def input_types(self):
        """Returns definitions of module input types
        """
-       return {"input_spec": NeuralType(('B', 'D', 'T'), SpectrogramType())}
+       return {"input_spec": NeuralType(("B", "D", "T"), SpectrogramType())}
Revert below.
done
Thanks!
LGTM, thanks for the contribution!
Seems like you need to sign your commits - https://github.com/NVIDIA/NeMo/pull/6347/checks?check_run_id=12461951009
Force-pushed from 3e0e0ac to c44deba (Compare).
Just a suggestion: please have a look at torchaudio.
Torchaudio is an optional dependency in NeMo and is not installed automatically, due to its dependency on a fixed PyTorch version and compilation. This is a more generic PR that has no requirements other than torch.
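For reference, the torchaudio route mentioned above would look roughly like the sketch below (it assumes the optional torchaudio dependency is installed; the parameter values are only illustrative and this is not part of the PR):

```python
import torch
import torchaudio.transforms as T

spec = torch.randn(32, 80, 1000)                # (batch, freq, time)
freq_mask = T.FrequencyMasking(freq_mask_param=27)   # masks a random frequency band
time_mask = T.TimeMasking(time_mask_param=100)        # masks a random time span
masked = time_mask(freq_mask(spec))
```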
@1-800-BAD-CODE thanks for the PR!
* [Core] return_config=True now extracts just config, not full tarfile (NVIDIA#6346)
  Signed-off-by: smajumdar <titu1994@gmail.com>
  Signed-off-by: shane carroll <shane.carroll@utsa.edu>
* specaug speedup
  Signed-off-by: shane carroll <shane.carroll@utsa.edu>
* comments
  Signed-off-by: shane carroll <shane.carroll@utsa.edu>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
  Signed-off-by: shane carroll <shane.carroll@utsa.edu>
---------
Signed-off-by: smajumdar <titu1994@gmail.com>
Signed-off-by: shane carroll <shane.carroll@utsa.edu>
Co-authored-by: Somshubra Majumdar <titu1994@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: hsiehjackson <c2hsieh@ucsd.edu>
What does this PR do?
Faster implementation of non-numba specaug
Collection: ASR
Changelog
Rather than repeatedly modifying the features tensor in place, build a mask and fill the features tensor once. By my measurements, this is about 20x faster on GPU.
It could be faster still, but then it becomes hard to compare the output exactly against the original implementation to verify correctness.
Also, the original implementation seems to be biased away from the upper frequency bins, but again, fixing that would make it impossible to compare the outputs to the original implementation.
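To make the mask-and-fill idea concrete, here is a minimal sketch of the approach (my own illustration, not the actual code in this PR; the function name, parameter names, and defaults are assumptions):

```python
import torch


def spec_augment_masked(
    spec: torch.Tensor,
    freq_masks: int = 2,
    freq_width: int = 27,
    time_masks: int = 2,
    time_width: int = 100,
    mask_value: float = 0.0,
) -> torch.Tensor:
    """spec: (B, D, T) batch of spectrograms."""
    b, d, t = spec.shape
    mask = torch.zeros_like(spec, dtype=torch.bool)

    # Frequency masks: draw per-example start/width, build (B, D) bands, broadcast over time.
    freq_idx = torch.arange(d, device=spec.device).unsqueeze(0)  # (1, D)
    for _ in range(freq_masks):
        width = torch.randint(0, freq_width + 1, (b, 1), device=spec.device)
        start = torch.randint(0, d, (b, 1), device=spec.device)
        band = (freq_idx >= start) & (freq_idx < start + width)  # (B, D)
        mask |= band.unsqueeze(-1)                                # broadcast over T

    # Time masks: same idea along the time axis, broadcast over frequency.
    time_idx = torch.arange(t, device=spec.device).unsqueeze(0)  # (1, T)
    for _ in range(time_masks):
        width = torch.randint(0, time_width + 1, (b, 1), device=spec.device)
        start = torch.randint(0, t, (b, 1), device=spec.device)
        band = (time_idx >= start) & (time_idx < start + width)  # (B, T)
        mask |= band.unsqueeze(1)                                 # broadcast over D

    # One fused fill instead of many small in-place slice writes.
    return spec.masked_fill(mask, mask_value)
```

The key point is that all random regions are accumulated into one boolean mask and applied with a single `masked_fill`, instead of launching a separate in-place write per cut region.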
Usage
This snippet runs both the original implementation and this one, verifies that the outputs are similar, and prints latencies.
Test snippet
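The actual test snippet is collapsed above; the following is only a rough skeleton of what such a comparison could look like. The two callables are hypothetical placeholders, not NeMo symbols, and should be replaced with the real modules when reproducing:

```python
import time

import torch


def original_specaug(x):
    # Placeholder for the old in-place implementation.
    return x


def masked_specaug(x):
    # Placeholder for the new mask-and-fill implementation.
    return x


def latency_ms(fn, spec, iters=100):
    for _ in range(3):  # warm-up
        fn(spec.clone())
    if spec.is_cuda:
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        fn(spec.clone())
    if spec.is_cuda:
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters * 1e3


device = "cuda" if torch.cuda.is_available() else "cpu"
spec = torch.randn(32, 80, 1000, device=device)  # (B, D, T)

torch.manual_seed(0)
t_old = latency_ms(original_specaug, spec)
torch.manual_seed(0)
t_new = latency_ms(masked_specaug, spec)
print(f"original: {t_old:.2f} ms/iter  masked: {t_new:.2f} ms/iter")
# To check that outputs match, reseed before each call and compare with torch.allclose.
```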
Expected output:
Before your PR is "Ready for review"
Pre checks:
PR Type: