[Wav2Vec2] Improve SpecAugment function by converting numpy based function to pytorch based function #10494
Conversation
mask_idcs.append(np.unique(mask_idc[mask_idc < sz]))
mask_idc = torch.randperm(sz - min_len)[:num_mask]
mask_idc = torch.from_numpy(
    np.asarray([mask_idc[j] + offset for j in range(len(mask_idc)) for offset in range(lengths[j])])
Let's try not to use np here. If we use torch.from_numpy(...), it's not GPU-friendly.
mask_idc = torch.tensor([mask_idc[j] + offset for j in range(len(mask_idc)) for offset in range(lengths[j])])
I just did this. Is this right?
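In case it helps, here is a minimal pure-torch sketch of that expansion step (toy values and hypothetical variable names, not the PR's actual code), so the indices stay on the tensor's device instead of going through numpy:

import torch

# Hypothetical toy inputs: mask span start indices and per-span lengths.
mask_starts = torch.tensor([3, 10, 20])
lengths = torch.tensor([2, 3, 2])

# Repeat each start lengths[j] times, then add the within-span offsets
# 0 .. lengths[j]-1, so the whole expansion stays in torch.
repeated = torch.repeat_interleave(mask_starts, lengths)
span_offsets = torch.arange(int(lengths.sum()), device=mask_starts.device) - torch.repeat_interleave(
    torch.cumsum(lengths, dim=0) - lengths, lengths
)
mask_idc = repeated + span_offsets
print(mask_idc)  # tensor([ 3,  4, 10, 11, 12, 20, 21])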
min_len = min([len(m) for m in mask_idcs])
min_len = torch.min(mask_idcs)
for i, mask_idc in enumerate(mask_idcs):
Let's try to get rid of the for-loop and do tensor operations only
I am not sure how to do this. I tried something like the following, but I'm not sure how to put it together. Can you help here?
mask[i, mask_idc] = [True, torch.randperm(mask_idc)[:min_len] if torch.tensor(mask_idcs).size() > min_len]
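Or maybe something along these lines? Just a rough sketch with made-up toy shapes; it still uses a comprehension to pick min_len indices per row (the rows can have different lengths), but the final write becomes a single scatter_:

import torch

# Hypothetical toy inputs: one index tensor per batch row and an all-False mask.
mask_idcs = [torch.tensor([1, 4, 5, 9]), torch.tensor([0, 2, 7])]
batch_size, sequence_length = len(mask_idcs), 12
mask = torch.zeros(batch_size, sequence_length, dtype=torch.bool)

min_len = min(len(m) for m in mask_idcs)
# Randomly keep min_len indices per row, stack them into a (batch_size, min_len)
# tensor, and set all of them to True with one scatter_ instead of a row loop.
kept = torch.stack([m[torch.randperm(len(m))[:min_len]] for m in mask_idcs])
mask.scatter_(1, kept, True)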
We need to run benchmark tests to see how much the speed improves, both on CPU and GPU.
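For reference, a rough harness could look something like the sketch below; it only times the span-expansion step on CPU with torch.utils.benchmark (the helper names and sizes are made up, not from this PR), and the torch variant could be re-run with the tensors moved to CUDA:

import numpy as np
import torch
from torch.utils import benchmark

# Toy inputs (arbitrary sizes) for a rough comparison of the two approaches.
starts = torch.randint(0, 1000, (500,))
lengths = torch.full((500,), 10)

def expand_np(starts, lengths):
    # numpy-based expansion of start indices into spans
    s, l = starts.numpy(), lengths.numpy()
    return np.asarray([s[j] + off for j in range(len(s)) for off in range(l[j])])

def expand_pt(starts, lengths):
    # torch-only expansion of start indices into spans
    offs = torch.arange(int(lengths.sum())) - torch.repeat_interleave(torch.cumsum(lengths, dim=0) - lengths, lengths)
    return torch.repeat_interleave(starts, lengths) + offs

for name, fn in [("numpy", expand_np), ("torch", expand_pt)]:
    timer = benchmark.Timer(stmt="fn(starts, lengths)", globals={"fn": fn, "starts": starts, "lengths": lengths})
    print(name, timer.blocked_autorange())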
…ction to pytorch based function. Implements huggingface#10459; fixes some style changes.
@patrickvonplaten Can you please help with the above comments?
Hey @punitvara, at the moment I sadly don't have the time to handle the big chunk of the PR. It would be great if you could try to:
Taking a look at those PRs should help you: #9600, #9453, #6064
Closing due to inactivity. Sorry @punitvara, I saw a lot of interest from other people in opening a PR, and this one seems to have stalled. Feel free to re-open it and give it a second shot if you want :-)
I got busy with some other work. I will try to work on a different issue. If you get any PR, feel free to merge it.
…ction to pytorch based function
Implements #10459
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.