fix device setting to allow using accelerator cpu #8084

Merged on Dec 31, 2023 (3 commits)


orena1
Contributor

@orena1 orena1 commented Dec 26, 2023

What does this PR do ?

It is currently impossible to run wav2vec with trainer.accelerator=cpu on a machine that has a GPU.
The length tensor is already on the device set by trainer.accelerator, but on a machine with a GPU the module-level DEVICE variable contains cuda, so padding_mask is created on the GPU and the comparison fails with this error:

File /mnt/NVM/oren/anaconda3_10/envs/nemo/lib/python3.10/site-packages/nemo/collections/asr/modules/wav2vec_modules.py:348, in Wav2VecTransformerEncoder.create_padding_mask(self, length)
    345 padding_mask = torch.arange(max_len, device=DEVICE)
    347 # Switch to binary for transformer, 1 for valid tokens, 0 for padding
--> 348 padding_mask = (padding_mask.expand(len(length), max_len) < length.unsqueeze(1)).type(torch.uint8)
    350 return padding_mask

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
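
The mismatch in the traceback can be avoided by building the mask on whatever device `length` already lives on, instead of a module-level constant. The following is only a minimal sketch of that idea, not the merged diff: the standalone function, the way `max_len` is derived, and the sample `lengths` tensor are assumptions made for illustration.

```python
import torch


def create_padding_mask(length: torch.Tensor) -> torch.Tensor:
    """Return a (batch, max_len) uint8 mask: 1 for valid tokens, 0 for padding."""
    # Assumption: max_len is taken from the longest sequence in the batch.
    max_len = int(length.max())
    # Create the index range on the same device as `length`, so the comparison
    # below never mixes a CUDA tensor with a CPU tensor.
    positions = torch.arange(max_len, device=length.device)
    # Switch to binary for transformer, 1 for valid tokens, 0 for padding.
    mask = positions.expand(len(length), max_len) < length.unsqueeze(1)
    return mask.type(torch.uint8)


# Hypothetical batch of three sequence lengths, kept on the CPU.
lengths = torch.tensor([3, 1, 2])
mask = create_padding_mask(lengths)
```

Because the device is read from the input tensor, the same function works whether the trainer places the batch on the CPU or on a GPU.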

PR Type:

  • Bugfix

Additional Information

  • Related to # (issue)

@titu1994, @redoctopus, @jbalam-nv, or @okuchaiev

Signed-off-by: Oren Amsalem <oren.a4@gmail.com>
@github-actions github-actions bot added the ASR label Dec 26, 2023
Collaborator

@titu1994 titu1994 left a comment

Makes sense, thanks for the fix! @tbartley94 for final review.

@titu1994
Collaborator

jenkins

@orena1
Contributor Author

orena1 commented Dec 26, 2023

Thanks @titu1994, see also here:
#8085

@titu1994
Collaborator

jenkins

@titu1994 titu1994 merged commit c9c033d into NVIDIA:main Dec 31, 2023
9 checks passed
@tbartley94
Collaborator

@titu1994 I know you already pushed, but just for the record: I'm not seeing any issues, so the review is approved.

@titu1994
Collaborator

titu1994 commented Jan 2, 2024

Oh, thanks for the update. I think I misremembered this as having your approval on Slack; it turns out that was a different topic on a different PR. Sorry about that.

@orena1
Contributor Author

orena1 commented Jan 2, 2024

@titu1994 or @tbartley94, do you have any idea about this one:
#8085

@tbartley94
Collaborator

Hmm, lemme take a look. I'd wager it's some tech debt between PTL versions. (fervor for wav2vec dipped before PTL 2.0 was released, so the code is likely working off a dated namespace.)

@orena1 orena1 deleted the patch-1 branch January 3, 2024 01:20
pzelasko pushed a commit to pzelasko/NeMo that referenced this pull request Jan 3, 2024
* fix device setting to allow using accelerator cpu

Signed-off-by: Oren Amsalem <oren.a4@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Oren Amsalem <oren.a4@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
ssh-meister pushed a commit to ssh-meister/NeMo that referenced this pull request Feb 15, 2024
* fix device setting to allow using accelerator cpu

Signed-off-by: Oren Amsalem <oren.a4@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Oren Amsalem <oren.a4@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Sasha Meister <ameister@nvidia.com>
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024
* fix device setting to allow using accelerator cpu

Signed-off-by: Oren Amsalem <oren.a4@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Oren Amsalem <oren.a4@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>