feat: support non-cuda devices for text and vision models #233

Open. dvrogozh wants to merge 2 commits into main.


@dvrogozh dvrogozh commented Dec 3, 2024

This PR adds support for non-CUDA PyTorch backend devices to text and vision models. It also extends the existing tests to run on an externally specified device (cuda is the default). Verified on the Llama3.2-3B-Instruct and Llama3.2-11B-Vision-Instruct models for:

  • cuda device type on NVIDIA A10 GPU (to check for regressions)
  • cpu device type
  • xpu device type on Intel Data Center Max Series GPU (PVC)
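
As a rough illustration of running tests on an externally specified device with cuda as the default, the sketch below reads the device type from an environment variable. The variable name `DEVICE` and the helper `resolve_device` are hypothetical and are not taken from this PR; only the three device types listed above are accepted.

```python
import os

# Device types verified in this PR; anything else is outside this sketch.
KNOWN_DEVICE_TYPES = ("cuda", "cpu", "xpu")


def resolve_device(env=None, default="cuda"):
    """Return the device type requested via the (hypothetical) DEVICE variable.

    Falls back to `default` ("cuda") when the variable is not set.
    """
    env = os.environ if env is None else env
    name = env.get("DEVICE", default).strip().lower()
    if name not in KNOWN_DEVICE_TYPES:
        raise ValueError(f"unsupported device type: {name!r}")
    return name


# With PyTorch installed, the returned string can be passed to torch.device:
#   import torch
#   x = torch.zeros(2, 2, device=torch.device(resolve_device()))
```

Under these assumptions, a run such as `DEVICE=xpu pytest ...` would select the XPU path, while an unset variable keeps the existing cuda behavior.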

Note that this PR requires a fix on the PyTorch side for the gloo torch.distributed backend, which restores TLS on gloo worker threads. That fix has been merged into PyTorch and should be included in PyTorch 2.6.
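
To make the dependency concrete, here is a minimal sketch of how a distributed backend might be chosen per device type and how the expected PyTorch version could be checked. The mapping (NCCL for cuda, gloo for everything else) and both helper names are assumptions for illustration, not code from this PR. The version check is only a heuristic: it reflects the expectation that the fix ships in 2.6 and cannot tell whether a given nightly build actually contains it.

```python
def pick_backend(device_type):
    """Assumed mapping: NCCL for CUDA devices, gloo for cpu, xpu and others."""
    return "nccl" if device_type == "cuda" else "gloo"


def gloo_fix_expected(torch_version):
    """Return True if the version string is 2.6 or newer.

    Accepts strings shaped like torch.__version__, e.g. "2.6.0+cu124".
    """
    release = torch_version.split("+")[0]
    major, minor = (int(part) for part in release.split(".")[:2])
    return (major, minor) >= (2, 6)


# With PyTorch installed and the usual rendezvous environment configured:
#   import torch
#   torch.distributed.init_process_group(backend=pick_backend("xpu"))
```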

This PR supersedes #165 from @anordin95.

Requires: pytorch/pytorch#142184

dvrogozh and others added 2 commits December 6, 2024 13:39
This commit adds support of non-cuda pytorch backend devices
to text models. Commit extends existing test to run for the
externally specified device (cuda is a default). Commit verified on
Llama3.2-3B-Instruct model for:
* "cuda" device type on NVidia A10 GPU
* "cpu" device type
* "xpu" device type on Intel Data Center Max Series GPU (PVC)

Co-authored-by: anordin95 <alexander.f.nordin@gmail.com>
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

This commit adds support of non-cuda pytorch backend devices
to vision models. Commit verified on Llama3.2-11B-Vision-Instruct
model for:
* "cuda" device type on NVidia A10 GPU
* "cpu" device type
* "xpu" device type on Intel Data Center Max Series GPU (PVC)

Note that this commit requires a fix on pytorch side for gloo
torch distributed backend to restore TLS on gloo working threads.

Requires: pytorch/pytorch#142184
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
@dvrogozh dvrogozh changed the title from "feat: support non-cuda devices for text models" to "feat: support non-cuda devices for text and vision models" Dec 6, 2024
@dvrogozh dvrogozh (Author) commented Dec 6, 2024

Updated the PR with the changes needed to make vision models work (see the 2nd commit). To verify, I ran the following test after commenting out its skip conditions:

@unittest.skip("Disabling vision model test")
@pytest.mark.skip(reason="Disabling vision model test")
def test_run_generation(self):
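
As an alternative to commenting the decorators out by hand, a skip condition can be made opt-in. The sketch below is only an illustration: the environment variable `RUN_VISION_TESTS`, the class name, and the placeholder test body are all hypothetical and do not come from this PR.

```python
import os
import unittest

# Hypothetical opt-in switch; the test stays skipped unless it is set to "1".
RUN_VISION = os.environ.get("RUN_VISION_TESTS", "0") == "1"


class VisionModelTest(unittest.TestCase):
    @unittest.skipUnless(RUN_VISION, "Disabling vision model test")
    def test_run_generation(self):
        # Placeholder body; the real test would run generation on the model.
        self.assertTrue(True)
```

With this shape, setting `RUN_VISION_TESTS=1` enables the test without editing the source file.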

Labels: CLA Signed