feat: support non-cuda devices for text and vision models #233

dvrogozh · 2024-12-03T22:05:56Z

This commit adds support for non-CUDA PyTorch backend devices to the text and vision models. It also extends the existing tests to run on an externally specified device (cuda is the default). The commit was verified on the Llama3.2-3B-Instruct and Llama3.2-11B-Vision-Instruct models for:

cuda device type on NVIDIA A10 GPU (to check for regressions)
cpu device type
xpu device type on Intel Data Center Max Series GPU (PVC)
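For readers who want to see what "externally specified device" could look like in a test, here is a minimal sketch. It is not the code from this PR: the environment variable name `DEVICE`, the helper `resolve_device`, and the whitelist of device types are all assumptions made for illustration. Only the default of `cuda` and the three verified device types come from the description above.

```python
import os

# The three device types listed as verified in the PR description.
# Treating them as a whitelist is an illustrative choice, not PR behavior.
VERIFIED_DEVICE_TYPES = ("cuda", "cpu", "xpu")


def resolve_device(env=None, var_name="DEVICE", default="cuda"):
    """Return the device type requested via the environment.

    `var_name` is a hypothetical variable name; "cuda" is the default,
    as stated in the PR description.
    """
    env = os.environ if env is None else env
    device = env.get(var_name, default).strip().lower()
    if device not in VERIFIED_DEVICE_TYPES:
        raise ValueError(f"unsupported device type: {device!r}")
    return device
```

With a helper like this, a run such as `DEVICE=xpu pytest ...` would select the xpu device, while an unset variable keeps the cuda default.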

Note that this commit requires a fix on the PyTorch side so that the gloo torch distributed backend restores TLS on gloo worker threads. That fix has been merged into PyTorch and should ship in PyTorch 2.6.
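To illustrate why the gloo fix matters for non-CUDA devices, the sketch below shows one way a distributed backend could be chosen from the device type. The mapping (NCCL for cuda, gloo for everything else) and the helper name `pick_distributed_backend` are assumptions for illustration; the description above only states that gloo is involved.

```python
def pick_distributed_backend(device_type: str) -> str:
    """Pick a torch.distributed backend name for a device type.

    Assumption: NCCL is used for CUDA and gloo for every other device
    type (for example cpu or xpu). This is a sketch, not the PR's code.
    """
    return "nccl" if device_type == "cuda" else "gloo"


# The returned name would typically be passed on like this (requires PyTorch):
#   torch.distributed.init_process_group(backend=pick_distributed_backend(device))
```

Under this assumption, both the cpu and xpu runs go through gloo, which is where the PyTorch-side fix would apply.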

This PR supersedes #165 from @anordin95.

Requires: pytorch/pytorch#142184

Requires: meta-llama/llama-models#233 Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

This commit adds support of non-cuda pytorch backend devices to text models. Commit extends existing test to run for the externally specified device (cuda is a default). Commit verified on Llama3.2-3B-Instruct model for:

* "cuda" device type on NVidia A10 GPU
* "cpu" device type
* "xpu" device type on Intel Data Center Max Series GPU (PVC)

Co-authored-by: anordin95 <alexander.f.nordin@gmail.com>
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

This commit adds support of non-cuda pytorch backend devices to vision models. Commit verified on Llama3.2-11B-Vision-Instruct model for:

* "cuda" device type on NVidia A10 GPU
* "cpu" device type
* "xpu" device type on Intel Data Center Max Series GPU (PVC)

Note that this commit requires a fix on pytorch side for gloo torch distributed backend to restore TLS on gloo working threads.

Requires: pytorch/pytorch#142184
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

dvrogozh · 2024-12-06T21:53:57Z

Updated the PR with the changes needed to make vision models work (see the 2nd commit). To test, I ran the following with the skip conditions commented out:

llama-models/models/llama3/tests/api/test_generation.py

Lines 79 to 81 in 804a64f

    @unittest.skip("Disabling vision model test")
    @pytest.mark.skip(reason="Disabling vision model test")
    def test_run_generation(self):
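As an alternative to commenting out the skip decorators by hand, a conditional skip could be used. The sketch below is only an illustration: the `RUN_VISION_TESTS` environment variable and the `VisionGenerationTest` class are hypothetical and not part of the repository, and the test body is a placeholder.

```python
import os
import unittest

# Hypothetical opt-in flag: set RUN_VISION_TESTS=1 to enable the test.
RUN_VISION = os.environ.get("RUN_VISION_TESTS", "0") == "1"


class VisionGenerationTest(unittest.TestCase):
    @unittest.skipUnless(RUN_VISION, "Disabling vision model test")
    def test_run_generation(self):
        # Placeholder body; the real test would run the vision model.
        self.assertTrue(True)
```

The test stays skipped by default, so existing runs behave as before, and it can be enabled for a single run without editing the file.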

dvrogozh requested review from ashwinb, yanxi0830, hardikjshah, dltn and raghotham as code owners December 3, 2024 22:05

facebook-github-bot added the CLA Signed label Dec 3, 2024

dvrogozh mentioned this pull request Dec 3, 2024

Add optional arg to specify device for Transformer model. #165

Closed

dvrogozh added a commit to dvrogozh/llama-stack that referenced this pull request Dec 3, 2024

feat: enable xpu support for meta-reference stack

4456af7

Requires: meta-llama/llama-models#233 Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

dvrogozh and others added 2 commits December 6, 2024 13:39

dvrogozh force-pushed the devices branch from d8a885d to 563e2a1 Compare December 6, 2024 21:41

dvrogozh changed the title ~~feat: support non-cuda devices for text models~~ feat: support non-cuda devices for text and vision models Dec 6, 2024
