pytorch: safetensors library hardcodes using CUDA if only device index is provided #499
Comments
FYI, pytorch/pytorch#129119 got merged, so the solution I outlined should now be possible.
dvrogozh added a commit to dvrogozh/safetensors that referenced this issue on Jul 15, 2024:

Fixes: huggingface#499, huggingface/transformers#31941. In some cases only a device index is given when querying a device. In this case both PyTorch and Safetensors used to return 'cuda:N' by default, which causes runtime failures if the user actually runs on a non-CUDA device and does not have CUDA at all. This was recently addressed on the PyTorch side [1]: starting from PyTorch 2.5, calling 'torch.device(N)' returns the current device instead of a CUDA device. This commit makes a similar change to Safetensors: if only a device index is given, Safetensors queries and returns the device by calling 'torch.device(N)'. The change is backward compatible, since this call returns 'cuda:N' on PyTorch <= 2.4, which matches the previous Safetensors behavior. See [1]: pytorch/pytorch#129119. Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
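The backward-compatible behavior the commit describes can be sketched in plain Python. This is a hypothetical model of the version-dependent resolution, not the real safetensors or PyTorch code; `resolve_device`, the version tuple, and `current_accelerator` are stand-ins:

```python
# Hypothetical sketch of the version-dependent device resolution described
# in the commit message; not the actual safetensors implementation.
def resolve_device(index, torch_version, current_accelerator):
    """Resolve a bare device index the way torch.device(N) would."""
    if torch_version >= (2, 5):
        # PyTorch 2.5+: torch.device(N) returns the current accelerator
        return f"{current_accelerator}:{index}"
    # PyTorch <= 2.4: torch.device(N) always resolves to CUDA
    return f"cuda:{index}"

print(resolve_device(0, (2, 5), "xpu"))  # xpu:0  (new behavior)
print(resolve_device(0, (2, 4), "xpu"))  # cuda:0 (old behavior, preserved for compatibility)
```

Delegating to `torch.device(N)` rather than reimplementing the mapping means Safetensors automatically tracks whatever default-accelerator logic the installed PyTorch uses.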
I have implemented a fix for this issue as I see it. Please help review #500.
dvrogozh added a commit to dvrogozh/safetensors that referenced this issue on Jul 25, 2024.
dvrogozh added a commit to dvrogozh/safetensors that referenced this issue on Jul 30, 2024.
dvrogozh added a commit to dvrogozh/safetensors that referenced this issue on Jul 31, 2024.
Closed by #509.
In relevance to:

The safetensors library hardcodes returning a CUDA device if only a device index is provided. This causes runtime errors when running Hugging Face models with pipeline(device_map="auto"), as noted in huggingface/transformers#31941 (see that issue for repro steps). The hardcoding happens here: safetensors/bindings/python/src/lib.rs, lines 296 to 297 in 079781f.

A possible solution might be to return the device returned by torch.device(N). Note, however, that this will work for non-CUDA devices only after the following change in PyTorch is merged: it modifies the behavior of torch.device(N) to return the current accelerator device instead of a CUDA device. This seems to be an anticipated change for Hugging Face, according to "fix bug when getting the real accelerator's device number" (accelerate#2874 (comment)).

CC: @faaany @muellerzr @SunMarc @guangyey
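The failure mode can be illustrated with a small stand-in for the hardcoded path. This is a hypothetical Python model of the logic in lib.rs, not the real bindings; `hardcoded_device` is an invented name:

```python
# Hypothetical stand-in for the hardcoded branch in
# safetensors/bindings/python/src/lib.rs: a bare integer index is always
# mapped to "cuda:N", regardless of which accelerator is actually present.
def hardcoded_device(spec):
    if isinstance(spec, int):
        return f"cuda:{spec}"  # the problematic hardcoding
    return str(spec)           # explicit device strings pass through

# On a CUDA-less machine with, say, an Intel XPU, device_map="auto" can hand
# safetensors a bare index, which then resolves to a nonexistent CUDA device.
print(hardcoded_device(0))        # cuda:0 even when no CUDA is present
print(hardcoded_device("xpu:0"))  # xpu:0
```

Any code path that passes a bare index through this branch will attempt a CUDA allocation at load time, which is why the error only surfaces on machines without CUDA.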