pytorch: safetensors library hardcodes using CUDA if only device index is provided #499
Comments
FYI, pytorch/pytorch#129119 got merged, so the solution I outlined should now be possible.
dvrogozh added a commit to dvrogozh/safetensors that referenced this issue on Jul 15, 2024:

Fixes: huggingface#499, huggingface/transformers#31941. In some cases only a device index is given when querying a device. In this case both PyTorch and Safetensors used to return 'cuda:N' by default, which causes runtime failures if the user actually runs on a non-CUDA device and does not have CUDA at all. This was recently addressed on the PyTorch side [1]: starting from PyTorch 2.5, calling 'torch.device(N)' returns the current device instead of a CUDA device. This commit makes a similar change to Safetensors: if only a device index is given, Safetensors queries and returns the device by calling 'torch.device(N)'. The change is backward compatible, since this call returns 'cuda:N' on PyTorch <= 2.4, which matches the previous Safetensors behavior. See [1]: pytorch/pytorch#129119. Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
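The backward-compatible behavior the commit describes can be sketched in plain Python. This is a hypothetical model of the version-dependent resolution, not the real safetensors or PyTorch code; `resolve_device`, the version tuple, and `current_accelerator` are stand-ins:

```python
# Hypothetical sketch of the version-dependent device resolution described
# in the commit message; not the actual safetensors implementation.
def resolve_device(index, torch_version, current_accelerator):
    """Resolve a bare device index the way torch.device(N) would."""
    if torch_version >= (2, 5):
        # PyTorch 2.5+: torch.device(N) returns the current accelerator
        return f"{current_accelerator}:{index}"
    # PyTorch <= 2.4: torch.device(N) always resolves to CUDA
    return f"cuda:{index}"

print(resolve_device(0, (2, 5), "xpu"))  # xpu:0  (new behavior)
print(resolve_device(0, (2, 4), "xpu"))  # cuda:0 (old behavior, preserved for compatibility)
```

Delegating to `torch.device(N)` rather than reimplementing the mapping means Safetensors automatically tracks whatever default-accelerator logic the installed PyTorch uses.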
I have implemented a fix for this issue as I see it. Please help review #500.
dvrogozh added a commit to dvrogozh/safetensors that referenced this issue on Jul 25, 2024.
dvrogozh added a commit to dvrogozh/safetensors that referenced this issue on Jul 30, 2024.
dvrogozh added a commit to dvrogozh/safetensors that referenced this issue on Jul 31, 2024.
Closed by #509.
In relevance to:

The safetensors library hardcodes returning a CUDA device if only a device index is provided. This causes runtime errors when running Hugging Face models with pipeline(device_map="auto"), as noted in huggingface/transformers#31941 (see that issue for repro steps). The hardcoding happens here: safetensors/bindings/python/src/lib.rs, lines 296 to 297 in 079781f.

A possible solution might be to return the device returned by torch.device(N). Note, however, that this will work for non-CUDA devices only after the following change in PyTorch is merged: it modifies the behavior of torch.device(N) to return the current accelerator device instead of a CUDA device. This seems to be an anticipated change for Hugging Face, according to "fix bug when getting the real accelerator's device number" (accelerate#2874 (comment)).

CC: @faaany @muellerzr @SunMarc @guangyey
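The failure mode can be illustrated with a small stand-in for the hardcoded path. This is a hypothetical Python model of the logic in lib.rs, not the real bindings; `hardcoded_device` is an invented name:

```python
# Hypothetical stand-in for the hardcoded branch in
# safetensors/bindings/python/src/lib.rs: a bare integer index is always
# mapped to "cuda:N", regardless of which accelerator is actually present.
def hardcoded_device(spec):
    if isinstance(spec, int):
        return f"cuda:{spec}"  # the problematic hardcoding
    return str(spec)           # explicit device strings pass through

# On a CUDA-less machine with, say, an Intel XPU, device_map="auto" can hand
# safetensors a bare index, which then resolves to a nonexistent CUDA device.
print(hardcoded_device(0))        # cuda:0 even when no CUDA is present
print(hardcoded_device("xpu:0"))  # xpu:0
```

Any code path that passes a bare index through this branch will attempt a CUDA allocation at load time, which is why the error only surfaces on machines without CUDA.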