System Info
Working on a Kubernetes deployment with Debian + PyTorch 2.4.0 + ROCm 6.1.
The deployment uses the multi-backend alpha release available in the parent bitsandbytes repo.
Reproduction
Trying to load a model with bitsandbytes fails because there is no access to rocminfo.
def get_rocm_gpu_arch() -> str:
    logger = logging.getLogger(__name__)
    try:
        if torch.version.hip:
            result = subprocess.run(["rocminfo"], capture_output=True, text=True)
            match = re.search(r"Name:\s+gfx([a-zA-Z\d]+)", result.stdout)
ERROR:bitsandbytes.cuda_specs:Could not detect ROCm GPU architecture: [Errno 2] No such file or directory: 'rocminfo'
WARNING:bitsandbytes.cuda_specs:
ROCm GPU architecture detection failed despite ROCm being available.
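To confirm that the failure really comes from the missing binary rather than from the GPU itself, a quick check like the following can be run inside the container. This is only a sketch using the standard library; the helper name rocminfo_available is made up for illustration and is not part of bitsandbytes.

```python
import shutil


def rocminfo_available() -> bool:
    """Return True if a rocminfo executable can be found on PATH."""
    return shutil.which("rocminfo") is not None


# In the failing container this is expected to report False.
print("rocminfo on PATH:", rocminfo_available())
```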
The get_rocm_gpu_arch snippet above comes from bitsandbytes/bitsandbytes/cuda_specs.py (line 54 at commit 4aad810).
Expected behavior
I would prefer to be able to set the architecture via an environment variable, with rocminfo as the fallback when the variable is not set. The related code snippet is the get_rocm_gpu_arch excerpt shown above.
Happy to work on this if other people feel it is a good workaround.
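A minimal sketch of the proposed behavior, to make the request concrete. Several things here are assumptions and not taken from bitsandbytes: the environment variable name BNB_ROCM_GPU_ARCH, the helper names, and the "unknown" return value used when detection fails. The torch.version.hip check from the existing function is left out so the sketch runs without PyTorch installed.

```python
import logging
import os
import re
import shutil
import subprocess

logger = logging.getLogger(__name__)

# Hypothetical variable name, chosen for illustration only.
ROCM_ARCH_ENV_VAR = "BNB_ROCM_GPU_ARCH"


def parse_rocminfo_arch(rocminfo_stdout: str) -> str:
    """Extract the first gfx architecture name from rocminfo output."""
    # Same regular expression as in the existing get_rocm_gpu_arch snippet.
    match = re.search(r"Name:\s+gfx([a-zA-Z\d]+)", rocminfo_stdout)
    # "unknown" as the failure value is an assumption of this sketch.
    return "gfx" + match.group(1) if match else "unknown"


def get_rocm_gpu_arch_with_override() -> str:
    """Prefer the environment variable, then fall back to rocminfo."""
    override = os.environ.get(ROCM_ARCH_ENV_VAR, "").strip()
    if override:
        return override

    # Avoid the "[Errno 2] No such file or directory" failure up front.
    if shutil.which("rocminfo") is None:
        logger.warning(
            "rocminfo not found and %s is not set", ROCM_ARCH_ENV_VAR
        )
        return "unknown"

    try:
        result = subprocess.run(["rocminfo"], capture_output=True, text=True)
    except OSError as exc:
        logger.error("Could not run rocminfo: %s", exc)
        return "unknown"
    return parse_rocminfo_arch(result.stdout)
```

With something like this, a deployment without rocminfo could set the variable in the pod spec (for example to gfx90a), while existing setups that have rocminfo installed would keep the current detection path unchanged.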