Is xformers with ZLUDA possible? #23
Comments
Do you just need ComfyUI to work? If so, try WSL with ROCm. It supports Flash Attention 2.
I'm trying it now... when did this come out?
Very recently. Are you on gfx1100? (RX 7900 XT(X), GRE, etc.)
Yes, a 7900 XT.
So I've been testing the ROCm driver for WSL. There are still use cases for ZLUDA with PyTorch, particularly pertaining to https://github.com/hpcaitech/Open-Sora, which seems to need CUDA. I find ROCm is about 2-3x faster than ZLUDA with PyTorch.
Could you benchmark it following the instructions on this page?
I am trying to compile from source right now, and once I get that working I have to make sure my environment is stable. If I accomplish this, I'll benchmark.
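When comparing ZLUDA against ROCm, GPU work is dispatched asynchronously, so naive wall-clock timing measures launch overhead rather than execution time. A minimal, backend-agnostic timing helper is sketched below; the `sync` callback (e.g. a hypothetical `torch.cuda.synchronize`) is where you would flush the device queue before reading the clock. This is an illustrative sketch, not the benchmark methodology used by the page linked above.

```python
import time
from statistics import median

def benchmark(fn, warmup=3, repeats=10, sync=None):
    """Time fn() after warmup iterations.

    sync: optional callable that blocks until queued device work
    finishes (e.g. torch.cuda.synchronize on a CUDA/ZLUDA device);
    without it, async launches make the numbers meaningless.
    """
    for _ in range(warmup):
        fn()
        if sync:
            sync()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        if sync:
            sync()
        samples.append(time.perf_counter() - start)
    med = median(samples)
    return {"median_s": med, "it_per_s": 1.0 / med}
```

Usage on a GPU workload would look like `benchmark(lambda: model(x), sync=torch.cuda.synchronize)`; running the same call under both backends gives a like-for-like comparison.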
lol... wow, I'm here again, at my own problem. Nice to see you all. @lshqqytiger, first of all -- you're a god, and we need your help. I made the journey, and now I am back here. At the moment, the best combination I have concluded would be ZLUDA on Linux or WSL2. Certain Python packages will not compile on Windows, and ROCm is a nightmare no matter what you do. In particular, with ROCm, 3D workloads are not available at the moment. This pertains to https://rocm.docs.amd.com/projects/HIP/en/latest/how-to/cooperative_groups.html and projects like https://github.com/graphdeco-inria/gaussian-splatting which depend on it.
I am having the issue I mentioned months ago again, because I have made the switch back to ZLUDA from ROCm within the past few days (mostly to test the state of the community). For basic image diffusion, everything works fairly well with ROCm on WSL2. My current issue: INFO | 2024-08-27 05:32:06 | autotrain.trainers.common:on_train_begin:230 - Starting to train...
0%| | 0/456 [00:00<?, ?it/s]ERROR | 2024-08-27 05:32:07 | autotrain.trainers.common:wrapper:120 - train has failed due to an exception: Traceback (most recent call last):
File "P:\.pytorchvenv\lib\site-packages\autotrain\trainers\common.py", line 117, in wrapper
return func(*args, **kwargs)
File "P:\.pytorchvenv\lib\site-packages\autotrain\trainers\sent_transformers\__main__.py", line 213, in train
trainer.train()
File "P:\.pytorchvenv\lib\site-packages\transformers\trainer.py", line 1938, in train
return inner_training_loop(
File "P:\.pytorchvenv\lib\site-packages\transformers\trainer.py", line 2279, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "P:\.pytorchvenv\lib\site-packages\transformers\trainer.py", line 3318, in training_step
loss = self.compute_loss(model, inputs)
File "P:\.pytorchvenv\lib\site-packages\sentence_transformers\trainer.py", line 329, in compute_loss
loss = loss_fn(features, labels)
File "P:\.pytorchvenv\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "P:\.pytorchvenv\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "P:\.pytorchvenv\lib\site-packages\sentence_transformers\losses\CoSENTLoss.py", line 79, in forward
embeddings = [self.model(sentence_feature)["sentence_embedding"] for sentence_feature in sentence_features]
File "P:\.pytorchvenv\lib\site-packages\sentence_transformers\losses\CoSENTLoss.py", line 79, in <listcomp>
embeddings = [self.model(sentence_feature)["sentence_embedding"] for sentence_feature in sentence_features]
File "P:\.pytorchvenv\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "P:\.pytorchvenv\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "P:\.pytorchvenv\lib\site-packages\accelerate\utils\operations.py", line 819, in forward
return model_forward(*args, **kwargs)
File "P:\.pytorchvenv\lib\site-packages\accelerate\utils\operations.py", line 807, in __call__
return convert_to_fp32(self.model_forward(*args, **kwargs))
File "P:\.pytorchvenv\lib\site-packages\torch\amp\autocast_mode.py", line 44, in decorate_autocast
return func(*args, **kwargs)
File "P:\.pytorchvenv\lib\site-packages\torch\nn\modules\container.py", line 250, in forward
input = module(input)
File "P:\.pytorchvenv\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "P:\.pytorchvenv\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "P:\.pytorchvenv\lib\site-packages\sentence_transformers\models\Transformer.py", line 118, in forward
output_states = self.auto_model(**trans_features, return_dict=False)
File "P:\.pytorchvenv\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "P:\.pytorchvenv\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "P:\.pytorchvenv\lib\site-packages\transformers\models\gemma2\modeling_gemma2.py", line 803, in forward
inputs_embeds = self.embed_tokens(input_ids)
File "P:\.pytorchvenv\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "P:\.pytorchvenv\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "P:\.pytorchvenv\lib\site-packages\torch\nn\modules\sparse.py", line 190, in forward
return F.embedding(
File "P:\.pytorchvenv\lib\site-packages\torch\nn\functional.py", line 2551, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: operation not supported
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
ERROR | 2024-08-27 05:32:07 | autotrain.trainers.common:wrapper:121 - CUDA error: operation not supported
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
I think this is because I have xformers installed. Any recommendations?
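One way to sidestep a broken attention package is to check at startup which backends are actually importable and fall back to PyTorch's built-in scaled dot-product attention when none are. The helper below is a hypothetical sketch (the function name and the `"sdpa"` fallback label are my own, not part of any of these libraries); it only tests importability, not whether the kernels actually run on the device.

```python
import importlib.util

def pick_attention_backend(prefer=("xformers", "flash_attn")):
    """Return the first installed attention package from `prefer`,
    else the string 'sdpa' to signal falling back to PyTorch's
    built-in torch.nn.functional.scaled_dot_product_attention."""
    for name in prefer:
        if importlib.util.find_spec(name) is not None:
            return name
    return "sdpa"
```

On a ZLUDA setup where xformers is installed but its CUDA kernels fail with "operation not supported", the stronger fix is uninstalling xformers (or rebuilding it against the ZLUDA toolchain) so the fallback path is taken.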
And apparently it does work: https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html (search that page for the relevant keyword).
I don't think xformers can work on ZLUDA without rebuilding.
I think it actually does work fine on the cu118 wheel from PyTorch. I believe I was using versions all the way up to 2.5.0, but cuDNN was causing errors.
https://github.com/ROCm/flash-attention/tree/howiejay/navi_support was what was recommended to me.
In fact, with xformers on ZLUDA, this was my error -- but I think it was working, just not for that type of workload.
I compiled ZLUDA (Finished `release` profile [optimized] target(s) in 5m 40s). I downloaded nccl from NVIDIA and placed it inside of the ZLUDA directory (P:\gitrepos\ZLUDA\nccl_2.21.5-1+cuda11.0_x86_64), then built PyTorch with pytorch-build.bat. Is it possible with this configuration to set torch.backends.cudnn.enabled = True? The error above is what I get with torch.backends.cudnn.enabled = True. Perhaps it is unrelated, but I am just trying to allow xformers to function.
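Rather than flipping torch.backends.cudnn.enabled blindly, you can probe whether cuDNN calls actually succeed on the current device and fall back if they raise. The sketch below is a hypothetical helper of my own (not part of PyTorch or ZLUDA); on ZLUDA, the tiny convolution is the kind of call that surfaces "CUDA error: operation not supported" early, before a long training run.

```python
def probe_cudnn():
    """Try a tiny cuDNN-backed convolution; return (ok, error_message).

    Catching the RuntimeError here lets a script decide to set
    torch.backends.cudnn.enabled = False and continue, instead of
    crashing mid-training the way the traceback above did.
    """
    try:
        import torch  # imported lazily so the helper degrades gracefully
        if not torch.cuda.is_available():
            return False, "no CUDA device visible"
        torch.backends.cudnn.enabled = True
        conv = torch.nn.Conv2d(3, 8, kernel_size=3).cuda()
        conv(torch.randn(1, 3, 16, 16, device="cuda"))
        torch.cuda.synchronize()  # force the async launch to complete
        return True, ""
    except (ImportError, RuntimeError) as exc:
        return False, str(exc)
```

A startup script could call this once and set `torch.backends.cudnn.enabled` to the returned `ok` value, keeping the rest of the pipeline unchanged.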