"Torch is not able to use GPU" - Fresh Kicksecure (Debian) with all the necessary packages #16653
Replies: 1 comment
-
It now works on Debian. However, the exact same steps don't work on Kicksecure. I don't care anymore; topic closed.
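Since the same steps reportedly work on Debian but not on Kicksecure, one thing that could be compared on both systems is whether the NVIDIA device nodes that CUDA opens are present. This is only an assumption about where to look, not a confirmed cause in this thread: the paths below are the conventional ones created by the NVIDIA kernel modules, and `missing_device_nodes` is a hypothetical helper name. `nvidia-smi` can typically work without `/dev/nvidia-uvm`, whereas CUDA initialization generally needs it, so a missing node would be consistent with `nvidia-smi` succeeding while PyTorch fails.

```python
import os

# Conventional device nodes created by the NVIDIA kernel modules (assumed paths).
DEFAULT_NODES = ("/dev/nvidiactl", "/dev/nvidia0", "/dev/nvidia-uvm")


def missing_device_nodes(paths=DEFAULT_NODES):
    """Return the subset of `paths` that does not exist on this system."""
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    missing = missing_device_nodes()
    if missing:
        print("Missing device nodes:", ", ".join(missing))
    else:
        print("All expected NVIDIA device nodes are present.")
```

Running this on the working Debian install and on Kicksecure and comparing the two results would show whether the environments differ at this level.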
-
Hello!
I've been struggling with this for 3 days now and have checked hundreds of posts all over the internet without success.
I installed everything described on the GitHub page, plus the NVIDIA driver, the CUDA driver, and the dependencies, but I always get the following error:
`################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye), Fedora 34+ and openSUSE Leap 15.4 or newer.
################################################################
################################################################
Running on user user
################################################################
################################################################
Repo already cloned, using it as install directory
################################################################
################################################################
Create and activate python venv
################################################################
################################################################
Launching launch.py...
################################################################
glibc version is 2.36
Check TCMalloc: libtcmalloc_minimal.so.4
libtcmalloc_minimal.so.4 is linked with libc.so,execute LD_PRELOAD=/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4
Python 3.11.2 (main, Sep 14 2024, 03:00:30) [GCC 12.2.0]
Version: v1.10.1
Commit hash: 82a973c
Traceback (most recent call last):
File "/home/user/Documents/stable-diffusion-webui/launch.py", line 48, in <module>
main()
File "/home/user/Documents/stable-diffusion-webui/launch.py", line 39, in main
prepare_environment()
File "/home/user/Documents/stable-diffusion-webui/modules/launch_utils.py", line 387, in prepare_environment
raise RuntimeError(
RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check`
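The check behind this error can be approximated by hand inside the web UI's venv, which usually shows more detail than the launcher does. The sketch below is a hypothetical helper (`torch_cuda_report` is not part of the web UI); it relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`, and it returns a placeholder report if PyTorch cannot be imported.

```python
def torch_cuda_report():
    """Collect basic facts about the PyTorch build and its view of CUDA."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the interpreter running this script.
        return {
            "torch_installed": False,
            "torch_version": None,
            "built_for_cuda": None,
            "cuda_available": False,
        }
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": bool(torch.cuda.is_available()),
    }


if __name__ == "__main__":
    for key, value in torch_cuda_report().items():
        print(f"{key}: {value}")
```

It should be run with the venv's interpreter (the log shows the venv under `/home/user/Documents/stable-diffusion-webui/venv`). If `built_for_cuda` is `None`, the installed wheel is a CPU-only build; if it is set but `cuda_available` is false, the problem more likely lies in the driver or the environment than in the PyTorch package.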
If I add the --skip-torch-cuda-test parameter, I get a different error:
`################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye), Fedora 34+ and openSUSE Leap 15.4 or newer.
################################################################
################################################################
Running on user user
################################################################
################################################################
Repo already cloned, using it as install directory
################################################################
################################################################
Create and activate python venv
################################################################
################################################################
Launching launch.py...
################################################################
glibc version is 2.36
Check TCMalloc: libtcmalloc_minimal.so.4
libtcmalloc_minimal.so.4 is linked with libc.so,execute LD_PRELOAD=/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4
Python 3.11.2 (main, Sep 14 2024, 03:00:30) [GCC 12.2.0]
Version: v1.10.1
Commit hash: 82a973c
Launching Web UI with arguments: --skip-torch-cuda-test
/home/user/Documents/stable-diffusion-webui/venv/lib/python3.11/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
No module 'xformers'. Proceeding without it.
CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero.: str
Traceback (most recent call last):
File "/home/user/Documents/stable-diffusion-webui/modules/errors.py", line 98, in run
code()
File "/home/user/Documents/stable-diffusion-webui/modules/devices.py", line 106, in enable_tf32
if cuda_no_autocast():
^^^^^^^^^^^^^^^^^^
File "/home/user/Documents/stable-diffusion-webui/modules/devices.py", line 28, in cuda_no_autocast
device_id = get_cuda_device_id()
^^^^^^^^^^^^^^^^^^^^
File "/home/user/Documents/stable-diffusion-webui/modules/devices.py", line 40, in get_cuda_device_id
) or torch.cuda.current_device()
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/Documents/stable-diffusion-webui/venv/lib/python3.11/site-packages/torch/cuda/__init__.py", line 769, in current_device
_lazy_init()
File "/home/user/Documents/stable-diffusion-webui/venv/lib/python3.11/site-packages/torch/cuda/__init__.py", line 298, in _lazy_init
torch._C._cuda_init()
RuntimeError: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/user/Documents/stable-diffusion-webui/launch.py", line 48, in <module>
main()
File "/home/user/Documents/stable-diffusion-webui/launch.py", line 44, in main
start()
File "/home/user/Documents/stable-diffusion-webui/modules/launch_utils.py", line 465, in start
import webui
File "/home/user/Documents/stable-diffusion-webui/webui.py", line 13, in <module>
initialize.imports()
File "/home/user/Documents/stable-diffusion-webui/modules/initialize.py", line 36, in imports
shared_init.initialize()
File "/home/user/Documents/stable-diffusion-webui/modules/shared_init.py", line 17, in initialize
from modules import options, shared_options
File "/home/user/Documents/stable-diffusion-webui/modules/shared_options.py", line 4, in <module>
from modules import localization, ui_components, shared_items, shared, interrogate, shared_gradio_themes, util, sd_emphasis
File "/home/user/Documents/stable-diffusion-webui/modules/interrogate.py", line 13, in <module>
from modules import devices, paths, shared, lowvram, modelloader, errors, torch_utils
File "/home/user/Documents/stable-diffusion-webui/modules/devices.py", line 113, in <module>
errors.run(enable_tf32, "Enabling TF32")
File "/home/user/Documents/stable-diffusion-webui/modules/errors.py", line 100, in run
display(task, e)
File "/home/user/Documents/stable-diffusion-webui/modules/errors.py", line 68, in display
te = traceback.TracebackException.from_exception(e)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/traceback.py", line 788, in from_exception
return cls(type(exc), exc, exc.__traceback__, *args, **kwargs)
^^^^^^^^^^^^^^^^^
AttributeError: 'str' object has no attribute '__traceback__'`
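The final `AttributeError` looks like a secondary failure rather than the root cause. The `: str` suffix on the first CUDA message and the last frame of the traceback suggest that a string, not an exception object, reached `traceback.TracebackException.from_exception`; the underlying problem is still the `RuntimeError: CUDA unknown error` raised by `torch._C._cuda_init()`. A minimal sketch of that behaviour using only the standard library (`secondary_error` is a hypothetical name, and this does not reproduce the web UI's own code):

```python
import traceback


def secondary_error(value):
    """Return the AttributeError message from_exception raises for `value`, or None."""
    try:
        traceback.TracebackException.from_exception(value)
    except AttributeError as err:
        return str(err)
    return None


if __name__ == "__main__":
    # A plain string has no traceback attribute, so formatting it fails.
    print(secondary_error("Enabling TF32"))
    # A real exception object is accepted, even without a traceback attached.
    print(secondary_error(RuntimeError("CUDA unknown error")))
```

In other words, fixing the CUDA initialization should make both tracebacks disappear; the second one only masks the first.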
I have an RTX 3080 Ti and an up-to-date, freshly installed Kicksecure, but I experienced the same issue on Debian, too.
I did the following:
Installed some important packages:
sudo apt -y install wget git python3 python3.11 python3-venv python3.11-venv python3-pip python3-launchpadlib libgl1 libglib2.0-0 software-properties-common google-perftools
Installed the CUDA drivers:
sudo apt-get install linux-headers-$(uname -r)
sudo add-apt-repository contrib
sudo dpkg -i nvidia-driver-local-repo-<distro>-X.<version>*_x86_64.deb
sudo cp /var/nvidia-driver-local-repo-<distro>-X.<version>/nvidia-driver-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-drivers nvidia-cuda-toolkit
sudo /sbin/reboot
Cloned the Stable Diffusion web UI repository, then started it:
cd /home/user/Documents/
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd /home/user/Documents/stable-diffusion-webui
sudo chmod +x webui.sh
./webui.sh
It's not clear to me what to do or where the problem is. Before this, I tried installing the NVIDIA and CUDA drivers from the Debian repositories, and I also downloaded the drivers from the NVIDIA site; nothing worked.
I have python3.11 installed, too. Everywhere I read recommends 3.10, but that version isn't offered anywhere; it's missing even from the "deadsnakes" repository.
nvidia-smi gives me the following information:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3080 Ti     Off | 00000000:01:00.0  On |                  N/A |
|  0%   43C    P8              22W / 400W |    827MiB / 12288MiB |      3%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1200      G   /usr/lib/xorg/Xorg                          362MiB |
|    0   N/A  N/A      1358      G   xfwm4                                         5MiB |
|    0   N/A  N/A     10373      G   /usr/lib/firefox-esr/firefox-esr              9MiB |
|    0   N/A  N/A     11010    C+G   /usr/lib/virtualbox/VirtualBoxVM            427MiB |
+---------------------------------------------------------------------------------------+
Do I need a different CUDA version? If so, which one, and where is that stated? A different Python version? What are the currently supported versions of each component?
I remember that the last time I installed Stable Diffusion it was much easier, and I had no issues out of the box. Now I can't even get my GPU to work with Stable Diffusion and the proprietary driver on Debian/Kicksecure.
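On the CUDA version question: the `CUDA Version: 12.2` field in the `nvidia-smi` header reports the highest CUDA version the installed driver supports, not the toolkit installed from the distribution and not the CUDA runtime bundled with the PyTorch wheel. A small sketch for pulling both numbers out of the header so they can be compared with what PyTorch reports; `parse_smi_header` is a hypothetical helper, and the regular expressions assume the header wording shown in the output above.

```python
import re


def parse_smi_header(text):
    """Extract (driver_version, cuda_version) from nvidia-smi header text."""
    driver = re.search(r"Driver Version:\s*([\d.]+)", text)
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", text)
    return (
        driver.group(1) if driver else None,
        cuda.group(1) if cuda else None,
    )


if __name__ == "__main__":
    header = "| NVIDIA-SMI 535.183.01 Driver Version: 535.183.01 CUDA Version: 12.2 |"
    print(parse_smi_header(header))
```

A PyTorch wheel built for a CUDA version no newer than the one in this header should, in principle, be usable with the driver; a mismatch in the other direction would be worth investigating first.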