[Bug]: Very slow when using directml AMD Radeon RX 6800 #477
Comments
Update: I followed the instructions in #149. When trying to run the model, I get the following error: TypeError: 'OnnxRawPipeline' object is not callable. sysinfo: cmd trace: venv "C:\Users[userfolder]\StableDiffusionAI\stable-diffusion-webui-amdgpu - Copy\venv\Scripts\Python.exe" To create a public link, set
|
Hey, currently the best way for AMD users on Windows to run Stable Diffusion is via ZLUDA. ZLUDA has the best performance and compatibility, and it uses less VRAM than DirectML and ONNX. Here are all my AMD guides; try Automatic1111 with ZLUDA: |
DirectML has bad performance. Please try ZLUDA if possible. Your card is supported. |
Your last problem, where OnnxRawPipeline isn't callable, is probably due to SDXL being incompatible with the default ONNXRT pipeline. I was able to use the default SD v1.5 checkpoint with ONNXRT + Olive just fine. The main problem in the logs is likely:
You will probably have to do additional research to try and enable SDXL with ONNXRT if you want to continue using DML |
If you have diffusers==0.30.0, downgrade it to 0.29.x. |
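One way to check whether an environment is affected by the advice above is to compare the installed diffusers version against 0.30. This is a minimal sketch using only the standard library; the helper names `needs_downgrade` and `installed_diffusers_version` are hypothetical, and it assumes a plain `major.minor.patch` version string. Run it inside the web UI's venv so that it inspects the right packages.

```python
from importlib import metadata
from typing import Optional


def needs_downgrade(version: str) -> bool:
    """Return True if a diffusers version string is 0.30 or newer."""
    # Assumes a simple "major.minor[.patch]" string such as "0.29.0".
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (0, 30)


def installed_diffusers_version() -> Optional[str]:
    """Return the installed diffusers version, or None if it is missing."""
    try:
        return metadata.version("diffusers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_diffusers_version()
    if version is None:
        print("diffusers is not installed in this environment")
    elif needs_downgrade(version):
        # Suggested pin for the 0.29.x series mentioned in the comment above.
        print(f'diffusers {version} found; try: pip install "diffusers>=0.29,<0.30"')
    else:
        print(f"diffusers {version} is already below 0.30")
```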
Checklist
What happened?
I was able to install on Windows 10 with an AMD GPU by following the video below, but image generation is very slow: about 1.5 s/it using the default model v1-5-pruned-emaonly.safetensors.
How can I make this more performant? The video below, which uses an older version, goes from ~1 it/s to 19 it/s by enabling ONNX. What is the procedure for doing something similar on this newest version?
Installed as per this video: https://www.youtube.com/watch?v=mKxt0kxD5C0 but I did not apply the changes for ONNX (those features are now deprecated). Most importantly:
The application works and is using the GPU, but it is very slow.
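The figures quoted above mix two units: seconds per iteration (s/it) and iterations per second (it/s), which are reciprocals of each other. The sketch below converts between them so the numbers can be compared directly. The function names and the 20-step generation are assumptions chosen for illustration and are not taken from the report.

```python
def its_per_second(seconds_per_it: float) -> float:
    """Convert a seconds-per-iteration rate into iterations per second."""
    return 1.0 / seconds_per_it


def total_seconds(steps: int, seconds_per_it: float) -> float:
    """Estimate sampling time for a given number of steps."""
    return steps * seconds_per_it


if __name__ == "__main__":
    steps = 20  # hypothetical step count, for illustration only

    # Rate reported in this issue: about 1.5 s/it.
    reported = 1.5
    print(f"{reported} s/it = {its_per_second(reported):.2f} it/s")
    print(f"{steps} steps take about {total_seconds(steps, reported):.1f} s")

    # Rate shown in the video with ONNX enabled: about 19 it/s.
    video = 1.0 / 19.0
    print(f"{steps} steps at 19 it/s take about {total_seconds(steps, video):.1f} s")
```

On these numbers, the reported rate is roughly a 28x gap from the 19 it/s shown in the video.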
System specs:
Application settings:
Steps to reproduce the problem
Install the app from scratch and follow the video
What should have happened?
It should be more performant with a 16 GB Radeon 6800
What browsers do you use to access the UI ?
Mozilla Firefox
Sysinfo
{
"Platform": "Windows-10-10.0.19045-SP0",
"Python": "3.10.6",
"Version": "v1.9.3-amd-26-g50d3cf78",
"Commit": "50d3cf7852cfe07bd562440246202d8925be98a4",
"Script path": "C:\\Users\\[folder]\\StableDiffusionAI\\stable-diffusion-webui-amdgpu",
"Data path": "C:\\Users\\[folder]\\StableDiffusionAI\\stable-diffusion-webui-amdgpu",
"Extensions dir": "C:\\Users\\[folder]\\StableDiffusionAI\\stable-diffusion-webui-amdgpu\\extensions",
"Checksum": "643cc8f8ddd09f90c2bfac81e3d6a4ce4ca9cf8d8844c548104e0d9a8ef2b2cc",
"Commandline": [
"launch.py",
"--use-directml"
],
"Torch env info": {
"torch_version": "2.3.1+cpu",
"is_debug_build": "False",
"cuda_compiled_version": null,
"gcc_version": "(MinGW-W64 x86_64-ucrt-posix-seh, built by Brecht Sanders) 12.2.0\r",
"clang_version": "14.0.6",
"cmake_version": null,
"os": "Microsoft Windows 10 Home",
"libc_version": "N/A",
"python_version": "3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)] (64-bit runtime)",
"python_platform": "Windows-10-10.0.19045-SP0",
"is_cuda_available": "False",
"cuda_runtime_version": null,
"cuda_module_loading": "N/A",
"nvidia_driver_version": null,
"nvidia_gpu_models": null,
"cudnn_version": null,
"pip_version": "pip3",
"pip_packages": [
"numpy==1.26.2",
"onnx==1.16.1",
"onnxruntime==1.18.0",
"onnxruntime-directml==1.18.0",
"open-clip-torch==2.20.0",
"pytorch-lightning==1.9.4",
"torch==2.3.1",
"torch-directml==0.2.2.dev240614",
"torchdiffeq==0.2.3",
"torchmetrics==1.4.0.post0",
"torchsde==0.2.6",
"torchvision==0.18.1"
],
"conda_packages": null,
"hip_compiled_version": "N/A",
"hip_runtime_version": "N/A",
"miopen_runtime_version": "N/A",
"caching_allocator_config": "",
"is_xnnpack_available": "True",
"cpu_info": [
"Architecture=9",
"CurrentClockSpeed=4401",
"DeviceID=CPU0",
"Family=107",
"L2CacheSize=4096",
"L2CacheSpeed=",
"Manufacturer=AuthenticAMD",
"MaxClockSpeed=4401",
"Name=AMD Ryzen 7 5800X 8-Core Processor ",
"ProcessorType=3",
"Revision=8448"
]
},
"Exceptions": [],
"CPU": {
"model": "AMD64 Family 25 Model 33 Stepping 0, AuthenticAMD",
"count logical": 16,
"count physical": 8
},
"RAM": {
"total": "128GB",
"used": "24GB",
"free": "104GB"
},
"GPU": {
"model": "AMD Radeon RX 6800",
"total_memory": 1509261312
},
"Extensions": [],
"Inactive extensions": [],
"Environment": {
"COMMANDLINE_ARGS": "--use-directml",
"GRADIO_ANALYTICS_ENABLED": "False"
},
"Config": {
"ldsr_steps": 100,
"ldsr_cached": false,
"SCUNET_tile": 256,
"SCUNET_tile_overlap": 8,
"SWIN_tile": 192,
"SWIN_tile_overlap": 8,
"SWIN_torch_compile": false,
"hypertile_enable_unet": false,
"hypertile_enable_unet_secondpass": false,
"hypertile_max_depth_unet": 3,
"hypertile_max_tile_unet": 256,
"hypertile_swap_size_unet": 3,
"hypertile_enable_vae": false,
"hypertile_max_depth_vae": 3,
"hypertile_max_tile_vae": 128,
"hypertile_swap_size_vae": 3,
"sd_model_checkpoint": "v1-5-pruned-emaonly.safetensors [6ce0161689]",
"sd_checkpoint_hash": "6ce0161689b3853acaa03779ec93eafe75a02f4ced659bee03f50797806fa2fa"
},
"Startup": {
"total": 11.81886911392212,
"records": {
"initial startup": 0.027023792266845703,
"prepare environment/checks": 0.04403877258300781,
"prepare environment/git version info": 0.12510991096496582,
"prepare environment/clone repositores": 0.2562258243560791,
"prepare environment/run extensions installers": 0.004002571105957031,
"prepare environment": 16.147207736968994,
"launcher": 0.0010008811950683594,
"import torch": 0.0,
"import gradio": 0.0010008811950683594,
"setup paths": 0.0,
"import ldm": 0.0030031204223632812,
"import sgm": 0.0,
"initialize shared": 1.5313475131988525,
"other imports": 0.0590512752532959,
"opts onchange": 0.0,
"setup SD model": 0.0,
"setup codeformer": 0.0020020008087158203,
"setup gfpgan": 0.01501321792602539,
"set samplers": 0.0,
"list extensions": 0.0030024051666259766,
"restore config state file": 0.0,
"list SD models": 0.03149867057800293,
"list localizations": 0.0010008811950683594,
"load scripts/custom_code.py": 0.006005525588989258,
"load scripts/img2imgalt.py": 0.001001119613647461,
"load scripts/loopback.py": 0.0,
"load scripts/outpainting_mk_2.py": 0.0010008811950683594,
"load scripts/poor_mans_outpainting.py": 0.0,
"load scripts/postprocessing_codeformer.py": 0.0,
"load scripts/postprocessing_gfpgan.py": 0.0010004043579101562,
"load scripts/postprocessing_upscale.py": 0.0,
"load scripts/prompt_matrix.py": 0.0,
"load scripts/prompts_from_file.py": 0.0010004043579101562,
"load scripts/sd_upscale.py": 0.0,
"load scripts/xyz_grid.py": 0.002002239227294922,
"load scripts/ldsr_model.py": 0.5134520530700684,
"load scripts/lora_script.py": 0.1401228904724121,
"load scripts/scunet_model.py": 0.023020267486572266,
"load scripts/swinir_model.py": 0.01801609992980957,
"load scripts/hotkey_config.py": 0.002001523971557617,
"load scripts/extra_options_section.py": 0.002002239227294922,
"load scripts/hypertile_script.py": 0.050043582916259766,
"load scripts/hypertile_xyz.py": 0.0,
"load scripts/postprocessing_autosized_crop.py": 0.0020024776458740234,
"load scripts/postprocessing_caption.py": 0.002001047134399414,
"load scripts/postprocessing_create_flipped_copies.py": 0.0010013580322265625,
"load scripts/postprocessing_focal_crop.py": 0.05704951286315918,
"load scripts/postprocessing_split_oversized.py": 0.0010013580322265625,
"load scripts/soft_inpainting.py": 0.002001523971557617,
"load scripts/comments.py": 0.022019386291503906,
"load scripts/refiner.py": 0.0020020008087158203,
"load scripts/sampler.py": 0.0010008811950683594,
"load scripts/seed.py": 0.002001523971557617,
"load scripts": 0.852750301361084,
"load upscalers": 0.007006168365478516,
"refresh VAE": 0.0020017623901367188,
"refresh textual inversion templates": 0.0,
"scripts list_optimizers": 0.0020017623901367188,
"scripts list_unets": 0.0,
"reload hypernetworks": 0.0010006427764892578,
"initialize extra networks": 0.018016576766967773,
"scripts before_ui_callback": 0.0020012855529785156,
"create ui": 0.3503081798553467,
"gradio launch": 0.6355588436126709,
"add APIs": 0.0070056915283203125,
"app_started_callback/lora_script.py": 0.0,
"app_started_callback": 0.0
}
},
"Packages": [
"accelerate==0.21.0",
"aenum==3.1.15",
"aiofiles==23.2.1",
"aiohttp==3.9.5",
"aiosignal==1.3.1",
"alembic==1.13.1",
"altair==5.3.0",
"annotated-types==0.7.0",
"antlr4-python3-runtime==4.9.3",
"anyio==3.7.1",
"async-timeout==4.0.3",
"attrs==23.2.0",
"blendmodes==2022",
"certifi==2024.6.2",
"charset-normalizer==3.3.2",
"clean-fid==0.1.35",
"click==8.1.7",
"clip==1.0",
"colorama==0.4.6",
"coloredlogs==15.0.1",
"colorlog==6.8.2",
"contourpy==1.2.1",
"cycler==0.12.1",
"datasets==2.14.4",
"deprecation==2.1.0",
"diffusers==0.29.0",
"dill==0.3.7",
"diskcache==5.6.3",
"dnspython==2.6.1",
"einops==0.4.1",
"email-validator==2.2.0",
"exceptiongroup==1.2.1",
"facexlib==0.3.0",
"fastapi-cli==0.0.4",
"fastapi==0.94.0",
"ffmpy==0.3.2",
"filelock==3.15.3",
"filterpy==1.4.5",
"flatbuffers==24.3.25",
"fonttools==4.53.0",
"frozenlist==1.4.1",
"fsspec==2024.6.0",
"ftfy==6.2.0",
"gitdb==4.0.11",
"gitpython==3.1.32",
"gradio-client==0.5.0",
"gradio==3.41.2",
"greenlet==3.0.3",
"h11==0.12.0",
"httpcore==0.15.0",
"httptools==0.6.1",
"httpx==0.24.1",
"huggingface-hub==0.23.4",
"humanfriendly==10.0",
"idna==3.7",
"imageio==2.34.1",
"importlib-metadata==7.2.0",
"importlib-resources==6.4.0",
"inflection==0.5.1",
"intel-openmp==2021.4.0",
"jinja2==3.1.4",
"jsonmerge==1.8.0",
"jsonschema-specifications==2023.12.1",
"jsonschema==4.22.0",
"kiwisolver==1.4.5",
"kornia-rs==0.1.3",
"kornia==0.6.7",
"lark==1.1.2",
"lazy-loader==0.4",
"lightning-utilities==0.11.2",
"llvmlite==0.43.0",
"mako==1.3.5",
"markdown-it-py==3.0.0",
"markupsafe==2.1.5",
"matplotlib==3.9.0",
"mdurl==0.1.2",
"mkl==2021.4.0",
"mpmath==1.3.0",
"multidict==6.0.5",
"multiprocess==0.70.15",
"networkx==3.3",
"numba==0.60.0",
"numpy==1.26.2",
"olive-ai==0.6.2",
"omegaconf==2.2.3",
"onnx==1.16.1",
"onnxruntime-directml==1.18.0",
"onnxruntime==1.18.0",
"open-clip-torch==2.20.0",
"opencv-python==4.10.0.84",
"optimum==1.20.0",
"optuna==3.6.1",
"orjson==3.10.5",
"packaging==24.1",
"pandas==2.2.2",
"piexif==1.1.3",
"pillow-avif-plugin==1.4.3",
"pillow==9.5.0",
"pip==22.2.1",
"protobuf==3.20.3",
"psutil==5.9.5",
"pyarrow==16.1.0",
"pydantic-core==2.18.4",
"pydantic==1.10.17",
"pydub==0.25.1",
"pygments==2.18.0",
"pyparsing==3.1.2",
"pyreadline3==3.4.1",
"python-dateutil==2.9.0.post0",
"python-dotenv==1.0.1",
"python-multipart==0.0.9",
"pytorch-lightning==1.9.4",
"pytz==2024.1",
"pywavelets==1.6.0",
"pyyaml==6.0.1",
"referencing==0.35.1",
"regex==2024.5.15",
"requests==2.32.3",
"resize-right==0.0.2",
"rich==13.7.1",
"rpds-py==0.18.1",
"safetensors==0.4.2",
"scikit-image==0.21.0",
"scipy==1.13.1",
"semantic-version==2.10.0",
"sentencepiece==0.2.0",
"setuptools==69.5.1",
"shellingham==1.5.4",
"six==1.16.0",
"smmap==5.0.1",
"sniffio==1.3.1",
"spandrel==0.1.6",
"sqlalchemy==2.0.31",
"starlette==0.26.1",
"sympy==1.12.1",
"tbb==2021.13.0",
"tifffile==2024.6.18",
"timm==1.0.7",
"tokenizers==0.13.3",
"tomesd==0.1.3",
"toolz==0.12.1",
"torch-directml==0.2.2.dev240614",
"torch==2.3.1",
"torchdiffeq==0.2.3",
"torchmetrics==1.4.0.post0",
"torchsde==0.2.6",
"torchvision==0.18.1",
"tqdm==4.66.4",
"trampoline==0.1.2",
"transformers==4.30.2",
"typer==0.12.3",
"typing-extensions==4.12.2",
"tzdata==2024.1",
"ujson==5.10.0",
"urllib3==2.2.2",
"uvicorn==0.30.1",
"watchfiles==0.22.0",
"wcwidth==0.2.13",
"websockets==11.0.3",
"xxhash==3.4.1",
"yarl==1.9.4",
"zipp==3.19.2"
]
}
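A dump this long is easier to triage when the few fields relevant to the performance question are pulled out. The sketch below does that for a small excerpt copied from the sysinfo above; the `summarize` helper and the choice of fields are assumptions for illustration, not part of the web UI. Note that the reported `total_memory` comes to roughly 1.4 GiB, which does not match the 16 GB card described in the report; whether that is a reporting quirk or a real limit is not established here.

```python
import json

# Excerpt of the sysinfo dump above (values copied verbatim).
SYSINFO_EXCERPT = """
{
  "Commandline": ["launch.py", "--use-directml"],
  "Torch env info": {"torch_version": "2.3.1+cpu", "is_cuda_available": "False"},
  "GPU": {"model": "AMD Radeon RX 6800", "total_memory": 1509261312}
}
"""


def summarize(sysinfo: dict) -> dict:
    """Extract the backend, torch build, and GPU memory from a sysinfo dict."""
    gpu = sysinfo["GPU"]
    return {
        "uses_directml": "--use-directml" in sysinfo["Commandline"],
        "torch_version": sysinfo["Torch env info"]["torch_version"],
        "gpu_model": gpu["model"],
        # total_memory is treated as a byte count; 2**30 bytes per GiB.
        "gpu_memory_gib": round(gpu["total_memory"] / 2**30, 2),
    }


summary = summarize(json.loads(SYSINFO_EXCERPT))

if __name__ == "__main__":
    for key, value in summary.items():
        print(f"{key}: {value}")
```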
Console logs
Additional information
GPU driver has been updated recently to v24.4.1