Working on Windows + AMD #13

TangrisJones · 2023-02-25T00:54:26Z

TangrisJones
Feb 25, 2023

Hey, thanks for this awresome web UI. Got this thing to work with AMD (tested so far on txt2img & img2img).

Thank you very much! 👍

gatttahaveit1 · 2023-02-25T03:04:22Z

gatttahaveit1
Feb 25, 2023

"make sure you have the modified repositories in stable-diffusion-webui-directml/repositories/:"
I don't know what this means?? What's a modified repository?

"Place any stable diffusion checkpoint (ckpt or safetensor) in the models/Stable-diffusion directory"
Also don't know what a stable diffusion checkpoint or safetensor is either.

5 replies

dezenhando Feb 25, 2023

"Place any stable diffusion checkpoint (ckpt or safetensor) in the models/Stable-diffusion directory"
Also don't know what a stable diffusion checkpoint or safetensor is either.

That´s the MODEL, you have to download a MODEL and copy it to your model/stable diffusion file

Reighnvhasta Feb 25, 2023

Hi gatttahaveit1 ---

I just figured this out. Checkout the folder tree under this link: https://github.com/lshqqytiger/stable-diffusion-webui-directml/tree/master/repositories

In your computer, when you download the files from stable-diffusion-webui-directml, the "repositories" folder is empty. And in the above link, there are 2 folders (namely k-diffusion and stable-diffusion-stability-ai) inside. Download these 2 folders and put them in the folder on your computer, this shall allow webui.bat to run properly.

I am running on my Windows system with AMD card.

gatttahaveit1 Feb 25, 2023

Thanks. I thought that I had done that correctly. I downloaded the 2 files. Changed their names, like the directions said, and moved the folders into the repositories folder. below is my repositories folder.

dezenhando Feb 25, 2023

Did you manage to make it work for you? I am having problems when I try to generate any image it starts but gives me that error, damn it

dezenhando Feb 25, 2023

Thanks. I thought that I had done that correctly. I downloaded the 2 files. Changed their names, like the directions said, and moved the folders into the repositories folder. below is my repositories folder.

I guess this is correct, just like mine

gatttahaveit1 · 2023-02-25T04:00:11Z

gatttahaveit1
Feb 25, 2023

This is what I get.

2 replies

dezenhando Feb 25, 2023

When you downloaded the files from GITHUB, did you "git clone" the
https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
or
https://github.com/lshqqytiger/stable-diffusion-webui-directml

I downloaded the WEBUI-DIRECTML, maybe that´s why I am having problems generating images?

gatttahaveit1 Feb 25, 2023

the first one was my first install. it would work until i tried to generate an image and would then give the error code "LayerNormKernelImpl" not implemented for 'Half', because it was designed for Linx, not AMD.
the second one was my next install and it gives me the error code that it can't find stable diffusion in the directories.

matthewdeacon74 · 2023-02-25T13:41:56Z

matthewdeacon74
Feb 25, 2023

The error is for line 26 in modules\paths.py, and it shows a file path that's not valid: all those double "\" should be single "\" instead, and it should end with "\stable-diffusion-stability-ai" instead of "/stable-diffusion-stability-ai"
Basically, the folder markers are wrong - either doubled or backwards for Windows. This seems like a bug, but I'm not a Python developer so I don't know how to fix it.

1 reply

gatttahaveit1 Feb 25, 2023

I was able to fix it by changing the folder name in repository to sable-diffusion-stability-ai. but now I have a not enough GPU memory error.

Nhiira · 2023-02-25T14:46:28Z

Nhiira
Feb 25, 2023

I have 6600 XT but trying to use CPU. Any ideas how to fix that?

venv "C:\stable-diffusion-webui-directml-master\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Commit hash:
Installing requirements for Web UI
Launching Web UI with arguments:
Interrogate will be fallen back to cpu. Because DirectML device does not support it.
Warning: caught exception 'Torch not compiled with CUDA enabled', memory monitor disabled
No module 'xformers'. Proceeding without it.
Loading weights [fe4efff1e1] from C:\stable-diffusion-webui-directml-master\models\Stable-diffusion\sd-v1-4.ckpt
Creating model from config: C:\stable-diffusion-webui-directml-master\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying cross attention optimization (InvokeAI).
Textual inversion embeddings loaded(0):
Model loaded in 4.5s (load weights from disk: 1.2s, create model: 0.5s, apply weights to model: 0.6s, apply half(): 0.6s, move model to device: 1.6s).
Running on local URL: http://127.0.0.1:7860/

6 replies

lshqqytiger Feb 26, 2023
Maintainer

Only interrogation is fallen back to cpu. Image generation uses GPU.

Nhiira Feb 26, 2023

Thanks a lot. I always thought that line causes all the problem.

I added "set COMMANDLINE_ARGS=--opt-sub-quad-attention --medvram --disable-nan-check --no-half --precision full" it helps avoid memory related problems.

Dom-pseudo-engineer Feb 27, 2023

Cool, thank you @lshqqytiger. I've been using it and I'm hitting major memory issues. Going to try what @Nhiira suggested. Seems like there's a memory leak with this right now though?

Dom-pseudo-engineer Feb 27, 2023

I'm not noticing a difference with "set COMMANDLINE_ARGS=--opt-sub-quad-attention --medvram --disable-nan-check --no-half --precision full"

Closing and re-opening releases the memory and you can keep going.

lshqqytiger Feb 27, 2023
Maintainer

It is a DirectML issue. I keep finding a way to make it release memory.

HenryReg11 · 2023-02-25T17:56:48Z

HenryReg11
Feb 25, 2023

How to solve this problem?

7 replies

Miraihi Feb 25, 2023

After set COMMANDLINE_ARGS= in webui-user.bat. Open it with a notepad

gatttahaveit1 Feb 25, 2023

Thanks for that, but it still doesn't work.

Miraihi Feb 25, 2023

Works for my AMD RX580 (8 Gig). If you have any less then use --lowvram

gatttahaveit1 Feb 25, 2023

I've only got 4 gig, tried that , still no luck.

Miraihi Feb 25, 2023

Well, you can only wait until directml developers improve memory management then.

Enferlain · 2023-02-25T22:49:24Z

Enferlain
Feb 25, 2023

Any fix to "interrogate will be fallen back to cpu"?

4 replies

Miraihi Feb 25, 2023

Interrogate basically doesn't work at all, cpu or not. Use tagger by toriato, it's practically deepbooru interrogate.

Nhiira Feb 25, 2023

Issue is people with same graphics cards like 6600 XT, 6800 XT, 7900 XTX able to use with Windows without fallen back to cpu warning. But we get it with same cards. Something must be triggering it.

Miraihi Feb 25, 2023

Mine is RX580 (8Gb). Deepbooru spews out the same tags every time no matter the image. CLIP just hangs at checking the model.

Enferlain Feb 26, 2023

Ah. So it works my gpu is just dog shit. Unlucky

RBR8man · 2023-02-26T04:24:44Z

RBR8man
Feb 26, 2023

The inpaint function isnt working, it just creates this blur
https://i.imgur.com/R3hPlLj.png
working on a 6800 XT

4 replies

Miraihi Feb 26, 2023

Try these arguments
COMMANDLINE_ARGS=--medvram --precision full --no-half --no-half-vae --opt-sub-quad-attention --opt-split-attention-v1 --disable-nan-check
Worked for me.

RBR8man Feb 26, 2023

thank you this worked

FelipeLujan Feb 27, 2023

Edit, I commented too soon.Inpainting is still not working for me.
With masked content on "fill" it generates a blurred region where the mask was; with Masked content on original or latent noise, the output image is the same as the input.
https://imgur.com/a/fdkm0Ul

I'm using these args
set COMMANDLINE_ARGS=--medvram --precision full --no-half --no-half-vae --opt-sub-quad-attention --opt-split-attention-v1 --disable-nan-check

Miraihi Feb 27, 2023

Try different models I guess. Or try removing some arguments and testing. --precision full is not always needed. --no-half is also contentious.

NicoNicoNico123 · 2023-02-28T15:37:45Z

NicoNicoNico123
Feb 28, 2023

Any tutorial of setup this enviroment? when I start it show error : import torch; assert torch.cuda.is_available(), 'Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check'")

I have no clue to work with AMD GPU setup

1 reply

Miraihi Feb 28, 2023

Literally here

Enferlain · 2023-02-28T15:55:04Z

Enferlain
Feb 28, 2023

Works fine, but I've been spoiled by the colab cards so went back to using it. It's a slight inconvenience having to swap accounts but better than waiting 5x longer for the same result. Wish amd did something more for compute on consumer cards

0 replies

amukreksa · 2023-03-08T22:56:01Z

amukreksa
Mar 8, 2023

what i need to do when i get this error message?
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 407, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\uvicorn\middleware\proxy_headers.py", line 78, in call
return await self.app(scope, receive, send)
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\fastapi\applications.py", line 271, in call
await super().call(scope, receive, send)
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\starlette\applications.py", line 125, in call
await self.middleware_stack(scope, receive, send)
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\starlette\middleware\errors.py", line 184, in call
raise exc
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\starlette\middleware\errors.py", line 162, in call
await self.app(scope, receive, _send)
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\starlette\middleware\gzip.py", line 24, in call
await responder(scope, receive, send)
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\starlette\middleware\gzip.py", line 44, in call
await self.app(scope, receive, self.send_with_gzip)
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\starlette\middleware\exceptions.py", line 79, in call
raise exc
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\starlette\middleware\exceptions.py", line 68, in call
await self.app(scope, receive, sender)
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\fastapi\middleware\asyncexitstack.py", line 21, in call
raise e
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\fastapi\middleware\asyncexitstack.py", line 18, in call
await self.app(scope, receive, send)
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\starlette\routing.py", line 706, in call
await route.handle(scope, receive, send)
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\starlette\routing.py", line 276, in handle
await self.app(scope, receive, send)
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\starlette\routing.py", line 66, in app
response = await func(request)
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\fastapi\routing.py", line 237, in app
raw_response = await run_endpoint_function(
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\fastapi\routing.py", line 163, in run_endpoint_function
return await dependant.call(**values)
File "C:\Users\dwiat\OneDrive\Documents#AI\stable diffusion\stable-diffusion-webui-directml\venv\lib\site-packages\gradio\routes.py", line 286, in file
raise ValueError(
ValueError: File cannot be fetched: C:/Users/dwiat/OneDrive/Documents/. All files must contained within the Gradio python app working directory, or be a temp file created by the Gradio python app.

0 replies

wujiji55 · 2023-03-23T14:04:44Z

wujiji55
Mar 23, 2023

how to solve this problem:Device type PRIVATEUSEONE is not supported for torch.Generator() api.
my GPU is RX6650XT THXXXXXXX

1 reply

lshqqytiger Mar 30, 2023
Maintainer

How can I reproduce that error?
Please attach steps to reproduce it.
Or, you can just attach full error traceback.

OldPixelReaper · 2023-03-30T00:31:11Z

OldPixelReaper
Mar 30, 2023

Hello, with me WEBUI has actually the last few weeks except for the known bugs everything works quite well.
Today I did a git pull on ae337fa. Now I can almost no longer use loras, I get only black images with a few exceptions. This was not the case before.
Does anyone know more, is it similar to someone? Sorry for my bad English.

Win11 Prof, up to date, 16 GB RAM, RX 5700 XT, latest Driver

set COMMANDLINE_ARGS= --medvram --precision full --no-half --no-half-vae --opt-split-attention --disable-nan-check
set SAFETENSORS_FAST_GPU=1

big thx and nice day @ all
!! Update!!
I just read that there is a bug at civitai right now and I reloaded all from there today. Maybe it is also related to that. Just read on Discord: There appears to be a widespread bug on CivitAI currently that results in 99% of models downloaded ending up as corrupt.
---final edit:
Ok, it was really up to civitai. I found old models and loras on the SSD, so it worked. Not always the obvious solution is the right solution^^
Now it works well with only:
set COMMANDLINE_ARGS= --medvram --opt-split-attention --disable-nan-check
set SAFETENSORS_FAST_GPU=1

0 replies

Aptronymist · 2023-04-12T17:57:24Z

Aptronymist
Apr 12, 2023

I've got my RX 590 8GB under Windows 10 doing at least batches of 10x 512x512 images, and 4x 768x768, (euler a, 20step) with zero NANs, (didn't try higher batch sizes than 10, couldn't consistently do higher res batches) with the following settings (not sure that clip does anything tbh, but I did read something on HF about not processing it in vram because it was only used at the start so I tried it and guessed at the module name):

set COMMANDLINE_ARGS=--medvram --use-cpu interrogate clip --opt-sub-quad-attention --sub-quad-q-chunk-size 256 --sub-quad-kv-chunk-size 256 --sub-quad-chunk-threshold 70 --disable-nan-check --no-hashing --skip-version-check --no-download-sd-model --skip-torch-cuda-test --no-half-vae

I have patched in the Negative Guidance minimum sigma #AUTOMATIC1111#9177 myself, though it was set at 0 when I tested. Upcast cross attention layer to float32 and Always discard next-to-last sigma were both checked.
This is all without the SAFETENSORS_FAST_GPU or PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:128, though they should help (if they do anything of significance, though I wonder at the split size since that sounds a lot like what subquad is doing too), if I do more testing in that regard I'll post it.

I haven't done enough testing to know if those improve anything with this particular config. I did dig up some documentation and notes on how opt-sub-quad-attention works though, if you put only that main flag, according to the command line arguments file, it defaults to q-chunk-size 1024, kv-chunk-size none (0?), and threshold of 0 (none). From my experimentation, those defaults don't seem that great to me at least. I think threshold should be at least at 70 (it's a percentage of your vram), possibly lower, 60 works too, but not higher imo. The two chunk sizes will affect your speed/ram usage.
Smaller is faster but more wasteful of vram, higher is slower but more efficient.

some notes I scrounged up from code author in PR iirc:
"--sub-quad-chunk-threshold 0 disables unchunked attention and will use very small chunks to keep VRAM usage to a minimum. You can also adjust --sub-quad-q-chunk-size (default is 1024) and try values like 512 or 256 if that didn't help."
"--sub-quad-chunk-threshold receives a number between percentage values (0 to 100) as input, and is an option that uses VRAM up to that percentage."
"However, the VRAM used by the model was not taken into account, so when testing, an out-of-memory error appeared in the default value."

I also had a "conversation" with chatgpt after spitting the sub-quad code at it and doing a Q&A:

The code is implementing a sub-quadratic (sub-quad) attention mechanism in a neural network. The sub-quad attention is used to efficiently compute self-attention in a large sequence of tokens by dividing the sequence into smaller chunks and computing attention only within each chunk, instead of across the entire sequence.

The sub_quad_attention_forward function takes as input a tensor x, which represents the input sequence of tokens, a tensor context, which represents the context for the attention computation, and a mask tensor to mask out certain positions in the input. It first applies a linear transformation to x to generate query tensor q, and linear transformations to context to generate key tensor k and value tensor v. It then divides the query, key, and value tensors into chunks along the sequence dimension, and applies the sub_quad_attention function to compute attention within each chunk. The result is a tensor that represents the output of the attention mechanism.

The sub_quad_attention function takes the query, key, and value tensors as input, along with parameters q_chunk_size, kv_chunk_size, kv_chunk_size_min, and chunk_threshold, which determine the chunk sizes and the threshold for when to use chunking. The function computes attention using efficient dot product attention, with the option to use checkpointing to save memory.

The chunk sizes q_chunk_size and kv_chunk_size determine the size of each chunk for the query and key/value tensors, respectively. The parameter kv_chunk_size_min determines the minimum size for the key/value chunks. The chunk_threshold parameter determines the threshold size in bytes for when to use chunking. If the query, key, and value tensors can fit within the memory threshold, then no chunking is used. Otherwise, the tensors are divided into chunks of size no larger than the specified chunk sizes, and attention is computed within each chunk.

To determine the optimal values for q_chunk_size, kv_chunk_size, and chunk_threshold, you can experiment with different values and measure the performance of the model. You can also use the get_available_vram function to get the amount of available GPU memory and use this to estimate the optimal chunk sizes and chunk threshold.

The sub_quad_attention_forward function appears to be using the sub_quad_attention function for performing sub-quadratic attention. The sub_quad_attention function takes in query, key, and value tensors q, k, and v, along with some optional parameters. The values of the optional parameters q_chunk_size, kv_chunk_size, and chunk_threshold control how the tensors are chunked to fit within available memory during the attention calculation.

q_chunk_size controls the size of each chunk along the sequence length dimension of the query tensor q. kv_chunk_size controls the size of each chunk along the sequence length dimension of the key and value tensors k and v. If kv_chunk_size is not provided, then kv_chunk_size is set to None, and the key and value tensors are chunked to have the same number of tokens as the query tensor chunks. chunk_threshold controls the maximum amount of GPU memory usage, in bytes, for a single attention calculation. If chunk_threshold is not provided, then it is set to int(get_available_vram()*0.7), which uses 70% of the available VRAM.

Given that you have 8GB of VRAM, you can try different values of q_chunk_size, kv_chunk_size, and chunk_threshold to optimize performance. A higher q_chunk_size value will reduce the number of chunks along the query tensor sequence length dimension, but will increase the memory usage of each chunk. A higher kv_chunk_size value will reduce the number of chunks along the key and value tensor sequence length dimension, but will increase the memory usage of each chunk. A higher chunk_threshold value will allow for larger chunks, but will increase the risk of running out of memory.

To get started, you could try setting q_chunk_size and kv_chunk_size to 1024, and chunk_threshold to None or a value between 2GB and 4GB, depending on the size of your input tensors. You can then experiment with adjusting these values to find the optimal settings for your specific use case.

The downside of setting low values for q_chunk_size and kv_chunk_size is that it may result in slower computation due to the increased overhead of splitting and concatenating the tensors.

When processing a batch of data, the inputs are split into chunks and processed independently, and then the results are concatenated back together. If the chunk sizes are too small, there will be more splitting and concatenating operations which could add overhead and slow down the computation.

On the other hand, if the chunk sizes are too large, it could lead to out-of-memory errors, especially if the batch size or the input tensor size is large. Therefore, it is important to find the optimal chunk size that balances the memory usage and the computation speed.

Setting threshold to None means that the full sequence length will always be used for the attention operation, regardless of the input sequence length. This can be useful when processing sequences of varying lengths, as it ensures that all tokens are attended to, even if some of them are padded. However, it can also lead to slower performance and increased memory usage, particularly for very long sequences.

This is just my base config now, however, I'm betting you could optimize a particular startup batch file with different settings based on what you wanted to do, such as for smaller 512x512 in larger batches, or fewer 768x768 or 1024x1024 images.

Oh, I almost forgot. I have force activated (legacy) the AMD SAM (SmartAccess Memory) under Adrenaline/Performance/Tuning. I had to apply a registry file to do so on my card because of the age. Not sure if that's contributing to anything as I have not turned it back off again. However I will say that it straight up caused my system to bluescreen several times after I first activated it with the at-the-time newest (23.3.2 iirc) drivers, I had to roll back to 23.3.1. Haven't tried 23.3.4 yet.
Details here
If you have an older legacy card like mine, make sure you have a reliable way to get into safe mode to turn it back off before you screw with it.

I'm no expert, but I hope this helps, and hopefully, it'll work for you guys too.

5 replies

Miraihi Apr 12, 2023

Thank you @Aptronymist for the enlightening article. So you're running SD without --no-half argument?

Aptronymist Apr 12, 2023

Thank you @Aptronymist for the enlightening article. So you're running SD without --no-half argument?

I'm not sure I'd call it an article, but it's working pretty damn well and I have scrounged around a lot to try to mitigate the crappy performance AMD users such as myself get. As a non-programmer, this is as good as I can do.
But yes, whenever I get NANs (sometimes all nans) I just go to the Settings/Actions and unload the checkpoint, then reload it. Tends to sort it out for me, but of course... YMMV.

Aptronymist Apr 12, 2023

Oh, I should also note that I did do some relatively minor overclocking on my card, but mostly I jacked up the fan to make sure it didn't get too hot. It stays around 60C or so when it's hard at work. I'd recommend doing the same. This might be a little extreme but I'd rather have the fan going than my GPU getting too hot.

Miraihi Apr 13, 2023

Might me worth downvolting so it cools down even more. Even more. I've downvolted my RX580 from 1150 to 1100 peak to no ill effects.

Aptronymist Apr 13, 2023

Might me worth downvolting so it cools down even more. Even more. I've downvolted my RX580 from 1150 to 1100 peak to no ill effects.

I have not actually messed with voltage at all, other than cranking the "Power Limit" up to 30%, I don't find heat to be an issue with my highly aggressive fan settings but I'll take a look at it. That's one aspect of overclocking I have not really kept up with so I would have to read up on it first, so thanks for the suggestion!

aJIekc3D · 2024-01-09T14:44:38Z

aJIekc3D
Jan 9, 2024

I installed everything according to the instructions, disabled the CUDA check, but the calculations are done using the CPU, not the GPU (RX 578). Maybe I did something wrong?

2 replies

lshqqytiger Jan 11, 2024
Maintainer

Do you have --use-directml in your commandline arguments?

aJIekc3D Jan 17, 2024

Sorry for the late reply, I solved the problem. I tried using --use-directml, but it still didn't work, after a couple of days I came across the advice to add "torch-directml" to the requirements_versions file and everything worked, but after generating one or two images when using --medvram, an error about lack of memory appears, so I have to use --lowvram

Ming2k8-Coder · 2024-01-10T16:06:59Z

Ming2k8-Coder
Jan 10, 2024

what your commandline parameter in webui-user.bat Vào Th 3, 9 thg 1, 2024 vào lúc 21:45 aJIekc3D ***@***.***> đã viết:

…

I installed everything according to the instructions, disabled the CUDA check, but the calculations are done using the CPU, not the GPU (RX 578). Maybe I did something wrong? — Reply to this email directly, view it on GitHub <#13 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/A7VS7K5PMAOGB2FPBXPCFULYNVJW5AVCNFSM6AAAAAAVHQDR42VHI2DSMVQWIX3LMV43SRDJONRXK43TNFXW4Q3PNVWWK3TUHM4DANRXGUZTG> . You are receiving this because you are subscribed to this thread.Message ID: <lshqqytiger/stable-diffusion-webui-directml/repo-discussions/13/comments/8067533 @github.com>

0 replies

Working on Windows + AMD #13

Replies: 15 comments · 38 replies

lshqqytiger Feb 26, 2023 Maintainer

lshqqytiger Feb 27, 2023 Maintainer

lshqqytiger Mar 30, 2023 Maintainer

set COMMANDLINE_ARGS= --medvram --precision full --no-half --no-half-vae --opt-split-attention --disable-nan-check set SAFETENSORS_FAST_GPU=1

lshqqytiger Jan 11, 2024 Maintainer

Replies: 15 comments 38 replies

lshqqytiger Feb 26, 2023
Maintainer

lshqqytiger Feb 27, 2023
Maintainer

lshqqytiger Mar 30, 2023
Maintainer

set COMMANDLINE_ARGS= --medvram --precision full --no-half --no-half-vae --opt-split-attention --disable-nan-check
set SAFETENSORS_FAST_GPU=1

lshqqytiger Jan 11, 2024
Maintainer