Error while training with jjtolton's fork #65

Open
Gushousekai195 opened this issue Feb 11, 2023 · 15 comments

Comments

@Gushousekai195

Gushousekai195 commented Feb 11, 2023

@jjtolton

Traceback (most recent call last):
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/extensions/DreamArtist-sd-webui-extension/scripts/dream_artist/ui.py", line 32, in train_embedding
    embedding, filename = dream_artist.cptuning.train_embedding(*args)
  File "/content/gdrive/MyDrive/sd/stable-diffusion-webui/extensions/DreamArtist-sd-webui-extension/scripts/dream_artist/cptuning.py", line 559, in train_embedding
    loss.backward()
  File "/usr/local/lib/python3.8/dist-packages/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/function.py", line 267, in apply
    return user_fn(self, *args)
  File "/content/gdrive/MyDrive/sd/stablediffusion/ldm/modules/diffusionmodules/util.py", line 142, in backward
    input_grads = torch.autograd.grad(
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py", line 300, in grad
    return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: One of the differentiated Tensors does not require grad
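
For readers unfamiliar with this message, here is a minimal sketch of how PyTorch produces it. It is not the DreamArtist or `ldm` code: the tensor names are made up for illustration, and the idea that the `backward` in `util.py` passes frozen (non-trainable) tensors to `torch.autograd.grad` is only an assumption based on the traceback above.

```python
import torch

# Hypothetical stand-ins: one trainable tensor and one frozen tensor.
trainable = torch.ones(3, requires_grad=True)
frozen = torch.ones(3)  # requires_grad defaults to False

loss = (trainable * frozen).sum()

# Asking for a gradient with respect to a tensor that does not require grad
# raises the same RuntimeError seen in the traceback.
try:
    torch.autograd.grad(loss, [trainable, frozen])
    message = ""
except RuntimeError as err:
    message = str(err)

# One possible workaround (an assumption, not the fork's actual fix):
# only differentiate with respect to tensors that require grad.
inputs = [t for t in (trainable, frozen) if t.requires_grad]
grads = torch.autograd.grad(loss, inputs)
```

Filtering the inputs avoids the exception, but whether that is appropriate inside a checkpointed backward pass depends on which tensors the training loop actually expects gradients for.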

@Gushousekai195
Author

Also, there's an issue with starting the UI in Colab:

2023-02-11 21:35:49.693244: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-11 21:35:50.849376: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.8/dist-packages/cv2/../../lib64:/usr/lib64-nvidia
2023-02-11 21:35:50.849549: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.8/dist-packages/cv2/../../lib64:/usr/lib64-nvidia
2023-02-11 21:35:50.849572: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.

@jjtolton
Contributor

Do you have more complete steps you used to set up the Colab so I can test?

@Gushousekai195
Author

> Do you have more complete steps you used to set up the Colab so I can test?

  1. Use https://colab.research.google.com/github/TheLastBen/fast-stable-diffusion/blob/main/fast_stable_diffusion_AUTOMATIC1111.ipynb
  2. Install DreamArtist from your commit.
  3. Set "Save an image to log directory every N steps" to a number other than 0
  4. Train Embedding

@jjtolton
Contributor

@Gushousekai195 I notice you're using Fast Stable Diffusion -- I don't really have the capacity to support anything besides the standard version of Automatic1111's web UI, sorry about that. If it doesn't work with standard SD on Colab I'll take a look at it.

@petersfield1

petersfield1 commented Feb 22, 2023

I'm getting a similar error on the standard Auto1111 UI:

2023-02-22T19:11:06.323049482Z Applying xformers cross attention optimization.
2023-02-22T19:11:06.323171693Z Traceback (most recent call last):
2023-02-22T19:11:06.323190980Z   File "/workspace/stable-diffusion-webui/modules/call_queue.py", line 56, in f
2023-02-22T19:11:06.323193435Z     res = list(func(*args, **kwargs))
2023-02-22T19:11:06.323195109Z   File "/workspace/stable-diffusion-webui/modules/call_queue.py", line 37, in f
2023-02-22T19:11:06.323196902Z     res = func(*args, **kwargs)
2023-02-22T19:11:06.323198645Z   File "/workspace/stable-diffusion-webui/extensions/DreamArtist-sd-webui-extension/scripts/dream_artist/ui.py", line 32, in train_embedding
2023-02-22T19:11:06.323200468Z     embedding, filename = dream_artist.cptuning.train_embedding(*args)
2023-02-22T19:11:06.323202122Z   File "/workspace/stable-diffusion-webui/extensions/DreamArtist-sd-webui-extension/scripts/dream_artist/cptuning.py", line 540, in train_embedding
2023-02-22T19:11:06.323203815Z     x_samples_ddim = shared.sd_model.decode_first_stage.__wrapped__(shared.sd_model, output[2]) # forward with grad
2023-02-22T19:11:06.323205518Z AttributeError: 'function' object has no attribute '__wrapped__'
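
A minimal sketch of how this kind of `AttributeError` can arise, assuming the attribute in question is Python's `__wrapped__` (the double underscores appear to have been lost when the log was rendered). `functools.wraps` stores the undecorated function under `__wrapped__`; if something later replaces the method with a plain function, that attribute is gone. The names `no_grad_like`, `Model`, `replacement`, and `call_undecorated` are hypothetical and are not taken from the webui or DreamArtist code.

```python
import functools


def no_grad_like(func):
    """Stand-in for a wrapping decorator; functools.wraps sets wrapper.__wrapped__ = func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


class Model:
    @no_grad_like
    def decode_first_stage(self, z):
        return z * 2


def replacement(self, z):
    """A plain function, as might be installed by a later patch; it has no __wrapped__."""
    return z * 2


def call_undecorated(model, z):
    """Call the undecorated function if one is recorded, else fall back to the method itself."""
    method = type(model).decode_first_stage
    original = getattr(method, "__wrapped__", method)
    return original(model, z)


had_wrapped_before = hasattr(Model.decode_first_stage, "__wrapped__")
Model.decode_first_stage = replacement  # simulate the method being replaced
has_wrapped_after = hasattr(Model.decode_first_stage, "__wrapped__")
result = call_undecorated(Model(), 3)
```

The `getattr` fallback keeps the call from crashing, but in a real training loop the fallback path may not behave like the undecorated function with respect to gradients, so it would need checking against the actual model code.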

@jjtolton
Contributor

@petersfield1 Thanks. Is this in cloud deploy or local?

@jjtolton
Contributor

Ah @petersfield1 that's a different error. You use reconstruction or something, right?

@petersfield1

Yeah, I was using reconstruction, and it's in the cloud on Runpod's stack. Unfortunately I don't have the specs to run it locally.

@jjtolton
Contributor

@petersfield1 okay -- can you educate me on what the purpose of reconstruction is, and I'll see if I can fix it? I'm just not sure how or why it would be used.

@petersfield1

Honestly, I have no idea myself. It just happened to be the method that has produced the best results for me in the past. I'm running it now without the box ticked and it appears to be working fine.

@jjtolton
Contributor

@petersfield1 yeah I feel you. DA was broken before I even started working on the fix so I never saw what it did while it was working. If you feel it's something you need, holler back and I'll take a look (but I'm hoping it's not 😅 )

@sneccc

sneccc commented Mar 30, 2023

> @petersfield1 yeah I feel you. DA was broken before I even started working on the fix so I never saw what it did while it was working. If you feel it's something you need, holler back and I'll take a look (but I'm hoping it's not 😅 )

@jjtolton
It seems not to work with the latest versions of the webui; the train_embedding code is very different. Also, it shouldn't work with 2.1, right? I tried digging into the code, but there are so many errors.

image

@jjtolton
Contributor

jjtolton commented Apr 4, 2023

Bummer. Taking a look.

@jjtolton
Contributor

jjtolton commented Apr 4, 2023

Good news/bad news.

> > @petersfield1 yeah I feel you. DA was broken before I even started working on the fix so I never saw what it did while it was working. If you feel it's something you need, holler back and I'll take a look (but I'm hoping it's not 😅 )
>
> @jjtolton It seems not to work with the latest versions of the webui; the train_embedding code is very different. Also, it shouldn't work with 2.1, right? I tried digging into the code, but there are so many errors.

I did a clean install of WebUI commit 22bcc7be428c94e9408f589966c2040187245d81, installed my branch/fork of DA to extensions, and used one model with the following settings:
image

And it worked flawlessly. So I cannot isolate this bug to DA on my fork itself. There are probably other compounding issues, but it does work with a clean install in isolation. Sorry my news isn't better.

@jjtolton
Contributor

jjtolton commented Apr 4, 2023

> > @petersfield1 yeah I feel you. DA was broken before I even started working on the fix so I never saw what it did while it was working. If you feel it's something you need, holler back and I'll take a look (but I'm hoping it's not 😅 )
>
> @jjtolton It seems not to work with the latest versions of the webui; the train_embedding code is very different. Also, it shouldn't work with 2.1, right? I tried digging into the code, but there are so many errors.

This is an SD 2.1 bug. I don't know enough right now to say whether there are fundamental differences that make 2.1 incompatible with DA. You can partially bypass this by creating the embeddings with the Train tab, but you'll crash later since the parameterization type "v" is not accounted for by DA. I'm not sure what to make of this particular line of code:

image

Or if that is even the correct methodology.
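
For context on the "v" parameterization mentioned above, here is a rough sketch of how a training loop might pick its regression target per parameterization type. This is not DA's code and not the line from the screenshot. The function name `training_target` and its scalar inputs are hypothetical; the "v" formula is the standard velocity target, v = sqrt(alpha_bar) * noise - sqrt(1 - alpha_bar) * x0, and whether DA should use it in exactly this way is an open question in this thread.

```python
import math


def training_target(parameterization, x0, noise, alpha_bar):
    """Return the value the model is trained to predict (scalar illustration).

    x0        : clean sample
    noise     : the noise that was added to x0
    alpha_bar : cumulative product of alphas at the sampled timestep, in [0, 1]
    """
    if parameterization == "eps":
        return noise
    if parameterization == "x0":
        return x0
    if parameterization == "v":
        return math.sqrt(alpha_bar) * noise - math.sqrt(1.0 - alpha_bar) * x0
    raise NotImplementedError(f"unsupported parameterization: {parameterization!r}")


eps_target = training_target("eps", x0=2.0, noise=3.0, alpha_bar=0.25)
v_at_start = training_target("v", x0=2.0, noise=3.0, alpha_bar=1.0)
v_at_end = training_target("v", x0=2.0, noise=3.0, alpha_bar=0.0)
```

Code that only handles "eps" and "x0" would fall through to the `NotImplementedError` branch (or an equivalent crash) on a v-prediction model such as SD 2.1 768, which matches the symptom described above.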
