Model Isn't Learning #4
Comments
Same issue here. I'm using Python 3.9.12 with Torch 1.12.1 and CUDA 11.6.
Same issue, whether using this repo or the official repo.
@Ericxgao and @JulianJuaner I found a fix that worked for me. You guys can give it a go and report back.
After install, try running the script.
Thanks! It works for me. It seems the version of xformers is essential.
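Since the fix seems to hinge on which xformers build is installed, here is a minimal sketch for printing the versions everyone in this thread keeps comparing. The helper name `report_versions` and the default package list are assumptions for illustration, not something from this repo; it only reads installed package metadata and does not import torch or xformers.

```python
from importlib import metadata


def report_versions(packages=("torch", "xformers", "bitsandbytes")):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting that output alongside your GPU model should make it easier to tell whether two setups really match.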
Hmm, I'm still having trouble getting this version of xformers installed. What GPU and Python version are you using, @ExponentialML @JulianJuaner? I'm using a cloud A100.
@Ericxgao If you're using an A100, you should be able to fit the model in 40GB of VRAM when training, so xformers shouldn't be needed. Is this not the case?
I still get OOM errors. I disabled 8-bit Adam, as that was also failing on my system (bitsandbytes doesn't seem to install properly).
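If bitsandbytes is flaky on a given machine, one option is to probe for it before choosing an optimizer. This is only a rough sketch under assumptions: `pick_optimizer_name` is a hypothetical helper, not part of this repo's training script, and it just returns the dotted name of the optimizer class you would then construct yourself (`bitsandbytes.optim.AdamW8bit` or `torch.optim.AdamW`). The broad `except` is deliberate, on the assumption that a broken install may fail with something other than `ImportError`.

```python
import importlib


def pick_optimizer_name(module_name="bitsandbytes"):
    """Return the 8-bit optimizer name if the module imports, else plain AdamW.

    module_name is a parameter only so the probe can be exercised with other
    modules; in practice it would stay "bitsandbytes".
    """
    try:
        importlib.import_module(module_name)
    except Exception:
        # Missing or broken install: fall back to the standard optimizer.
        return "torch.optim.AdamW"
    return "bitsandbytes.optim.AdamW8bit"


if __name__ == "__main__":
    print("Optimizer to use:", pick_optimizer_name())
```

Note that falling back to regular AdamW uses more memory for optimizer state than the 8-bit variant, so it will not help with the OOM by itself.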
facebookresearch/xformers#631
Closing with solutions from me & @mili-inch. If there are any other issues, feel free to ask for a re-open to discuss.
I'm using Stable Diffusion 1.5 on torch 1.13.1, CUDA 11.6, and the latest version of xformers (0.0.16). I cannot build torch 1.12.1 on my machine.
The model won't learn. The output after every epoch simply looks like the first iteration.
(0-500 all look like this)