Out of Memory when used with Tiled VAE #37
Additionally, quoting a related comment:

> It's unfortunate, but the current implementation of ToMe uses more memory when also using xformers / flash attn / torch 2.0 sdp attn or whatever. That leaves ToMe in an awkward spot, because it computes similarities for merging all at once, creating a full similarity matrix. Now, ToMe doesn't actually need to compute this whole matrix, so there is hope. We only need the argmax over the similarities, not the similarities themselves. I'm just not sure how to implement that in native pytorch (flash attn et al. implement it using custom cuda kernels, which I don't want to use because that's what makes it require compilation).
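For illustration only, here is a minimal sketch of what "we only need the argmax, not the similarities themselves" could look like when computed in chunks, so the full similarity matrix is never materialized at once. This is not tomesd's actual code; the function name, shapes, and chunk size are made up for the example.

```python
import torch

def chunked_similarity_argmax(src, dst, chunk=1024):
    # src: [B, N_src, C], dst: [B, N_dst, C] token features.
    # Returns, for every source token, the max cosine similarity and the
    # index of the destination token that achieves it.
    src = src / src.norm(dim=-1, keepdim=True)
    dst = dst / dst.norm(dim=-1, keepdim=True)

    best_val, best_idx = [], []
    for i in range(0, src.shape[1], chunk):
        # Similarities for only a slice of the source tokens: [B, chunk, N_dst].
        sim = src[:, i:i + chunk] @ dst.transpose(-1, -2)
        val, idx = sim.max(dim=-1)  # keep only the max and its index, discard the slice
        best_val.append(val)
        best_idx.append(idx)

    return torch.cat(best_val, dim=1), torch.cat(best_idx, dim=1)
```

Chunking only bounds the peak size of the intermediate similarity slice; avoiding it entirely would need a fused kernel, which is what flash attention does with custom CUDA code.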
In img2img, without the ToMe patch, I was able to upscale a 1024x1024 image to 2048x2048 using Tiled VAE, with `Encoder Tile Size` set to `1024` and `Decoder Tile Size` set to `96`. The VRAM usage was around 6~7 GB.

However, if I apply the ToMe patch, regular generation does become faster, but when I try to upscale the same 1024x1024 image, it starts throwing an Out of Memory error, even when I lower `Encoder Tile Size` to `512` and `Decoder Tile Size` to `64`.

The implementation I used was this, which simply calls `tomesd.apply_patch(sd_model, ratio=0.3)` inside `on_model_loaded(sd_model)`.

Is this a problem on my part? Did I write the implementation wrong? Or is it something else?
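For reference, a minimal sketch of the hook described above, assuming the AUTOMATIC1111 webui's `script_callbacks` API (the helper name `_apply_tome` is made up; `ratio=0.3` matches the value quoted in the issue):

```python
import tomesd
from modules import script_callbacks

def _apply_tome(sd_model):
    # Patch the freshly loaded Stable Diffusion model with token merging (ToMe).
    tomesd.apply_patch(sd_model, ratio=0.3)

# Re-apply the patch every time a model is (re)loaded.
script_callbacks.on_model_loaded(_apply_tome)
```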
Full Error Below: