Model/Pipeline/Scheduler description
VQ-Diffusion is based on a VQ-VAE whose latent space is modeled by a conditional variant of the recently developed Denoising Diffusion Probabilistic Model (DDPM). It produces significantly better text-to-image generation results than autoregressive models with a similar number of parameters. Compared with previous GAN-based methods, VQ-Diffusion can handle more complex scenes and improves the synthesized image quality by a large margin.
https://github.com/microsoft/VQ-Diffusion
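For context on the description above: a VQ-VAE turns continuous encoder outputs into discrete tokens by snapping each latent vector to its nearest codebook entry, and the diffusion model then works over those tokens. Below is a minimal sketch of that quantization step only. It is not taken from the linked repository; the function name `quantize`, the toy codebook, and the shapes are all assumptions made for illustration.

```python
import numpy as np


def quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry (hypothetical helper).

    latents:  array of shape (N, D)
    codebook: array of shape (K, D)
    Returns the chosen indices, shape (N,), and the quantized vectors, shape (N, D).
    """
    # Squared Euclidean distance between every latent and every code: shape (N, K).
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    # Index of the closest code for each latent.
    indices = dists.argmin(axis=1)
    return indices, codebook[indices]


# Toy codebook (assumption): 4 codes in 4 dimensions, one per axis.
codebook = np.eye(4)
# Latents that sit close to codes 2, 0 and 3 respectively.
latents = codebook[[2, 0, 3]] + 0.1

indices, quantized = quantize(latents, codebook)
print(indices.tolist())  # [2, 0, 3]
```

In a real model the codebook is learned and much larger, and the discrete indices (rather than the vectors) are what the diffusion process would corrupt and denoise.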
Open source status
- The model implementation is available
- The model weights are available (Only relevant if addition is not a scheduler).
Provide useful links for the implementation
VQ-Diffusion would be a super cool addition to diffusers. cc @cientgu and @zzctan.
Also cc @patil-suraj here