Will latent diffusion improve on DALL-E2 text conditioning? #42
lucidrains started this conversation in General
-
the other idea would be to try a CLIP that has fine-grained interactions between text and image tokens (https://arxiv.org/abs/2111.07783), already offered at https://github.com/lucidrains/x-clip with the setting
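for anyone unfamiliar with what "fine-grained interactions" means here, a rough sketch follows. It is not the x-clip code, and the function names `l2_normalize` and `fine_grained_similarity` are made up for illustration. It assumes the token-wise scheme where each image token is matched to its most similar text token (and vice versa) and the maxima are averaged, instead of comparing a single pooled embedding per modality.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def fine_grained_similarity(image_tokens, text_tokens):
    """Token-wise late interaction between two sets of token embeddings.

    image_tokens: list of per-patch embedding vectors
    text_tokens:  list of per-word embedding vectors (same dimension)
    Returns (image_to_text, text_to_image) similarity scores.
    """
    img = [l2_normalize(t) for t in image_tokens]
    txt = [l2_normalize(t) for t in text_tokens]

    # Cosine similarity between every image token and every text token.
    sim = [[sum(a * b for a, b in zip(i, t)) for t in txt] for i in img]

    # For each image token take its best-matching text token, then average.
    image_to_text = sum(max(row) for row in sim) / len(img)
    # For each text token take its best-matching image token, then average.
    text_to_image = sum(max(col) for col in zip(*sim)) / len(txt)
    return image_to_text, text_to_image

if __name__ == "__main__":
    # Toy example: two image patches, one text token.
    print(fine_grained_similarity([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]]))
```

the two directions can differ, which is why a contrastive loss built on this kind of similarity would normally be computed separately for image-to-text and text-to-image.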
-
Is working in latent space better?
Yet another potential paper up for grabs :)