Add support to Meissonic #9875
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Any update or suggestion for this PR?
@viiika You need to tag the relevant maintainers of this repository. Also, creating a separate branch of your fork for submitting the PR is recommended. Do you have any more advice? @sayakpaul
Thanks for your advice. I am just waiting for an approving review.
I will let @yiyixuxu decide whether this should be a core pipeline. Maybe you could supplement this PR with some examples and memory consumption numbers? @viiika Given it's masked image generation (we only have one in
Thank you for your comments! We'd like to highlight some key differences between Meissonic and aMUSEd:
These advancements collectively bring Meissonic's performance to a level comparable to SDXL. It also seems that aMUSEd has received no further updates, whereas Meissonic will continue to evolve: we plan to release Meissonic II in three months, with even stronger performance. The original VRAM consumption values are provided below. We have also developed FP8, INT8, and INT4 versions, which need as little as 4 GB of VRAM to generate high-quality 1024 × 1024 images.
We have also included the HPSv2.0 scores to demonstrate the performance.
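For readers who want a rough sense of why the FP8, INT8, and INT4 versions reduce VRAM, here is a minimal back-of-the-envelope sketch. It is not taken from the Meissonic code. The helper `weight_memory_gib` and the parameter count are hypothetical placeholders, and the estimate covers model weights only, ignoring activations, the text encoder, and the VQ decoder.

```python
# Illustrative only: estimates memory for model weights at a given precision.
# The parameter count below is a placeholder, NOT Meissonic's actual size.
BITS_PER_PARAM = {"fp32": 32, "fp16": 16, "fp8": 8, "int8": 8, "int4": 4}


def weight_memory_gib(num_params: int, precision: str) -> float:
    """Return the weight footprint in GiB: params * bits / 8 bytes."""
    bits = BITS_PER_PARAM[precision]
    return num_params * bits / 8 / 2**30


if __name__ == "__main__":
    num_params = 1_000_000_000  # hypothetical parameter count
    for precision in BITS_PER_PARAM:
        gib = weight_memory_gib(num_params, precision)
        print(f"{precision}: {gib:.2f} GiB")
```

Actual peak usage will be higher than these figures, so measured numbers such as the ones reported in this thread remain the reference.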
Meissonic is a non-autoregressive masked image modeling text-to-image synthesis model that can generate high-resolution images. It is designed to run on consumer graphics cards.
The model checkpoint is available at https://huggingface.co/MeissonFlow/Meissonic
The inference code is available at https://github.com/viiika/Meissonic
The paper is available at https://arxiv.org/abs/2410.08261