Combine UnCLIPPipeline and StableDiffusionImageVariationPipeline #1808

Comments
Very cool! Would you like to add this as a community pipeline, as explained here, with you as the author? This sounds like a great contribution :-)
Hi, I tried it as you suggested (gist here). The pipeline now mostly works, but I have a minor problem:

```python
# https://gist.github.com/budui/416b82e489d341f2495b155cb9cb1914#file-stable_unclip_pipeline-py-L289-L300
pipeline = StableUnCLIPPipeline.from_pretrained(
    "kakaobrain/karlo-v1-alpha",
    torch_dtype=torch.float16,
    decoder_pipe_kwargs=dict(
        image_encoder=None,
        torch_dtype=torch.float16,
    ),
)
pipeline.to(device)
pipeline.decoder_pipe.to(device)
```

How can I make sure that the pipeline (…)
I know how to do it now. I will create a PR soon... it seems I need to write some documentation.
Can you please show me how I can add a parameter to specify the image resolution? Also, the seed.
```python
# https://gist.github.com/budui/416b82e489d341f2495b155cb9cb1914#file-stable_unclip_pipeline-py-L288-L314
prompt = "a shiba inu wearing a beret and black turtleneck"
random_generator = torch.Generator(device=device).manual_seed(1000)
output = pipeline(prompt=prompt, generator=random_generator, width=512, height=512)
```
Thanks a lot! Going to try that!
Can I do this with Karlo? Because that's what I'm using. I use the following:

```python
from diffusers import UnCLIPPipeline

pipe = UnCLIPPipeline.from_pretrained("kakaobrain/karlo-v1-alpha")
prompt = "Cat on a yellow leaf."
image = pipe([prompt]).images[0]
image.save("./pop.png")
```
Model/Pipeline/Scheduler description

UnCLIPPipeline ("kakaobrain/karlo-v1-alpha") provides a prior model that can generate a CLIP image embedding from text. StableDiffusionImageVariationPipeline ("lambdalabs/sd-image-variations-diffusers") provides a decoder model that can generate images from a CLIP image embedding. So, I tested combining UnCLIPPipeline and StableDiffusionImageVariationPipeline: it works! And I got reasonable results: this Stable DALLE2-like Diffusion is relatively lightweight: it only needs 6.5 GB of GPU RAM with the default code, and takes 21 s on a half-precision T4 GPU.

Open source status

Provide useful links for the implementation

full code above: my gist
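The combination described above reduces to a small piece of glue code: the UnCLIP prior maps the text prompt to a CLIP image embedding, and the image-variation decoder maps that embedding to pixels. Below is a minimal sketch of that glue; the function and method names (`stable_unclip_generate`, `text_to_image_embedding`, the `image_embeddings` keyword) are illustrative stand-ins, not actual diffusers API — the real wiring lives in the linked gist.

```python
def stable_unclip_generate(prior_pipe, decoder_pipe, prompt,
                           generator=None, width=512, height=512):
    """Hypothetical glue between a DALLE2-style prior and a decoder.

    prior_pipe   -- object exposing text -> CLIP image embedding
                    (stand-in for the UnCLIPPipeline prior internals)
    decoder_pipe -- object mapping a CLIP image embedding to an image
                    (stand-in for StableDiffusionImageVariationPipeline)
    """
    # Step 1: turn the text prompt into a CLIP image embedding.
    image_embeds = prior_pipe.text_to_image_embedding(prompt, generator=generator)
    # Step 2: decode that embedding into an image at the requested size,
    # reusing the same generator so the whole run is seed-reproducible.
    return decoder_pipe(image_embeddings=image_embeds, generator=generator,
                        width=width, height=height)
```

Passing one `torch.Generator` through both stages is what makes the `manual_seed(1000)` example earlier in the thread reproducible end to end.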