callback / cannot yield intermediate images on the fly during inference #9407
Since I come from other languages I find that This is more of an issue on your side or the UI side than on the diffusers API, for example, I did a demo with Gradio and to display the images in an easy way you can use But, if you use a more advanced system that lets you manually manage the state of the UI like PyQT, you can use the callbacks without a problem since you can update the UI as soon as you get the images from the callback. In short what you really need is to find a way to communicate with your UI from the callback on each step, there isn't a method to do it with cc: @yiyixuxu in case this is something planned in the future or not. |
Hi @asomoza , thanks for your reply, let me clarify or restate some points:
yes, I do know that the callback gets executed at the end of each step, hence the name
I do get the images saved on disk (hence my idea of monitoring the disk contents in a parallel process), but I can't access them in code, or at least I don't know how to do this
I have no UI, here I am working from an API endpoint. No UI involved. My use case is more like you're calling a diffusers pipeline in python and you'd like to access the intermediate images in code as soon as they're created, while the inference happens. And again, I managed this by modifying the pipeline's dunder call method code to make it yield these images, but I'd like to avoid touching diffusers code like this as it's not maintainable. |
My apologies, I think I misunderstood and oversimplified your problem. I still don't understand your problem though, and this is probably because, as I stated, I don't really understand or use yield unless I'm forced to; as I wrote, I'm used to doing it manually myself. It's probably better if someone else tries to help you. Just as a clarification, what I don't understand in your problem is this:
but then you wrote:
If you're saving them, you can access them in code, so that's why I thought you needed them for a specific UI which needs the images with In short I don't see why, instead of saving them to disk, you don't use them for whatever you need in your endpoint. To me this seems like the problem people have in JavaScript or Node with async/await programming, and I associated them by mistake... maybe. |
To clarify, I'd like to call the pipeline to run an inference AND access the intermediate images in code, from memory, at the same time, as soon as they're produced:
for intermediate_image in pipeline(prompt=..., other_args..., callback_on_step_end=decode_latents):
    yield intermediate_image
and also somehow get the last, final image. I do not know whether this is possible with some async code or via a thread. |
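For the async route, here is a minimal sketch, not a tested diffusers recipe. It assumes the blocking pipeline call can run in a worker thread via `asyncio.to_thread` (Python 3.9+) and that the callback receives `callback_kwargs["latents"]`. The names `astream_steps` and `fake_pipeline` are hypothetical; `fake_pipeline` only mimics the shape of a pipeline call so the example runs without a model.

```python
import asyncio


async def astream_steps(run_pipeline, **pipeline_kwargs):
    """Async generator: ("step", latents) per step, then ("final", result)."""
    loop = asyncio.get_running_loop()
    q = asyncio.Queue()
    done = object()  # sentinel: the worker has finished

    def on_step_end(pipe, step, timestep, callback_kwargs):
        # Runs in the worker thread, so hand off to the loop thread-safely.
        loop.call_soon_threadsafe(q.put_nowait, ("step", callback_kwargs["latents"]))
        return callback_kwargs  # the caller expects the dict back

    async def runner():
        try:
            result = await asyncio.to_thread(
                run_pipeline, callback_on_step_end=on_step_end, **pipeline_kwargs
            )
            q.put_nowait(("final", result))
        finally:
            q.put_nowait(done)

    task = asyncio.create_task(runner())
    while (item := await q.get()) is not done:
        yield item
    await task  # re-raises any exception from the pipeline call


def fake_pipeline(prompt, num_inference_steps, callback_on_step_end):
    """Hypothetical stand-in for a pipeline call; not diffusers code."""
    latents = 0
    for i in range(num_inference_steps):
        latents += 10  # pretend denoising step
        out = callback_on_step_end(None, i, i, {"latents": latents})
        latents = out.pop("latents", latents)
    return f"final image for {prompt}"


async def main():
    return [e async for e in astream_steps(fake_pipeline, prompt="cat", num_inference_steps=2)]


events = asyncio.run(main())
```

With a real pipeline you would pass the pipeline object in place of `fake_pipeline` and decode the latents either inside the callback or in the consumer; that part is model-specific and left out here.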
yeah, that's what I answered in the first post, currently this: for intermediate_image in pipeline Is not possible since for that we need to change the return type of all the pipelines to
This I can assure you that can be done, I do it all the time in a personal app I use for testing diffusers. |
OK, if you have some material explaining how to set this up, that'd be appreciated! All I've found is people not managing it and getting lost in async / threading attempts. Thanks |
I don't know your specific use case or what you need to do, but I wanted to do a simple app to test the SD3 preview, and will probably expand it in the future for masks and inpainting, so I did a quick PoC of one with PyQT6 to demonstrate how you can use the callbacks to access the intermediate images. It's a minimal one-file app, so it should be easy to understand. Here I set the callback and here I manipulate the intermediate images to display them. PyQT uses "signals" to communicate between processes and threads; if what you're using doesn't have something similar, then it will be a lot more "involved" to implement. I think this is outside the scope of diffusers though, since how you work with the callbacks is a problem for the library/app on top of it. |
Update: I run the pipeline call in a thread (I had some scope issues when trying to run it in a process, due to having an inner function inside my callback function), pass a callback that saves latents on disk, and in the meantime I watch the disk in the main process and |
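An in-memory variant of the thread approach could replace the disk with a `queue.Queue`. This is a sketch under assumptions: `stream_steps` and `fake_pipeline` are hypothetical names, and `fake_pipeline` is a stand-in that only imitates a pipeline call that accepts `callback_on_step_end`.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def stream_steps(run_pipeline, **pipeline_kwargs):
    """Yield ("step", latents) after each step, then ("final", result)."""
    q = queue.Queue()

    def on_step_end(pipe, step, timestep, callback_kwargs):
        # Hand the step output to the consumer instead of writing it to disk.
        q.put(("step", callback_kwargs["latents"]))
        return callback_kwargs  # the caller expects a dict back

    def worker():
        try:
            result = run_pipeline(callback_on_step_end=on_step_end, **pipeline_kwargs)
            q.put(("final", result))
        except Exception as exc:  # forward worker errors to the consumer
            q.put(("error", exc))
        finally:
            q.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()
    while (item := q.get()) is not _DONE:
        if item[0] == "error":
            raise item[1]
        yield item


def fake_pipeline(prompt, num_inference_steps=3, callback_on_step_end=None):
    """Hypothetical stand-in for a pipeline call; not diffusers code."""
    latents = 0
    for i in range(num_inference_steps):
        latents += 1  # pretend denoising step
        if callback_on_step_end is not None:
            out = callback_on_step_end(None, i, i, {"latents": latents})
            latents = out.pop("latents", latents)
    return f"image for {prompt!r}"


events = list(stream_steps(fake_pipeline, prompt="a cat", num_inference_steps=3))
```

The generator gives the `for item in ...: yield item` shape without touching the pipeline's `__call__`. The last item carries the final result, so the caller gets both the intermediates and the finished image from one loop.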
Hi,
in advance apologies if this has been asked already, or if I'm just misusing the diffusers API.
Using diffusers==0.30.2
What API design would you like to have changed or added to the library? Why?
I will illustrate the general issue straight away with my use case: I need to call a (FLUX) diffusers pipeline from an endpoint of mine, passing a callback that decodes the latents at the end of each step and saves the resulting intermediate images to disk. So far, so good: I do manage to get the intermediate images saved on disk. I do this using the pipeline argument callback_on_step_end.
Now, I'd like to yield (in the pythonic meaning) these intermediate images on the fly, as soon as they're available, i.e. at the end of each inference step. I need to do so from my endpoint. That's where my problem is.
I could not make this idea work using the diffusers callback mechanism.
I mean, I did manage that by subclassing the pipeline, copy-pasting the dunder call method code and overriding it, but this is not maintainable, especially since the FLUX code evolves rapidly nowadays.
Also, note that currently diffusers assigns the result of the call to the callback to a variable and expects it to implement the .pop method, which might add constraints (diffusers typically expects a kwarg dict, see here).
Another approach I thought of is to monitor the disk contents in a parallel process during the call to the pipeline.
But is there an easier way?
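On the `.pop` constraint: a callback that returns the dict it was given satisfies it. The sketch below assumes the callback is called as `(pipe, step, timestep, callback_kwargs)` and that `"latents"` is among the keys passed. `denoise_loop_stand_in` is a hypothetical stand-in imitating how the return value is consumed; it is not diffusers code.

```python
collected = []  # in-memory store for per-step outputs


def keep_latents(pipe, step, timestep, callback_kwargs):
    """Step-end callback: record the latents, then return the dict unchanged."""
    collected.append((step, callback_kwargs["latents"]))
    return callback_kwargs  # returning None would break the caller's .pop()


def denoise_loop_stand_in(steps, callback_on_step_end):
    """Hypothetical stand-in mimicking how the callback result is consumed."""
    latents = 0.0
    for i in range(steps):
        latents += 0.5  # pretend denoising step
        callback_outputs = callback_on_step_end(None, i, i, {"latents": latents})
        latents = callback_outputs.pop("latents", latents)
    return latents


final = denoise_loop_stand_in(4, keep_latents)
```

Here `collected` fills up step by step while the loop runs, so anything with access to it (another thread, for instance) can read the intermediates before the call returns.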
What use case would this enable or better enable? Can you give us a code example?
This would allow manipulating the objects produced by the callback live, instead of having to wait for the whole reverse diffusion to finish.
Thank you
cc @sayakpaul @yiyixuxu
also tagging @asomoza since I saw he is the contributor to the official callback interface