callback / cannot yield intermediate images on the fly during inference #9407

Closed
Clement-Lelievre opened this issue Sep 10, 2024 · 8 comments

Comments

@Clement-Lelievre
Contributor

Clement-Lelievre commented Sep 10, 2024

Hi,

Apologies in advance if this has already been asked, or if I'm just misusing the diffusers API.

Using diffusers==0.30.2

What API design would you like to have changed or added to the library? Why?

To illustrate the general issue with my use case straight away: I need to call a (FLUX) diffusers pipeline from an endpoint of mine, passing a callback that decodes the latents and saves the resulting intermediate images to disk at the end of each step. So far, so good: I do manage to get the intermediate images saved to disk, using the pipeline argument callback_on_step_end.

Now, I'd like to yield (in the pythonic meaning) these intermediate images on the fly, as soon as they're available, ie at the end of each inference step. I need to do so from my endpoint. That's where my problem is.

I could not make this idea work using the diffusers callback mechanism.
I mean, I did manage that by subclassing the pipeline, copy-pasting the dunder call method code and overriding it, but this is not maintainable, especially since the FLUX code evolves rapidly nowadays.
Also, note that currently diffusers assigns the result of the call to the callback to a variable and expects it to implement the .pop method, which might add constraints (diffusers typically expects a kwarg dict, see here).
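For readers unfamiliar with the contract described above, here is a minimal sketch of how a callback_on_step_end callback is consumed, as far as I understand diffusers 0.30.x. FakePipeline and record_latents are hypothetical stand-ins written for illustration; they are not the real FluxPipeline and do not decode anything. The point is only that the callback receives (pipe, step_index, timestep, callback_kwargs) and must return a dict-like object, because the denoising loop pops values from it.

```python
from typing import Any, Callable, Dict, List


class FakePipeline:
    """Hypothetical stand-in that mimics the callback handling of a pipeline."""

    def __call__(self, num_inference_steps: int,
                 callback_on_step_end: Callable[..., Dict[str, Any]]) -> float:
        latents = 0.0
        for i in range(num_inference_steps):
            latents = latents + 1.0  # pretend this is one denoising step
            callback_kwargs = {"latents": latents}
            # Assumed call shape: (pipeline, step index, timestep, kwargs dict).
            # The step index doubles as the timestep in this toy loop.
            callback_outputs = callback_on_step_end(self, i, i, callback_kwargs)
            # The return value must support .pop, i.e. behave like a dict.
            latents = callback_outputs.pop("latents", latents)
        return latents


seen: List[float] = []


def record_latents(pipe, step_index, timestep, callback_kwargs):
    # A real callback would decode the latents here; this one only records them.
    seen.append(callback_kwargs["latents"])
    return callback_kwargs  # always hand the dict back to the pipeline


final = FakePipeline()(num_inference_steps=3, callback_on_step_end=record_latents)
```

Because the return value is reserved for the pipeline's own bookkeeping, the callback cannot hand images back to the caller through it; they have to leave through a side channel such as a list, a queue, or the disk.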

Another approach I thought of is to monitor the disk contents in a parallel process during the call to the pipeline.

But is there an easier way?

What use case would this enable or better enable? Can you give us a code example?

This would make it possible to manipulate the objects produced by the callback live, instead of having to wait for the whole reverse diffusion to finish.

Thank you

cc @sayakpaul @yiyixuxu

also tagging @asomoza since I saw he is the contributor to the official callback interface

@asomoza
Member

asomoza commented Sep 10, 2024

Since I come from other languages, I find yield really restrictive when you use async code or threads. Also, you're saying you need the images as soon as they're available, and with the callbacks you get that too: it's not that the callback only gets called after the whole generation is done, but on each step too.

This is more of an issue on your side or the UI side than in the diffusers API. For example, I did a demo with Gradio, where the easy way to display the images is to use yield, so I had to do the same as you're suggesting and change the pipeline, because I can't (without manually using websockets) update the UI from the callback function.

But if you use a more advanced system that lets you manually manage the state of the UI, like PyQt, you can use the callbacks without a problem, since you can update the UI as soon as you get the images from the callback.

In short what you really need is to find a way to communicate with your UI from the callback on each step, there isn't a method to do it with yield at the moment.

cc: @yiyixuxu in case this is something planned in the future or not.

@Clement-Lelievre
Contributor Author

Clement-Lelievre commented Sep 11, 2024

Hi @asomoza , thanks for your reply, let me clarify or restate some points:

the callback only gets called after the whole generation is done but on each step too.

yes, I do know that the callback gets executed at the end of each step, hence the name on_step_end

also you're saying you need the images as soon as they're available and with the callbacks you get that too

I do get the images saved on disk (hence my idea of monitoring the disk contents in a parallel process), but I can't access them in code, or at least I don't know how to do this

In short what you really need is to find a way to communicate with your UI from the callback on each step

I have no UI, here I am working from an API endpoint. No UI involved. My use case is more like you're calling a diffusers pipeline in python and you'd like to access the intermediate images in code as soon as they're created, while the inference happens. And again, I managed this by modifying the pipeline's dunder call method code to make it yield these images, but I'd like to avoid touching diffusers code like this as it's not maintainable.

@asomoza
Member

asomoza commented Sep 11, 2024

My apologies, I think I misunderstood and oversimplified your problem.

I still don't understand your problem, though, probably because, as I stated, I don't really understand yield and don't use it unless I'm forced to; as I wrote, I'm used to doing it manually myself.

It's probably better if someone else tries to help you. Just as a clarification, what I don't understand in your problem is this:

... I'd like to yield (in the pythonic meaning) these intermediate images on the fly, as soon as they're available, ie at the end of each inference step

but then you wrote:

I do get the images saved on disk...., but I can't access them in code...

If you're saving them, you can access them in code, so that's why I thought you needed them for a specific UI that requires the images via yield or something similar.

In short, I don't see why, instead of saving them to disk, you don't use them for whatever you need in your endpoint. To me this seems like the problem people have in JavaScript or Node with async/await programming, and I associated the two by mistake... maybe.

@Clement-Lelievre
Contributor Author

To clarify, I'd like to call the pipeline to run inference AND access the intermediate images in code, from memory, at the same time, as soon as they're produced.
To illustrate my need, in pure Python this could be something like:

for intermediate_image in pipeline(prompt=..., other_args..., callback_on_step_end=decode_latents):
    yield intermediate_image

and also somehow get the last, final image

I do not know whether this is possible with some async code or via a thread

@asomoza
Member

asomoza commented Sep 11, 2024

yeah, that's what I answered in the first post, currently this:

for intermediate_image in pipeline

is not possible, since for that we would need to change all the pipelines to yield instead of return, and I don't know whether that is planned for the future, so let's wait for that answer.

I do not know whether this is possible with some async code or via a thread

This I can assure you can be done; I do it all the time in a personal app I use for testing diffusers.
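A minimal sketch of the thread-based approach mentioned above, using only the standard library: the pipeline runs in a worker thread, the step callback pushes each intermediate result onto a queue.Queue, and the caller consumes the queue as a generator. The names stream_steps and fake_pipeline are hypothetical; fake_pipeline stands in for a real diffusers pipeline, and the only assumption about the real one is that it accepts callback_on_step_end and calls it with (pipe, step_index, timestep, callback_kwargs).

```python
import queue
import threading
from typing import Any, Callable, Iterator, Tuple

_DONE = object()  # sentinel marking the end of the stream


def stream_steps(run_pipeline: Callable[..., Any],
                 **pipeline_kwargs: Any) -> Iterator[Tuple[str, Any]]:
    """Yield ("step", latents) after each step, then ("final", result)."""
    events: "queue.Queue[Any]" = queue.Queue()

    def on_step_end(pipe, step_index, timestep, callback_kwargs):
        # Runs in the worker thread; a real callback could decode latents here.
        events.put(("step", callback_kwargs["latents"]))
        return callback_kwargs

    def worker() -> None:
        try:
            result = run_pipeline(callback_on_step_end=on_step_end,
                                  **pipeline_kwargs)
            events.put(("final", result))
        except Exception as exc:  # forward failures to the consumer
            events.put(("error", exc))
        finally:
            events.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = events.get()  # blocks until the worker produces something
        if item is _DONE:
            return
        kind, payload = item
        if kind == "error":
            raise payload
        yield kind, payload


def fake_pipeline(num_inference_steps, callback_on_step_end):
    """Hypothetical stand-in for a pipeline call."""
    latents = 0
    for i in range(num_inference_steps):
        latents += 1
        callback_on_step_end(None, i, i, {"latents": latents})
    return "final-image"


results = list(stream_steps(fake_pipeline, num_inference_steps=3))
```

With a real pipeline, the usage would presumably look like `for kind, payload in stream_steps(pipe, prompt=...)`, which also covers getting the final image as the last item. If the latents are tensors that the pipeline keeps modifying, copying them inside the callback before queuing is likely a good idea.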

@Clement-Lelievre
Contributor Author

Clement-Lelievre commented Sep 11, 2024

I do not know whether this is possible with some async code or via a thread

This I can assure you that can be done, I do it all the time in a personal app I use for testing diffusers.

OK, if you have some material explaining how to set this up, that'd be appreciated! All I've found is people not managing it and getting lost in async / threading attempts.

thks

@asomoza
Member

asomoza commented Sep 11, 2024

I don't know your specific use case or what you need to do, but I wanted to build a simple app to test the SD3 preview (I'll probably expand it in the future for masks and inpainting), so I did a quick PoC with PyQt6 to demonstrate how you can use the callbacks to access the intermediate images.

It's a minimal one file app, so it should be easy to understand.

Here I set the callback and here I manipulate the intermediate images to display them.

PyQt uses "signals" to communicate between processes and threads; if what you're using doesn't have something similar, it will be a lot more "involved" to implement.

I think this is outside the scope of diffusers, though, since the problem lies in how the library or app built on top of it works with the callbacks.

@Clement-Lelievre
Contributor Author

Clement-Lelievre commented Sep 25, 2024

Update: I run the pipeline call in a thread (I had some scope issues when trying to run it in a process, due to having an inner function inside my callback function), pass a callback that saves latents on disk, and in the meantime I watch the disk in the main process and yield new images.
There are some tricky things to consider, such as avoiding yielding while an image is still being written to disk by the callback function (otherwise the yielded image is partial), but apart from this it works like a charm.
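One common way to avoid the partial-image problem described above is to write each file under a temporary name and rename it into place once it is complete. This is a sketch of that technique, not necessarily what was done here; write_atomically and watch_for_images are hypothetical helpers, and the ".png" naming and polling interval are assumptions.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Iterator


def write_atomically(data: bytes, final_path: Path) -> None:
    """Write to a temp file in the same folder, then rename it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=final_path.parent, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    # os.replace is atomic on POSIX when source and target share a filesystem,
    # so a watcher sees either no file or the complete file.
    os.replace(tmp_name, final_path)


def watch_for_images(folder: Path, expected: int,
                     poll_seconds: float = 0.05,
                     timeout: float = 10.0) -> Iterator[Path]:
    """Poll `folder` and yield each new *.png once, until `expected` are seen."""
    seen = set()
    deadline = time.monotonic() + timeout
    while len(seen) < expected and time.monotonic() < deadline:
        # Temp files end in ".tmp", so this pattern never matches them.
        for path in sorted(folder.glob("*.png")):
            if path not in seen:
                seen.add(path)
                yield path
        time.sleep(poll_seconds)
```

The callback would call write_atomically with the encoded image bytes, while the endpoint iterates over watch_for_images. Zero-padded file names (for example step_003.png) keep the sorted order equal to the step order.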
