provide an option so hires fix uses the final step (sort of) of the prompt schedule #12350


@rubberbaron rubberbaron commented Aug 5, 2023

Description

The UI option for this is not in the right place, please tell me where to put it.

  • when using prompt editing, sometimes you don't want the hires fix to apply the full prompt editing schedule, you just want to continue refining the final result using the final editing state of the prompt

    • for example, if your prompt is [something that sets up the composition:something that produces the real result:0.35], then during the hires fix pass, you really only want the hires fix to use something that produces the real result, you don't want it to go back temporarily and apply something that sets up the composition for a while
    • while you can use separate hires prompt to achieve this, that doesn't work if you're using something like Dynamic Prompts or Wildcards to create prompts dynamically; those extensions would need to provide some mechanism to generate a special hires prompt, and it would be pretty tedious, whereas doing it here handles anything which modifies the prompt, whether it's an extension or a script like xyz-grid prompt SR. Even without using extensions and scripts, I know some people specifically want this behavior, so it saves them from having to manually make a hires prompt just to get this effect
    • this PR provides an optional setting that forces the highres fix to build the schedule specially and only use the final part of the prompt
    • There is special handling for [foo|bar] so it still happens during highres fix, but only if it was active at the final step. So [[foo|bar]::0.5] is treated as an empty prompt, but [[foo|bar]:0.5] is treated as [foo|bar]
  • a summary of changes in code

    • a new option in shared.py settings
    • get_learned_conditioning_prompt_schedule takes a new "use final state" boolean parameter that causes it to produce a "final state" schedule by only ever generating the "after" of a scheduled expression
    • get_conds_with_caching takes an optional "use final state" parameter and includes it as part of the cache key
    • normal prompts are unchanged; hires fix prompts optionally set the "use final state" parameter to True
    • plumbing to pass "use final state" parameter through
  • Addresses #9281 (A patch to stop prompt editing in hires fix, so that it uses the prompt as it is at the final step of the initial render), but in a more sophisticated way
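
To make the "final state" idea above concrete, here is a minimal Python sketch of how a prompt could be collapsed to its final editing state. It is not the code in this PR: the helper names (final_state, _split_top_level, _collapse) are hypothetical, and it uses a simplified bracket scanner instead of the webui's real prompt grammar, so it ignores (text:1.2) attention syntax and escaped brackets.

```python
def _split_top_level(text: str, sep: str) -> list:
    """Split text on sep, ignoring separators nested inside [...]."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def _is_number(text: str) -> bool:
    try:
        float(text)
        return True
    except ValueError:
        return False


def _collapse(inner: str) -> str:
    """Collapse the contents of one [...] group to its final state."""
    by_colon = _split_top_level(inner, ":")
    # [from:to:when], [to:when] and [from::when] all end up as the "after" part.
    if len(by_colon) in (2, 3) and _is_number(by_colon[-1]):
        return final_state(by_colon[-2])
    # [a|b] alternation is preserved, with each alternative collapsed.
    by_bar = _split_top_level(inner, "|")
    if len(by_bar) > 1:
        return "[" + "|".join(final_state(p) for p in by_bar) + "]"
    # Anything else is left as a plain bracket group.
    return "[" + final_state(inner) + "]"


def final_state(prompt: str) -> str:
    """Return the prompt as it would read after every scheduled edit has fired."""
    out, i = [], 0
    while i < len(prompt):
        if prompt[i] != "[":
            out.append(prompt[i])
            i += 1
            continue
        depth, j = 0, i
        while j < len(prompt):
            if prompt[j] == "[":
                depth += 1
            elif prompt[j] == "]":
                depth -= 1
                if depth == 0:
                    break
            j += 1
        if j == len(prompt):  # unbalanced bracket: copy the rest literally
            out.append(prompt[i:])
            break
        out.append(_collapse(prompt[i + 1:j]))
        i = j + 1
    return "".join(out)


print(final_state("[man on a street:woman in a forest:0.99] with a [cat|cow|sheep|horse]"))
print(repr(final_state("[[foo|bar]::0.5]")))
print(final_state("[[foo|bar]:0.5]"))
```

In this sketch the alternation survives only when it sits in the "after" position of the enclosing edit, which is the behavior described for [[foo|bar]::0.5] versus [[foo|bar]:0.5].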

  • Things that maybe should get changed before this is merged

    • Where should the option go? It's not actually User Interface; maybe under Stable Diffusion? If this was just for me I'd put it in the txt2img hires fix controls themselves (e.g. where you set the scale), but I think this should be the default behavior that everyone gets out of the box, so my judgement is suspect, which is why I put it in settings
    • Should I make it so even if you set this option, it's ignored if you explicitly set a non-empty hires prompt?
    • Should this option be saved into the png-info and made an override, so that people can reproduce the results without manually changing the setting to the correct value?
    • The new parameter in get_learned_conditioning_prompt_schedule defaults to False so extensions calling it won't break. But I didn't do that for get_conds_with_caching, should I?
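
As a rough illustration of the caching and default-value points above, here is a small Python sketch. The names cached_schedule and build_schedule are hypothetical stand-ins, not the webui's get_conds_with_caching or get_learned_conditioning_prompt_schedule; the sketch only shows why the flag has to be part of the cache key and how a False default keeps older callers working.

```python
from typing import Dict, List, Tuple

_cache: Dict[Tuple[str, int, bool], str] = {}
build_calls: List[Tuple[str, int, bool]] = []


def build_schedule(prompt: str, steps: int, use_final_state: bool) -> str:
    """Hypothetical stand-in for the expensive schedule/conditioning build."""
    build_calls.append((prompt, steps, use_final_state))
    mode = "final" if use_final_state else "full"
    return f"{mode} schedule for {prompt!r} over {steps} steps"


def cached_schedule(prompt: str, steps: int, use_final_state: bool = False) -> str:
    """Cache on every input that changes the result, including the flag.

    The default of False means callers that predate the flag keep
    getting the full schedule.
    """
    key = (prompt, steps, use_final_state)
    if key not in _cache:
        _cache[key] = build_schedule(prompt, steps, use_final_state)
    return _cache[key]


first_pass = cached_schedule("[a:b:0.35]", 20)             # old-style call
hires_pass = cached_schedule("[a:b:0.35]", 20, True)       # hires fix call
first_again = cached_schedule("[a:b:0.35]", 20)            # served from cache
```

If the flag were left out of the key, the hires pass in this sketch would be handed the cached full schedule from the first pass.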

Screenshots/videos:

The effect is not easily discernible in normal use, so I just made some exaggerated examples with the hires fix denoising set to 0.8, scale=1, so that the hires fix pass basically recomputes the whole image, just so you can see the effect happening.

The prompt is [man on a street:woman in a forest:0.99] with a [cat|cow|sheep|horse]. The expected behavior is that the old code will produce a man on a street, and the new code will produce a woman in a forest, and that we get a cat-cow-sheep-horse regardless, and not just a horse (which would happen if we naively just used the final prompt schedule step).

before hires fix:
00399-v1-5-pruned-2023-08-05-7-30-3127545016-before-highres-fix

hires fix before this change:
00400-v1-5-pruned-2023-08-05-7-30-3127545016

hires fix with the option provided by this PR enabled:
00408-v1-5-pruned-2023-08-05-7-30-3127545016

I put the UI option to control it here with the other hires fix options, but this is actually the User Interface options panel, and this option is not User Interface, so this doesn't seem like the right choice. (None of the options panels really seemed like a good fit, and at least here it's next to some related choices.)

image


Robert Barron added 2 commits August 5, 2023 08:47
…edule, instead of repeating the whole schedule

But actually it special cases [foo|bar] style prompt editing; if you have one of those active at the final prompt step,
it will be preserved for hires fix as well, so hires fix will alternate
@w-e-w

This comment was marked as outdated.

@catboxanon
Collaborator

I think they made this PR so it interops with wildcard extensions. They mentioned the separate hires prompt already.

while you can use separate hires prompt to achieve this, that doesn't work if you're using something like Dynamic Prompts or Wildcards to create prompts dynamically; those extensions would need to provide some mechanism to generate a special hires prompt, and it would be pretty tedious, whereas doing it here handles anything which modifies the prompt, whether it's an extension or a script like xyz-grid prompt SR. Even without using extensions and scripts, I know some people specifically want this behavior, so it saves them from having to manually make a hires prompt just to get this effect

@w-e-w
Collaborator

w-e-w commented Aug 5, 2023

I think they made this PR so it interops with wildcard extensions. They mentioned the separate hires prompt already.

while you can use separate hires prompt to achieve this, that doesn't work if you're using something like Dynamic Prompts or Wildcards to create prompts dynamically; those extensions would need to provide some mechanism to generate a special hires prompt, and it would be pretty tedious, whereas doing it here handles anything which modifies the prompt, whether it's an extension or a script like xyz-grid prompt SR. Even without using extensions and scripts, I know some people specifically want this behavior, so it saves them from having to manually make a hires prompt just to get this effect

I didn't understand that the first time I read it; I guess it makes sense

@rubberbaron
Author

not necessary #9281 (comment)

In fact, #9281 was the discussion I mentioned in the PR that I could no longer find; I've updated the PR correspondingly, thanks.

@w-e-w
Collaborator

w-e-w commented Aug 5, 2023

not necessary #9281 (comment)

In fact, #9281 was the discussion I mentioned in the PR that I could no longer find; I've updated the PR correspondingly, thanks.

ya lol, I think the reply was literally the first thing I did when I woke
I should have eaten breakfast before sitting in front of my computer

@AUTOMATIC1111
Owner

Would a more flexible solution be to assume that hires fix's steps start at 1.0 and end at 2.0 for prompt editing?

@rubberbaron
Author

rubberbaron commented Aug 6, 2023

That definitely sounds more flexible to target from a dynamic generator, for example, and as far as I can see should absolutely address all the issues in this PR. It already will save/restore to pnginfo correctly, and doesn't need any config. And I've seen people posting issues or discussions where it seemed like this is the behavior they expected.

But it will totally change the reproducibility of hires fix images unless it's on a toggleable option. And I think it will change output for some non-hires fix prompts as well, unless I'm missing something.

The current code is:

                tree.children[-2] = float(tree.children[-2])
                if tree.children[-2] < 1:
                    tree.children[-2] *= steps
                tree.children[-2] = min(steps, int(tree.children[-2]))

I see two options for changing the <1 test:

  • "correct" way would be to check syntactically whether they're using an integer syntax or a float syntax to decide which way to interpret the input number, e.g. <1 becomes "." in tree.children[-2]. This would break reproducibility for anybody using float syntax but expecting integer behavior
  • "compatible way" would be testing <2 but special case the input 1 as being a raw step number. This would still allow you to use 1.0 distinct from 1 (which would also break reproducibility if anybody was using 1.0 as a raw step number). Arguably though if we support 1.0 it would also be nice to support 2.0, so maybe special case 1 and 2, something like float(s)<=2 and s != "1" and s != "2".

Of course of necessity both of those methods will break old prompts using say 1.5 which was interpreted as raw step number 1 the old way, and must be interpreted as 1.5 * steps the new way. (Conceptually; actually it depends on both normal and hires step counts, of course.) The two methods largely differ in interpretation of odd things like 7.5.
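
To compare the options discussed above, here is a minimal Python sketch of the current rule next to the two proposed ones. The function names are hypothetical, the token is the raw text of the number as typed in the prompt, and the two new rules are left unclamped because the upper bound would depend on both the normal and hires step counts.

```python
def current_rule(token: str, steps: int) -> int:
    """Mirrors the snippet quoted above: values below 1 are ratios."""
    value = float(token)
    if value < 1:
        value *= steps
    return min(steps, int(value))


def syntactic_rule(token: str, steps: int) -> int:
    """'Correct' option: float syntax means ratio, integer syntax means step."""
    value = float(token)
    if "." in token:
        value *= steps
    return int(value)


def compatible_rule(token: str, steps: int) -> int:
    """'Compatible' option: values up to 2 are ratios, except a literal 1 or 2."""
    value = float(token)
    if value <= 2 and token not in ("1", "2"):
        value *= steps
    return int(value)


steps = 20
for token in ("0.5", "1", "1.0", "1.5", "2", "7.5"):
    print(
        token,
        current_rule(token, steps),
        syntactic_rule(token, steps),
        compatible_rule(token, steps),
    )
```

With 20 steps, all three rules agree on 0.5 and on a bare 1; both new rules turn 1.5 into step 30 where the current rule gives step 1; and only the syntactic rule treats 7.5 as a ratio.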

@AUTOMATIC1111
Owner

The old compatibility mode will work just like before, the new will interpret <2 as ratio always and >=2 as abs number of steps.

@rubberbaron
Author

rubberbaron commented Aug 6, 2023

  • i don't know what "old compatibility mode" means, but i'll take your word for it
  • so in this system, there's no way to express 1 as a step number? that doesn't seem great from an orthogonality/simplicity perspective. for a practical example, if i randomly generate an integer prompt edit time from 0 to 5, it'd be weird if only one of the values was interpreted differently; 0,2,3,4,5 work as expected, but 1 causes the edit to happen at the final step (1 × steps) instead.
  • the fact that you can use "1.0" but not "2.0" seems weird to me, again orthogonality and consistency. (But the same is true of the old interpretation of 1.0; this is subtle, but in the old way, if you have 10 steps, the test in the at_step visitor is step <= when, and the step numbers go from 1..10 not 0..9, so you can't express 'step <= 10' in a generic way not tied to the actual step numbers if you can't use 1.0 to mean that. Of course that test would always be true, which might sound useless, but 0 is supported and is equally "useless". An example of practical use is that it would be useful when randomly generating prompt edit times to be able to express the full range of possibilities, from "never" (0.0) to "always" (1.0); but the current way, you have to limit to 0.9999, which will never allow the "empty" operation. Under your suggested system, 1.0 would now work "as expected", but 2.0 would have the problem now.)
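
The integer example in the first bullet can be sketched in a few lines of Python. threshold_rule is a hypothetical name for the "<2 is always a ratio, >=2 is an absolute step" proposal; it is an illustration of the objection, not code from the project.

```python
def threshold_rule(token: str, steps: int) -> int:
    """Values below 2 are ratios of the step count; 2 and above are raw steps."""
    value = float(token)
    if value < 2:
        return int(value * steps)
    return int(value)


steps = 20
# Randomly generated integer edit times from 0 to 5 would land on these steps.
results = {n: threshold_rule(str(n), steps) for n in range(6)}
print(results)
```

Every integer maps to itself except 1, which is read as the ratio 1.0 and so lands on step 20.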

@AUTOMATIC1111
Owner

AUTOMATIC1111 commented Aug 6, 2023

By old compatibility mode I mean a checkbox in settings.
We could also do it the way you suggested, integers as absolute step numbers and floats as ratio.

@rubberbaron
Author

rubberbaron commented Aug 6, 2023

We could also do it the way you suggested, integers as absolute step numbers and floats as ratio.

That's my preference, but it's your project, it's absolutely your choice, I'm happy to implement whatever you like.

@AUTOMATIC1111
Owner

No, no, let's do it your way, with the integers/floats distinction. The only thing is there should be a compatibility mode for reproducing old pics.

@rubberbaron
Author

Status update: I have a patch that does the "1.0 - 2.0" thing, but I want to use it in the wild for a little while first. When I make a PR for it, I will close this PR.

@SirVeggie

Can you elaborate what happens with hires fix now? if 1.0-2.0 is the hires step range, does that mean [cat:frog :0.5] swaps in the middle of the initial gen while hires stays at frog? So, to emulate the old behaviour you would need to do something like [[cat:frog :0.5]:[cat:frog :1.5]:1.0]?

@rubberbaron
Author

Can you elaborate what happens with hires fix now? if 1.0-2.0 is the hires step range, does that mean [cat:frog :0.5] swaps in the middle of the initial gen while hires stays at frog? So, to emulate the old behaviour you would need to do something like [[cat:frog :0.5]:[cat:frog :1.5]:1.0]?

Yes. There's a compatibility switch to emulate the old behavior if you need to make old images work, but if you don't want to leave it switched on, yes, you'd have to do something like that to get the same effect as before. (I'd suggested we could make it so copying the prompt into the hires prompt would have the old effect, but Auto1111 preferred that the hires fix range in the hires prompt still be 1.0-2.0, so that's no help.)
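
As a rough sketch of why the nested expression reproduces the old behavior, the following Python models a prompt edit as a function of position on a 0.0 to 2.0 timeline, where 0.0 to 1.0 is the first pass and 1.0 to 2.0 is the hires pass. The scheduled helper is hypothetical and uses the same "position <= when" test mentioned earlier in this thread; it is not the webui's implementation.

```python
from typing import Callable, Union

Part = Union[str, Callable[[float], str]]


def scheduled(before: Part, after: Part, when: float) -> Callable[[float], str]:
    """Model [before:after:when]; either side may itself be a scheduled edit."""
    def at(position: float) -> str:
        chosen = before if position <= when else after
        return chosen(position) if callable(chosen) else chosen
    return at


# [cat:frog:0.5] under the 1.0-2.0 hires range
plain = scheduled("cat", "frog", 0.5)
# [[cat:frog:0.5]:[cat:frog:1.5]:1.0]
emulated = scheduled(
    scheduled("cat", "frog", 0.5),
    scheduled("cat", "frog", 1.5),
    1.0,
)

positions = [0.25, 0.75, 1.25, 1.75]  # two samples per pass
plain_timeline = [plain(p) for p in positions]
emulated_timeline = [emulated(p) for p in positions]
print(plain_timeline)
print(emulated_timeline)
```

In this model the plain edit stays on frog for the whole hires pass, while the nested form switches back to cat at the start of hires and swaps to frog again halfway through.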

I'd be interested to know what kind of workflows or scenarios you have where you'd prefer that behavior, just so I can keep it in mind.

@SirVeggie

Initially I was a bit against it since it's a bit unintuitive behaviour, but I like it since I can use a prompt preprocessor extension to do kinda like func(cat,frog,0.5) which would also fix the issue I had.

I'd be interested to know what kind of workflows or scenarios you have where you'd prefer that behavior, just so I can keep it in mind.

This would be the case for styles/embeddings for example. If I found that [style1:style2:0.6] results in a nice combined style, I would want the hires fix to replicate it as closely as possible, and it would be the expected behaviour since hires is basically "do that again but more resolution". Doing just style2 in the hires part will more often than not remove the look of style1, especially on higher denoise.

Another thing that crossed my mind was about the disparity of percentage and steps in this pull request. Percentage is being set as 0.0-2.0, but as I understood it, setting 10 steps still works as an edit at 10 in both base and hires. It would make sense to do the same "overflow" with steps as with percentage to remain consistent: With 50 steps in base and hires, setting 10 only edits base, while 60 edits hires at step 10.

@rubberbaron
Author

rubberbaron commented Aug 10, 2023

Thanks for explaining the scenario.

And yes, it does the same overflow in the sense that, if you have 50 steps in base and 20 steps in hires, [foo:0.5] and [foo:25] both mean the same thing, and [bar:1.5] and [bar:60] both mean the same thing.
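
The equivalence above can be checked with a small Python sketch. The function names are hypothetical and the mapping is an assumption consistent with the examples in this thread (floats up to 1.0 scale the base pass, floats from 1.0 to 2.0 scale the hires pass, integers are absolute positions on the combined timeline); the actual patch may differ in details such as rounding and clamping.

```python
from typing import Tuple


def to_combined_step(token: str, base_steps: int, hires_steps: int) -> int:
    """Map a prompt-edit time to a step on the base + hires timeline."""
    if "." not in token:
        return int(token)  # integer syntax: absolute step number
    ratio = float(token)
    if ratio <= 1.0:
        return int(ratio * base_steps)
    return base_steps + int((ratio - 1.0) * hires_steps)


def locate(step: int, base_steps: int) -> Tuple[str, int]:
    """Say which pass a combined step falls in, and the step within that pass."""
    if step <= base_steps:
        return ("base", step)
    return ("hires", step - base_steps)


base_steps, hires_steps = 50, 20
for token in ("0.5", "25", "1.5", "60"):
    step = to_combined_step(token, base_steps, hires_steps)
    print(token, step, locate(step, base_steps))
```

With 50 base steps and 20 hires steps, 0.5 and 25 both resolve to base step 25, and 1.5 and 60 both resolve to hires step 10.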

@AUTOMATIC1111
Owner

It would make sense to make it available as a PR for people who want to test it out and tell you whether it works or not.

@rubberbaron
Author

rubberbaron commented Aug 10, 2023

It would make sense to make it available as PR for people who want to test it out and tell you about if it works or not.

Ok! Superseded by #12457.
