Protect alphas_cumprod during refiner switchover #14979

drhead · 2024-02-21T00:32:49Z

Description

There is currently a bug, mentioned in "Make refiner switchover based on model timesteps instead of sampling steps" (#14978): when the refiner switches on, the first step it performs uses the original alphas_cumprod schedule, which causes problems if zero terminal SNR is enabled.
This PR stores the model's alphas_cumprod and re-applies it when the refiner switchover happens. This fixes outputs in the rare cases where the change in noise schedule is large enough to move the called timestep outside the range the refiner was trained on.
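
The store-and-re-apply idea can be sketched as below. This is an illustration under assumptions, not the diff in this PR: `reload_weights` and `switch_to_refiner` are hypothetical stand-ins for the real reload and switchover calls, the schedule values are toy numbers, and the model is a plain object holding a list rather than a tensor.

```python
from types import SimpleNamespace

DEFAULT_SCHEDULE = [0.99, 0.8, 0.5, 0.2, 0.05]  # toy values for illustration


def reload_weights(model):
    """Hypothetical reload: the checkpoint brings its own default schedule."""
    model.alphas_cumprod = list(DEFAULT_SCHEDULE)


def switch_to_refiner(model):
    """Keep the runtime override alive across the weight reload."""
    saved = model.alphas_cumprod      # schedule currently in effect
    reload_weights(model)             # would otherwise reset the schedule
    model.alphas_cumprod = saved      # re-apply before the next model call


# A model whose schedule was overridden at runtime (toy numbers again).
model = SimpleNamespace(alphas_cumprod=[0.99, 0.78, 0.46, 0.15, 4.9e-08])
overridden = list(model.alphas_cumprod)

switch_to_refiner(model)
print(model.alphas_cumprod == overridden)
```

Without the save and restore in `switch_to_refiner`, the model would come back from the reload carrying `DEFAULT_SCHEDULE` for the first refiner step.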

Screenshots/videos:

Before fix, image generated on DPM++ 2M, overridden with Karras schedule and sigma_max of 1500, 50 steps:

This specific schedule causes one of the sampling steps to be changed from timestep 190 to timestep 200. A typical refiner is trained only up to timestep 199 (the last 200 timesteps of denoising, zero-indexed), so timestep 200 is out of range for the model and causes extra noise in the output.

After the fix is applied:

The resulting image looks much cleaner (particularly the background at the top left).

Checklist:

I have read contributing wiki page
I have performed a self-review of my own code
My code follows the style guidelines
My code passes tests

AUTOMATIC1111 · 2024-02-26T04:07:10Z

reload_model_weights is used in many places, and if there are issues with it working, those should be solved in it, rather than outside of it.

drhead · 2024-02-26T04:50:10Z

I'm not sure this should really be considered an issue with reload_model_weights as much as it is an issue with how the main processing loop handles alphas_cumprod -- as a temporary state reflecting the current settings, for the current sampling step, which will be updated again on the next step. This isn't really applicable to any other context where reload_model_weights is called (three of which are initialization related, and one of which is xyz plot related), and intuitively I would prefer avoiding doing something that could cause problems in the areas where this doesn't apply.

I could look into moving it into the reload function, but I think that would have more potential to cause issues in other parts of the code, and I would need to test it further to verify that it doesn't. In the meantime, I've removed protection of alphas_cumprod_original since that isn't necessary to fix the bug.

AUTOMATIC1111 · 2024-02-26T05:42:08Z

I just don't get this or why it helps. How does alphas_cumprod end up incorrect after sd_models.reload_model_weights?

Edit: I guess it's because of this code in processing.py:

                if opts.sd_noise_schedule == "Zero Terminal SNR":
                    p.extra_generation_params['Noise Schedule'] = opts.sd_noise_schedule
                    p.sd_model.alphas_cumprod = rescale_zero_terminal_snr_abar(p.sd_model.alphas_cumprod).to(shared.device)

But then there should be the same problem when switching models during hires fix.

drhead · 2024-02-26T06:02:01Z

It is related to the implementation of zero terminal SNR and the compatibility fix that came with it. Doing either of those things requires changing the model alphas_cumprod to reflect those changes, and how it is implemented currently is that this is done at runtime, before each sampling step, as an override.

When the model weights are reloaded, this override is undone, since the newly loaded weights come with their own alphas_cumprod. In almost every case that value matches the default on the other model, but it differs from what was in effect if we overrode the values to something other than what is in the model. The result is that the wrong timestep is called for the sigma used on that sampling step. In my example, a sigma of 0.5725 corresponds to timestep 190 under a zero terminal SNR schedule but to timestep 200 on the default schedule. That is a problem: not only is the schedule now wrong, the refiner is also sampling from a step it was never meant to sample from. The alphas_cumprod override is reapplied on the next step, but by then there will already be artifacts in the final image from the bad refiner step. The logical solution, as I see it, is to maintain the override through the model switch.
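
The timestep shift described above can be reproduced in a small, self-contained sketch. It makes several assumptions not stated in this thread: the default schedule is taken to be a "scaled linear" beta schedule (0.00085 to 0.012 over 1000 steps), sigma is computed as sqrt((1 - abar) / abar), and the lookup is a simple first-index-at-or-above search rather than whatever interpolation the real samplers use. The helper names are hypothetical, and plain Python lists stand in for torch tensors.

```python
import bisect
import math


def default_alphas_cumprod(steps=1000, beta_start=0.00085, beta_end=0.012):
    """Assumed 'scaled linear' schedule: betas are linear in sqrt space."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    abar, out = 1.0, []
    for i in range(steps):
        beta = (lo + (hi - lo) * i / (steps - 1)) ** 2
        abar *= 1.0 - beta
        out.append(abar)
    return out


def rescale_zero_terminal_snr(alphas_cumprod):
    """List-based version of the rescale quoted in this thread."""
    sqrt_abar = [math.sqrt(a) for a in alphas_cumprod]
    first, last = sqrt_abar[0], sqrt_abar[-1]
    scale = first / (first - last)
    out = [((s - last) * scale) ** 2 for s in sqrt_abar]
    out[-1] = 4.8973451890853435e-08  # tiny non-zero value keeps sigma finite
    return out


def sigmas_from(alphas_cumprod):
    """Noise level per timestep, increasing with the timestep index."""
    return [math.sqrt((1.0 - a) / a) for a in alphas_cumprod]


def timestep_for_sigma(sigma, sigmas):
    """First timestep whose sigma is at or above the requested sigma."""
    return bisect.bisect_left(sigmas, sigma)


default_abar = default_alphas_cumprod()
override_abar = rescale_zero_terminal_snr(default_abar)

sigma = 0.5725  # the value from the example in this thread
t_override = timestep_for_sigma(sigma, sigmas_from(override_abar))
t_default = timestep_for_sigma(sigma, sigmas_from(default_abar))
print("with override:", t_override, "| after reload:", t_default)
```

Under these assumptions the overridden schedule maps the same sigma to an earlier timestep than the default schedule does, which matches the direction of the shift described above. The exact indices depend on the lookup method and on the schedule constants.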

I'll look into whether hiresfix has the same issue, though I would expect it to be far more benign there in any case.

AUTOMATIC1111 · 2024-02-26T06:05:21Z

If I'm understanding this right, the proper fix, I think, would be to take this code from processing:

            def rescale_zero_terminal_snr_abar(alphas_cumprod):
                alphas_bar_sqrt = alphas_cumprod.sqrt()

                # Store old values.
                alphas_bar_sqrt_0 = alphas_bar_sqrt[0].clone()
                alphas_bar_sqrt_T = alphas_bar_sqrt[-1].clone()

                # Shift so the last timestep is zero.
                alphas_bar_sqrt -= (alphas_bar_sqrt_T)

                # Scale so the first timestep is back to the old value.
                alphas_bar_sqrt *= alphas_bar_sqrt_0 / (alphas_bar_sqrt_0 - alphas_bar_sqrt_T)

                # Square to convert alphas_bar_sqrt back to alphas_bar (not betas)
                alphas_bar = alphas_bar_sqrt**2
                alphas_bar[-1] = 4.8973451890853435e-08  # tiny non-zero terminal value
                return alphas_bar

            if hasattr(p.sd_model, 'alphas_cumprod') and hasattr(p.sd_model, 'alphas_cumprod_original'):
                p.sd_model.alphas_cumprod = p.sd_model.alphas_cumprod_original.to(shared.device)

                if opts.use_downcasted_alpha_bar:
                    p.extra_generation_params['Downcast alphas_cumprod'] = opts.use_downcasted_alpha_bar
                    p.sd_model.alphas_cumprod = p.sd_model.alphas_cumprod.half().to(shared.device)
                if opts.sd_noise_schedule == "Zero Terminal SNR":
                    p.extra_generation_params['Noise Schedule'] = opts.sd_noise_schedule
                    p.sd_model.alphas_cumprod = rescale_zero_terminal_snr_abar(p.sd_model.alphas_cumprod).to(shared.device)

And put it in a separate function, then call that function both in processing (where it was originally) and in sd_models.load_model_weights.
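
A minimal sketch of that refactor follows, written against plain Python lists and `SimpleNamespace` stand-ins so it runs without the web UI or torch. The function name `apply_alpha_schedule_override` comes from the follow-up commit titles on this PR; its signature, the stand-in `opts` and model objects, and the `fake_load_model_weights` helper are assumptions for illustration, not the merged code. The half-precision downcast option is omitted for brevity.

```python
import math
from types import SimpleNamespace


def rescale_zero_terminal_snr_abar(alphas_cumprod):
    """List-based stand-in for the tensor version quoted above."""
    sqrt_abar = [math.sqrt(a) for a in alphas_cumprod]
    first, last = sqrt_abar[0], sqrt_abar[-1]
    scale = first / (first - last)
    out = [((s - last) * scale) ** 2 for s in sqrt_abar]
    out[-1] = 4.8973451890853435e-08
    return out


def apply_alpha_schedule_override(sd_model, opts, p=None):
    """Rebuild alphas_cumprod from the pristine copy, then apply overrides."""
    if not (hasattr(sd_model, "alphas_cumprod")
            and hasattr(sd_model, "alphas_cumprod_original")):
        return
    sd_model.alphas_cumprod = list(sd_model.alphas_cumprod_original)
    if opts.sd_noise_schedule == "Zero Terminal SNR":
        if p is not None:
            p.extra_generation_params["Noise Schedule"] = opts.sd_noise_schedule
        sd_model.alphas_cumprod = rescale_zero_terminal_snr_abar(
            sd_model.alphas_cumprod)


def fake_load_model_weights(opts):
    """Hypothetical loader: a fresh model arrives with the default schedule."""
    schedule = [0.99, 0.8, 0.5, 0.2, 0.05]  # toy values, not a real schedule
    model = SimpleNamespace(alphas_cumprod=list(schedule),
                            alphas_cumprod_original=list(schedule))
    apply_alpha_schedule_override(model, opts)  # second call site: the loader
    return model


opts = SimpleNamespace(sd_noise_schedule="Zero Terminal SNR")
p = SimpleNamespace(extra_generation_params={})

base = fake_load_model_weights(opts)
apply_alpha_schedule_override(base, opts, p)  # first call site: processing
refiner = fake_load_model_weights(opts)       # switchover keeps the override
print(refiner.alphas_cumprod == base.alphas_cumprod)
```

Because the loader applies the same override as the processing loop, a model loaded mid-generation starts its first step with the schedule already in effect, rather than waiting for the next step to reapply it.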

drhead · 2024-02-27T04:56:11Z

I'm not entirely sure where the best place for the alpha override function is. I had to relocate it to sd_models.py to avoid circular imports, but it can easily be moved to any other location that doesn't cause circular import errors if needed. The fix works as it stands.

Protect alphas_cumprod during refiner switchover

9c1ece8

drhead requested a review from AUTOMATIC1111 as a code owner February 21, 2024 00:32

drhead closed this Feb 21, 2024

drhead reopened this Feb 21, 2024

drhead changed the base branch from master to dev February 21, 2024 00:37

dont need to preserve alphas_cumprod_original

648f6a8

drhead added 3 commits February 26, 2024 23:43

move refiner fix to sd_models.py

e2cd92e

move alphas cumprod override out of processing

94f23d0

remove alphas cumprod fix from samplers_common

4dae91a

AUTOMATIC1111 approved these changes Mar 2, 2024

AUTOMATIC1111 merged commit 06b9200 into AUTOMATIC1111:dev Mar 2, 2024
3 checks passed

AUTOMATIC1111 added a commit that referenced this pull request Mar 2, 2024

call apply_alpha_schedule_override in load_model_weights for #14979

1a51b16

AUTOMATIC1111 added a commit that referenced this pull request Mar 2, 2024

style changes for #14979

ee470cc

AUTOMATIC1111 added a commit that referenced this pull request Mar 2, 2024

Merge pull request #14979 from drhead/refiner_cumprod_fix

28bc85a

Protect alphas_cumprod during refiner switchover

AUTOMATIC1111 added a commit that referenced this pull request Mar 2, 2024

call apply_alpha_schedule_override in load_model_weights for #14979

da67afe

AUTOMATIC1111 added a commit that referenced this pull request Mar 2, 2024

style changes for #14979

141a17e

ruchej pushed a commit to ruchej/stable-diffusion-webui that referenced this pull request Sep 30, 2024

Merge pull request AUTOMATIC1111#14979 from drhead/refiner_cumprod_fix

3125944

Protect alphas_cumprod during refiner switchover

ruchej pushed a commit to ruchej/stable-diffusion-webui that referenced this pull request Sep 30, 2024

call apply_alpha_schedule_override in load_model_weights for AUTOMATIC1111#14979

17becdd

ruchej pushed a commit to ruchej/stable-diffusion-webui that referenced this pull request Sep 30, 2024

style changes for AUTOMATIC1111#14979

44426fb

ruchej pushed a commit to ruchej/stable-diffusion-webui that referenced this pull request Sep 30, 2024

call apply_alpha_schedule_override in load_model_weights for AUTOMATIC1111#14979

fe085fb

ruchej pushed a commit to ruchej/stable-diffusion-webui that referenced this pull request Sep 30, 2024

style changes for AUTOMATIC1111#14979

a17fb61
