
Use Spandrel for upscaling and face restoration architectures #14425

Merged (7 commits, Dec 30, 2023)

Conversation

akx (Collaborator) commented Dec 25, 2023

Description

This PR yeets most of the copy-pasted or otherwise vendored model architectures in favor of just using Spandrel.

  • Converted models are:

    • CodeFormer
    • ESRGAN
    • GFPGAN
    • RealESRGAN
    • ScuNET
    • SwinIR
  • Not converted is LDSR; it doesn't exist in Spandrel.

  • There's still some more cleanup that could be done – there are multiple implementations of tiled inference right now, for one, and the model loading/downloading/... code is kind of a mess (should continue where I left off with Upscaler model loading cleanup #10823), but I'll hold off on that for this PR.

  • As an added bonus, this adds (experimental, works-on-my-machine) support for HAT upscaling models.

Screenshots/videos:

No visual changes. This seems to Work On My Machine but it'd be lovely if someone else tried this out too.


akx force-pushed the spandrel branch 2 times, most recently from b1a61e9 to e61c70b (December 25, 2023 13:45)
akx force-pushed the spandrel branch 5 times, most recently from ecee8df to 8bfe7cf (December 25, 2023 21:56)
akx marked this pull request as ready for review December 26, 2023 23:45
@@ -185,8 +180,7 @@ def on_ui_settings():

shared.opts.add_option("SWIN_tile", shared.OptionInfo(192, "Tile size for all SwinIR.", gr.Slider, {"minimum": 16, "maximum": 512, "step": 16}, section=('upscaling', "Upscaling")))
shared.opts.add_option("SWIN_tile_overlap", shared.OptionInfo(8, "Tile overlap, in pixels for SwinIR. Low values = visible seam.", gr.Slider, {"minimum": 0, "maximum": 48, "step": 1}, section=('upscaling', "Upscaling")))
if int(torch.__version__.split('.')[0]) >= 2 and platform.system() != "Windows": # torch.compile() require pytorch 2.0 or above, and not on Windows
akx (Collaborator, Author):

We're always torch >= 2.0, and we now just try compile without checking the platform.
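The "just try compile" approach can be sketched as a best-effort wrapper (an illustrative sketch, not the actual webui code; the fallback behavior here is an assumption):

```python
def maybe_compile(model):
    """Attempt torch.compile unconditionally and fall back to the
    uncompiled model if compilation isn't available (sketch only)."""
    try:
        import torch
        return torch.compile(model)
    except Exception:
        # e.g. torch missing a compile backend on this platform;
        # keep the eager model instead of failing hard
        return model
```

Compared with the old version check, this also covers platforms where PyTorch 2.x is installed but compilation support is incomplete.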

Comment on lines 54 to 55
env:
IGNORE_CMD_ARGS_ERRORS: "1"
akx (Collaborator, Author):

The new test will fail without this since a module attempts to read the pytest options as regular webui arguments.

img,
tile_size=opts.ESRGAN_tile,
tile_overlap=opts.ESRGAN_tile_overlap,
# TODO: `outscale`?
akx (Collaborator, Author):

Do we need to downscale too-large images here according to info.scale? AIUI, there might be some other process that also does that?

return Image.fromarray(output, 'RGB')


def upscale_with_model(model, img: Image.Image, *, tile_size: int, tile_overlap: int = 0):
akx (Collaborator, Author):

This is used by both ESRGAN and RealESRGAN.

Comment on lines +131 to +141
def inference(
img,
model,
*,
tile: int,
tile_overlap: int,
window_size: int,
scale: int,
device,
):
akx (Collaborator, Author):

This smells like tiled_upscale, but with tile overlap handled by weight scaling.
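The weight-scaling blend being described can be sketched in 1-D with numpy (an illustrative toy, not the actual SwinIR inference code): per-tile outputs and per-sample weights are accumulated separately, and dividing by the accumulated weights blends the overlapping regions.

```python
import numpy as np

def tiled_apply_1d(x, fn, tile, overlap):
    """Apply `fn` to overlapping tiles of a 1-D array, blending overlaps
    by accumulating outputs and per-sample weights (assumes overlap < tile)."""
    out = np.zeros_like(x, dtype=float)
    weights = np.zeros_like(x, dtype=float)
    step = tile - overlap
    for start in range(0, len(x), step):
        end = min(start + tile, len(x))
        out[start:end] += fn(x[start:end])   # accumulate tile output
        weights[start:end] += 1.0            # count coverage per sample
        if end == len(x):
            break
    return out / weights                     # overlapped samples are averaged
```

With an identity `fn`, the result equals the input exactly, which is the property that hides the seams between tiles.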

@akx akx force-pushed the spandrel branch 2 times, most recently from ce58f5d to 37458e6 Compare December 27, 2023 09:52
@akx akx marked this pull request as ready for review December 27, 2023 10:12
gel-crabs (Contributor):

Oh yeah, I've been using this PR for a couple days now; it works.

akx (Collaborator, Author) commented Dec 29, 2023

@gel-crabs Thanks for trying it out! I (force-)pushed this branch to update spandrel to a newer version, as well as add experimental support for HAT upscalers, if you want to try that out. (You'll need to bring your own models and put them in models/HAT/.)

gel-crabs (Contributor) commented Dec 29, 2023

> @gel-crabs Thanks for trying it out! I (force-)pushed this branch to update spandrel to a newer version, as well as add experimental support for HAT upscalers, if you want to try that out. (You'll need to bring your own models and put them in models/HAT/.)

It works! Admittedly it has issues with deepcache where it adds black splotches to the image during hires fix, but otherwise working.

I tried to hack in support for DAT as well by copying hat_model.py and replacing HAT with DAT, but it just made the image go full black.

Edit: It actually has nothing to do with deepcache, or any extensions at all. I'm going to try testing with different models.

I tried with a different 4x HAT upscaler and it gives full black images, so the HAT support doesn't seem to be working correctly.

AUTOMATIC1111 (Owner):

I'm generally not pumped about adding new dependencies, but this removes a lot of code we just copy pasted, so that seems nice.

Some questions:

  • what's with __init__.py?
  • what's with commented code in webui.py?
  • for tests, on the new machine (which is always the case for github servers), it looks to me like it will download the model. Maybe those tests could be disabled by default? Also since you're not actually checking any changes in faces, we could reuse the existing img2img_basic.png instead of adding a new pic.
  • what happens when you put a checkpoint in a wrong dir? Say, ESRGAN checkpoint into swinir dir. Or a codeformer model into ESRGAN dir?
  • did you test all models you converted to use spandrel?

akx (Collaborator, Author) commented Dec 30, 2023

> I'm generally not pumped about adding new dependencies, but this removes a lot of code we just copy pasted, so that seems nice.

I think this actually leads to fewer dependencies in total (I'll run the numbers later). The Spandrel folks seem nice and responsive too. :)

>   • what's with __init__.py?

Autogenerated by PyCharm when refactoring code. Will yeet, my bad.

>   • what's with commented code in webui.py?

Also accidentally added to this PR (since I was tired of having a gazillion WebUI tabs get auto-opened), my bad. Will yeet.

>   • for tests, on the new machine (which is always the case for github servers), it looks to me like it will download the model.

I can also add an actions/cache action so we cache the models/ directory (like Spandrel's tests do).

> Also since you're not actually checking any changes in faces, we could reuse the existing img2img_basic.png instead of adding a new pic.

Since we use facexlib to detect faces and only act on the face patches, using an image that doesn't have any faces would not exercise the code that actually runs the Spandrel model 😁

I'll add a simple "output image was different" check!

>   • what happens when you put a checkpoint in a wrong dir? Say, ESRGAN checkpoint into swinir dir. Or a codeformer model into ESRGAN dir?

Good question - since Spandrel auto-detects the model arch from the checkpoint, it'd happily load it, and maybe fail with a parameter error down the line when we try to call the architecture with kwargs it doesn't accept. I can add isinstance checks to verify we loaded the correct kind of model (and warn and fail if not) instead of just blindly forging ahead.
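The guard described here could look roughly like this (the architecture classes below are placeholders for illustration, not Spandrel's actual API):

```python
class ArchMismatchError(Exception):
    """Raised when a checkpoint's detected architecture doesn't match
    the directory it was placed in."""

class ESRGANModel: ...   # placeholder standing in for a detected ESRGAN network
class SwinIRModel: ...   # placeholder standing in for a detected SwinIR network

def check_architecture(model, expected_cls, path):
    """Fail fast with a clear message instead of hitting a confusing
    kwargs error deep inside the architecture later."""
    if not isinstance(model, expected_cls):
        raise ArchMismatchError(
            f"{path}: expected {expected_cls.__name__}, "
            f"got {type(model).__name__}"
        )
    return model
```

Each upscaler would call this with the class its directory implies right after loading, so a SwinIR checkpoint dropped into the ESRGAN folder is rejected up front.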

>   • did you test all models you converted to use spandrel?

I did, on my machine (Macbook).

@AUTOMATIC1111 AUTOMATIC1111 merged commit cd12c0e into AUTOMATIC1111:dev Dec 30, 2023
2 of 3 checks passed
@akx akx deleted the spandrel branch December 30, 2023 15:08
wcde commented Jan 3, 2024

Looks like SwinIR x2 is not working now. I get this with every model:

File "...\modules\images.py", line 286, in resize_image
  res = resize(im, width, height)
File "...\modules\images.py", line 278, in resize
  im = upscaler.scaler.upscale(im, scale, upscaler.data_path)
File "...\modules\upscaler.py", line 65, in upscale
  img = self.do_upscale(img, selected_model)
File "...\extensions-builtin\SwinIR\scripts\swinir_model.py", line 48, in do_upscale
  img = upscaler_utils.upscale_2(
File "...\modules\upscaler_utils.py", line 181, in upscale_2
  output = tiled_upscale_2(
File "...\modules\upscaler_utils.py", line 149, in tiled_upscale_2
  ].add_(out_patch)
RuntimeError: The size of tensor a (2560) must match the size of tensor b (1280) at non-singleton dimension 3
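A toy numpy reproduction of the assumed failure mode (the mechanics here are an inference from the traceback, not the actual webui code): the output buffer is allocated using a hard-coded scale of 4, while a 2x model returns half-size patches, matching the 2560-vs-1280 mismatch above.

```python
import numpy as np

hardcoded_scale, model_scale, width = 4, 2, 640
out_row = np.zeros(width * hardcoded_scale)   # buffer row: 2560 samples
out_patch = np.ones(width * model_scale)      # 2x model output: 1280 samples

try:
    out_row += out_patch   # analogous to `].add_(out_patch)` in tiled_upscale_2
except ValueError as exc:
    # numpy's equivalent of torch's size-mismatch RuntimeError
    print("size mismatch:", exc)
```

Sizing the buffer from the model's actual scale rather than a constant removes the mismatch.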

akx (Collaborator, Author) commented Jan 3, 2024

@wcde Thanks, I'll take a peek – what's your SwinIR tile size and overlap setting, and the size of the image you're trying to upscale?

wcde commented Jan 3, 2024

The code hard-codes the scale to 4. It should be something like this:

img = upscaler_utils.upscale_2(
    img,
    model,
    tile_size=shared.opts.SWIN_tile,
    tile_overlap=shared.opts.SWIN_tile_overlap,
    scale=model.scale,
    desc="SwinIR",
)

Second problem: the model is loaded with dtype devices.dtype, but in upscale_2 the input is cast to fp32:

tensor = pil_image_to_torch_bgr(img).float()

Which gives:

RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same
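A minimal sketch of the dtype fix, with numpy standing in for torch (the assumption, based on the error above, is that the fix casts the input to the model weights' dtype instead of unconditionally to float32):

```python
import numpy as np

def cast_to_model_dtype(img_array, weight_array):
    """Match the input's dtype to the model's weights so a half-precision
    model never receives a float32 input (numpy stand-in for the torch fix)."""
    return img_array.astype(weight_array.dtype, copy=False)

img = np.ones((4, 4, 3), dtype=np.float32)   # input image, converted to float32
half_weight = np.zeros(8, dtype=np.float16)  # model loaded in half precision
out = cast_to_model_dtype(img, half_weight)  # now fp16, matching the model
```

The same idea in torch would read the dtype from a model parameter and cast the input tensor to it before inference.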

akx (Collaborator, Author) commented Jan 3, 2024

@wcde In fairness, scale has always been hard-coded to 4 unless I overlooked something.

I'll take a look at the half issue, thanks for pointing it out.

light-and-ray (Contributor):

I guess this will break a lot of extensions after updating. Maybe it should be mentioned in the changelog?
