Real-ESRGAN GUI


A cross-platform GUI for image upscaler Real-ESRGAN with additional features. Inspired by waifu2x-caffe.

README translations

Introduction

This application uses Real-ESRGAN's portable executable (Real-ESRGAN-ncnn-vulkan) to upscale images with very high quality. It is written in Python and provides a user-friendly GUI built with Tkinter.

Quick Start:

  • Windows 10+ Download the latest realesrgan-gui-windows-bundled-v*.7z from Release, extract the archive then launch realesrgan-gui.exe.
  • Ubuntu 22.04+ Download the latest realesrgan-gui-ubuntu-bundled-v*.tar.xz from Release, extract the archive then launch realesrgan-gui.
  • macOS Monterey+ Download the latest realesrgan-gui-macos-appbundle-v*.tar.xz from Release, extract the archive, run chmod u+x "Real-ESRGAN GUI.app/Contents/MacOS/realesrgan-gui", chmod u+x "Real-ESRGAN GUI.app/Contents/MacOS/realesrgan-ncnn-vulkan" and xattr -cr "Real-ESRGAN GUI.app" in a terminal, then launch Real-ESRGAN GUI.

Tip

Real-ESRGAN-ncnn-vulkan has not been updated since April 2022. You can use the actively maintained fork upscayl/upscayl-ncnn instead.

Download its latest release and extract upscayl-bin[.exe] to the directory where Real-ESRGAN GUI's executable file is located. If present, it is used in preference to Real-ESRGAN-ncnn-vulkan.
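
The priority rule above can be sketched as a simple lookup. This is an illustrative assumption about how such a lookup might work, not the application's actual code; the executable names come from this README, while `find_backend` and `BACKEND_NAMES` are hypothetical.

```python
from pathlib import Path

# Candidate executable names in priority order. The names come from this
# README; the lookup logic itself is an illustrative assumption.
BACKEND_NAMES = ("upscayl-bin", "realesrgan-ncnn-vulkan")


def find_backend(app_dir):
    """Return the first backend executable found in app_dir, or None."""
    for name in BACKEND_NAMES:
        for suffix in ("", ".exe"):
            candidate = Path(app_dir) / (name + suffix)
            if candidate.is_file():
                return candidate
    return None
```

With both executables present in the same directory, this sketch returns the upscayl-bin path first.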

Notes
  • Real-ESRGAN-ncnn-vulkan's executable file and models are not included in realesrgan-gui-windows.7z and realesrgan-gui-ubuntu.tar.xz. You have to download them manually from here and extract them to the directory where Real-ESRGAN GUI's executable file is located.
  • The artifacts in GitHub Actions are built from the latest commits. They do not include Real-ESRGAN-ncnn-vulkan's executable file and models either.
  • Use Python 3.10 or above if you want to run Real-ESRGAN GUI from source. Don't forget to install the dependencies with pip install -r requirements.txt and extract Real-ESRGAN-ncnn-vulkan into the repository directory before running python main.py.
  • It may be possible to run Real-ESRGAN GUI on other Linux distributions, but I have not tested it.

Please check out CONTRIBUTING.md if you would like to contribute to Real-ESRGAN GUI.

Build Real-ESRGAN GUI.app for Apple Silicon (arm64)

In testing, arm64 builds performed better than universal2 builds. If you are using Apple Silicon, it is recommended that you make an arm64 build yourself.

# 1. Clone this repository.
git clone https://github.com/TransparentLC/realesrgan-gui.git
cd realesrgan-gui

# 2. Run the shell script to start building.
# The latest commit of this project requires tk 8.6, while Python 3.10 bundles tk 8.5,
# so local packaging must be done in a Python 3.11 environment.
# Before packaging, run python3 -V in the terminal to confirm that the version is 3.11.
# A password is required for "sudo pyinstaller realesrgan-gui-macOS-arm64.spec".
sh Build-macOS-arm64.sh

# 3. The built application is saved in "./realesrgan-gui/dist/Real-ESRGAN GUI.app".

Warning

Since I don't have any device running macOS, I may not be able to handle macOS-related issues.

Related projects

Features

In addition to the features supported by Real-ESRGAN-ncnn-vulkan, Real-ESRGAN GUI also supports these additional features:

  • Upscale to arbitrary size
    • Real-ESRGAN-ncnn-vulkan can only upscale the input image at a fixed 2-4x ratio (depending on the model chosen).
    • Real-ESRGAN GUI runs Real-ESRGAN-ncnn-vulkan on the input image multiple times, then downsamples the output image to the desired size with a general image scaling algorithm.
    • For example, to upscale a 640x360 image to 1600 in width with a 2x model, it will be upscaled twice to 1280x720 and 2560x1440 then downsampled to 1600x900.
    • Lanczos is used by default to downsample the image. Other algorithms are also available.
  • Upscale GIF images
    • Splits an animated GIF into frames and reads their durations, upscales the frames one by one, then merges them into an upscaled animated GIF.
  • Drag and drop support
    • Drag and drop image files or directories onto the GUI and the input and output path will be set automatically.
    • The output path will contain a suffix like x4, w1280, h1080 based on the chosen resize mode.
  • Dark theme
    • Switches between the light and dark theme according to system settings.
    • The detection is done using darkdetect.
    • Not available on macOS?
  • Multi-language support
    • Simplified Chinese, Traditional Chinese, and English are currently supported.
    • Uses locale.getdefaultlocale for language detection.
    • Falls back to English if translated text is missing.
    • You can add or improve translations by editing i18n.ini. Contributions are very welcome!
      • After adding your language to i18n.ini, run the generate_locales_map.py file and copy the locales_map variable from the output until the end. Then, replace the variable in i18n.py with the one you copied. If you encounter any issues running generate_locales_map.py, try running pip install -r requirements.txt in the command line to install the necessary dependencies, and then try again.
      • If you don't want to deal with generate_locales_map.py, you can directly add your language code and the visible name of your language to the locales_map variable in the i18n.py file.
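
The "upscale to arbitrary size" strategy from the feature list above can be sketched as follows. This is a minimal sketch, not the application's actual code: `plan_upscale` is a hypothetical helper, and running at least one model pass even when the target is smaller than the input is an assumption.

```python
def plan_upscale(width, height, target_width, model_scale):
    """Plan the upscaling passes needed to reach target_width.

    Returns (intermediate_sizes, final_size): the size after each model pass
    and the size after the final downsample.
    """
    if model_scale < 2:
        raise ValueError("model_scale must be at least 2")
    sizes = []
    w, h = width, height
    # Assumption: always run at least one pass, then repeat until wide enough.
    while not sizes or w < target_width:
        w, h = w * model_scale, h * model_scale
        sizes.append((w, h))
    # Preserve the original aspect ratio when downsampling.
    final = (target_width, round(height * target_width / width))
    return sizes, final


sizes, final = plan_upscale(640, 360, 1600, 2)
print(sizes, final)  # [(1280, 720), (2560, 1440)] (1600, 900)
```

This reproduces the 640x360 example from the list: two 2x passes followed by a downsample to 1600x900.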

Samples

[Comparison images, columns: Nearest Neighbor | Lanczos | waifu2x-caffe | Real-ESRGAN]
[Comparison images, columns: Nearest Neighbor | Real-ESRGAN]
  • waifu2x-caffe samples are upscaled using UpResNet10 and UpPhoto models with noise reduction level 3 and TTA enabled.
  • Real-ESRGAN samples are upscaled using realesrgan-x4plus-anime and realesrgan-x4plus models with TTA enabled.
  • The original images are upscaled to 4x.
  • The displayed GIFs are lossy compressed to reduce the file size.

Frequently asked questions

Which model should I choose?

I recommend realesrgan-x4plus for real-life photos and realesrgan-x4plus-anime for anime images.

When a model comes in versions with different upscale ratios, it is recommended to choose the version whose ratio is equal to or greater than the ratio by which you want to enlarge the image. For example, if a model has x2 and x4 versions and you want to upscale an image by 3x, you should choose the x4 version.
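
The selection rule can be written as a small helper. `pick_model_scale` is a hypothetical name, and falling back to the largest version when the desired ratio exceeds every available one is an assumption.

```python
def pick_model_scale(available_scales, desired_ratio):
    """Pick the smallest model scale >= desired_ratio, else the largest."""
    candidates = [s for s in sorted(available_scales) if s >= desired_ratio]
    # Assumption: if no version is large enough, use the largest one.
    return candidates[0] if candidates else max(available_scales)


print(pick_model_scale([2, 4], 3))  # 4
```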

Models with animevideo in the filename are designed for anime videos. These models are small and process images faster (about 1.5-3x compared to realesrgan-x4plus-anime). However, there are no plans to add video-processing features to Real-ESRGAN GUI.

You can download additional models from here and place the bin and param files in the models directory to install them. These models may produce better (or worse) results than the official models for some images, especially for real-life photos.

The usage of tile size

This corresponds to Real-ESRGAN-ncnn-vulkan's -t tile-size parameter. You can choose "auto" in most cases, or use a larger value if you have enough VRAM. A larger tile size can slightly increase processing speed and the quality of the upscaled image, although the difference may not be obvious.

You can compare two images upscaled to 4x with tile sizes 32 and 256 from the 256x256 test image that comes with Real-ESRGAN-ncnn-vulkan.

See #32 (in Chinese) for more details on this.
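
As a rough illustration of how the tile size reaches the backend, the sketch below builds an argument list for Real-ESRGAN-ncnn-vulkan. Only -t is confirmed by this README; the other flags (-i, -o, -n, -s, -x) and the use of 0 for "auto" follow the backend's usage text as I understand it, so check them against the executable's -h output. `build_command` is a hypothetical helper.

```python
def build_command(executable, input_path, output_path, model_name,
                  scale, tile_size=0, tta=False):
    """Build an argument list for Real-ESRGAN-ncnn-vulkan.

    tile_size=0 is assumed to let the backend pick the tile size ("auto").
    """
    cmd = [
        executable,
        "-i", input_path,
        "-o", output_path,
        "-n", model_name,
        "-s", str(scale),
        "-t", str(tile_size),
    ]
    if tta:
        cmd.append("-x")  # assumed flag for enabling TTA mode
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; using a list instead of a single string avoids shell quoting problems with paths that contain spaces.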

The usage of TTA mode

TTA mode slightly improves the quality of the upscaled image, but the effect is very small. Processing becomes extremely slow when TTA mode is enabled, so enabling it is not recommended.

I downloaded some anime images larger than 1200px and ran an experiment: downsample each image to 1/4 of its size, upscale it with the realesrgan-x4plus-anime model, and measure the upscaling quality by the SSIM against the original image. The SSIM with TTA enabled was only about 0.002 higher than with TTA disabled, a difference that is hard to see with the naked eye.
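
For readers unfamiliar with SSIM, here is a simplified sketch of the metric. The README does not say which SSIM implementation was used in the experiment; this version computes a single global window over grayscale pixel values, whereas standard SSIM averages the same formula over many local windows. `global_ssim` is a hypothetical name.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM between two equal-length grayscale pixel sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Stabilizing constants from the standard SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mean_x * mean_y + c1) * (2 * cov + c2)) / (
        (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    )
```

Identical inputs score 1.0 (up to floating-point rounding), and the score drops as the images diverge, so a gap of about 0.002 indicates two nearly indistinguishable results.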

What is "additional processing for GIF with transparency"?

GIF supports a palette of at most 256 RGB colors, one of which can optionally be marked as transparent. This means each pixel is either fully opaque or fully transparent; there is no translucency. For GIFs with transparent parts, this raises two problems.

  • The Alpha channel of the image has only two values, 0 and 255, and can be represented by an image with only black and white colors, with severe jaggies.
  • The color of the transparent part on the RGB channel becomes unpredictable after each frame of a GIF is split out and saved as a PNG, WebP, etc. For example, the color set as transparent in a GIF is originally #FFFFFF, but after saving the frame it may become #000000, although it makes no difference if you just look at the image.

When GIF images are upscaled with Real-ESRGAN directly (Example), the two problems above have the following effects:

  • The upscaled alpha channel's quality is very poor, resulting in a jagged ring around the scaled frame.
  • The color of the jagged ring is unpredictable (black in some cases, for example) and looks very ugly.

This option was added to resolve these issues. It adds the following actions:

  • Force the color of the transparent part to be white when splitting out each frame of the GIF.
  • Add a 3px Gaussian blur and apply a contrast curve to smooth out the jagged rings in the alpha channel. Then dither the alpha channel to a black and white image with only 0 and 255 values.

This option is experimental and it is recommended to enable it only when upscaling GIFs with transparency.
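
The first action, and a simplified version of the last step, can be illustrated on plain lists of pixel values. This is a sketch under stated assumptions, not the application's code: both helper names are hypothetical, and `binarize_alpha` uses a plain threshold as a stand-in for the blur, contrast curve, and dithering described above.

```python
def whiten_transparent(pixels):
    """Set the RGB of fully transparent RGBA pixels to white."""
    return [
        (255, 255, 255, 0) if a == 0 else (r, g, b, a)
        for (r, g, b, a) in pixels
    ]


def binarize_alpha(alpha_values, threshold=128):
    """Map each alpha value to 0 or 255 with a plain threshold.

    Simplified stand-in: the option described above blurs the alpha
    channel, applies a contrast curve, and dithers it instead.
    """
    return [255 if a >= threshold else 0 for a in alpha_values]
```

Whitening the hidden RGB values matters because the upscaler sees those colors even though a viewer does not, so they bleed into the edges of the upscaled frame.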

About lossy compression, compression quality and custom compression/post-processing command

If lossy compression is enabled and the output format is JPEG or WebP, the output image is compressed at the quality value you set. If the input is a directory, this option also applies to the JPEG and WebP images upscaled from that directory. The compression is done using Pillow.

If this option is not turned on, lossless compression is used when the output is in WebP format.

If a custom compression/post-processing command is set, Pillow's compression is skipped. You can use the command to compress the upscaled image or process it in other ways.

  • {input} represents the path of the input file.
  • {output} represents the path of the output file.
  • {output:ext} represents the path of the output file with the extension ext.
  • Cookbook:
    • Use avifenc (libavif) to convert to AVIF: avifenc --speed 6 --jobs all --depth 8 --yuv 420 --min 0 --max 63 -a end-usage=q -a cq-level=30 -a enable-chroma-deltaq=1 --autotiling --ignore-icc --ignore-xmp --ignore-exif {input} {output:avif}
    • Use cjxl (libjxl) to convert to JPEG XL: cjxl {input} {output:jxl} --quality=80 --effort=9 --progressive --verbose
    • Use gif2webp (libwebp) to convert the output GIF to WebP: gif2webp -lossy -q 80 -m 6 -min_size -mt -v {input} -o {output:webp}
    • Use ImageMagick to add a text watermark in the lower-right corner and then convert to AVIF: magick convert -fill white -pointsize 24 -gravity SouthEast -draw "text 16 16 'https://github.com/TransparentLC/realesrgan-gui'" -quality 80 {input} {output:avif}
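
The placeholder rules above can be sketched as a small template expander. This is an assumption about the substitution behavior based only on the descriptions in this list, not the application's actual code; `expand_command` is a hypothetical name, and quoting of paths that contain spaces is not handled.

```python
import os
import re

# Matches {input}, {output}, or {output:ext}.
_PLACEHOLDER = re.compile(r"\{input\}|\{output(?::([A-Za-z0-9]+))?\}")


def expand_command(template, input_path, output_path):
    """Substitute {input}, {output} and {output:ext} in a command template."""

    def replace(match):
        if match.group(0) == "{input}":
            return input_path
        ext = match.group(1)
        if ext is None:
            return output_path
        # Swap the file extension, e.g. result.png -> result.jxl
        return os.path.splitext(output_path)[0] + "." + ext

    return _PLACEHOLDER.sub(replace, template)


print(expand_command("cjxl {input} {output:jxl} --quality=80",
                     "in.png", "out/result.png"))
# cjxl in.png out/result.jxl --quality=80
```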

Where is the configuration file saved?

The configuration is stored in config.ini, located in the repository's directory or in the directory where Real-ESRGAN GUI's executable is located. If this file does not exist, the default configuration is used.

The configuration will be saved automatically when exiting the program.
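
The load-with-defaults and save-on-exit behavior can be sketched with the standard library's configparser. The section name and option names below are hypothetical; the real config.ini may be organized differently.

```python
import configparser

# Hypothetical defaults; the real option names may differ.
DEFAULTS = {"model": "realesrgan-x4plus", "tile_size": "0", "tta": "no"}


def load_config(path="config.ini"):
    """Read settings from path, falling back to DEFAULTS if it is missing."""
    parser = configparser.ConfigParser()
    parser["settings"] = DEFAULTS
    parser.read(path)  # silently skips files that do not exist
    return dict(parser["settings"])


def save_config(settings, path="config.ini"):
    """Write settings back to path, e.g. when the program exits."""
    parser = configparser.ConfigParser()
    parser["settings"] = settings
    with open(path, "w", encoding="utf-8") as f:
        parser.write(f)
```

Seeding the parser with defaults before reading means a missing or partial file still yields a complete set of options.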

Additional models

You can download additional models from Upscale Wiki and use them in Real-ESRGAN GUI. These models may produce better (or worse) results than the official models for some images.

These models use PyTorch's pth format, but Real-ESRGAN GUI (Real-ESRGAN-ncnn-vulkan) needs NCNN's bin and param format. You can follow this guide (written by RealSR-NCNN-Android's author in Chinese) to convert them with cupscale's pth2ncnn utility. The model's filename must contain its upscale factor, like x4 or 4x.
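
The filename requirement suggests that the upscale factor is read from the name. A heuristic sketch of such parsing is shown below; `scale_from_filename` is a hypothetical helper and the application's actual matching rules may differ.

```python
import re


def scale_from_filename(filename):
    """Return the upscale factor found in a model filename, or None."""
    # Accept either "x4" or "4x" style markers, case-insensitively.
    match = re.search(r"x(\d+)|(\d+)x", filename, flags=re.IGNORECASE)
    if match is None:
        return None
    return int(match.group(1) or match.group(2))


print(scale_from_filename("realesrgan-x4plus.param"))  # 4
```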

You can download some converted additional models from here.

Why not (other similar GUI)?

Of course, there is more than one GUI for Real-ESRGAN. Here is a list of some, with the reasons why I didn't use them and decided to build my own.

This is an all-in-one toolbox that integrates tons of tools, including waifu2x, Anime4k, Real-SR, SRMD, Real-ESRGAN, Real-CUGAN ... for image upscaling and CAIN, DAIN, RIFE ... for video frame interpolation and other utils including ffmpeg, ImageMagick, gifsicle, nircmd, wget and more. Only supported on Windows.

The rich feature set leads to a complex UI and configuration, but I only need a small number of its features. I used it when it was open source, but the author modified the LICENSE and switched to closed source as of v3.41.01 in May 2021. Moreover, an advertisement for the premium version appears every time it starts up and finishes processing.

Although I don't rely on the premium-only features, these changes encouraged me to write another lightweight GUI that meets my needs.

Built with Electron, so it is also cross-platform. Benefiting from the power of front-end technologies, the UI and interaction are excellent and there is even a comparison slider between the original image and the upscaled image. The documentation is also very detailed.

However, it still lacks some features such as handling GIFs, customizing post-processing commands, and localization.

Since it is an Electron application, the users will have to install yet another Chromium browser😂 The size of Upscayl is about 400 MB while Real-ESRGAN GUI is only about 10 MB (Windows version, excluding Real-ESRGAN-ncnn-vulkan's executable and models).

These GUIs are simple wrappers for the CLI parameters without any extra features.

However, I like the Material Design used by tsukumijima/Real-ESRGAN-GUI.

Open-source libraries used

Credits

Thanks to @blacklein and @hyrulelinks for their help with using and bundling this application on macOS.

And other contributors!

Contributors

Star history
