A cross-platform GUI for image upscaler Real-ESRGAN with additional features. Inspired by waifu2x-caffe.
README translations
- 简体中文 (Simplified Chinese)
- English
- Ukrainian (Українська) Translated by @kirill0ermakov
- Türkçe (Turkish) Translated by @NandeMD
This application uses Real-ESRGAN's portable executable file (Real-ESRGAN-ncnn-vulkan) to upscale images with extremely high quality. It is written in Python and provides a user-friendly GUI with Tkinter.
Quick Start:
- Download the latest `realesrgan-gui-windows-bundled-v*.7z` from Release, extract the archive, then launch `realesrgan-gui.exe`.
- Download the latest `realesrgan-gui-ubuntu-bundled-v*.tar.xz` from Release, extract the archive, then launch `realesrgan-gui`.
- Download the latest `realesrgan-gui-macos-appbundle-v*.tar.xz` from Release, extract the archive, run `chmod u+x "Real-ESRGAN GUI.app/Contents/MacOS/realesrgan-gui"`, `chmod u+x "Real-ESRGAN GUI.app/Contents/MacOS/realesrgan-ncnn-vulkan"` and `xattr -cr "Real-ESRGAN GUI.app"` in a terminal, then launch `Real-ESRGAN GUI`.
Tip
Real-ESRGAN-ncnn-vulkan has not been updated since April 2022. You can use the actively maintained fork upscayl/upscayl-ncnn instead.
Download the latest release and extract `upscayl-bin[.exe]` to the directory where Real-ESRGAN GUI's executable file is located. If present, it takes priority.
Notes
- Real-ESRGAN-ncnn-vulkan's executable file and models are not contained in `realesrgan-gui-windows.7z` and `realesrgan-gui-ubuntu.tar.xz`. You have to download them manually from here and extract them to the directory where Real-ESRGAN GUI's executable file is located.
- The artifacts in GitHub Actions are built from the latest commits. They don't contain Real-ESRGAN-ncnn-vulkan's executable file and models either.
- Use Python 3.10 or above if you want to run Real-ESRGAN GUI from source. Don't forget to install the dependencies with `pip install -r requirements.txt` and extract Real-ESRGAN-ncnn-vulkan to the repository before running `python main.py`.
- It may be possible to run Real-ESRGAN GUI on other Linux distributions, but I have not tested it.
Please check out CONTRIBUTING.md if you would like to contribute to Real-ESRGAN GUI.
The `arm64` builds have been tested to perform better than `universal2` builds. If you are using Apple Silicon, it is recommended that you make an `arm64` build yourself.
# 1. Clone this repository.
git clone https://github.com/TransparentLC/realesrgan-gui.git
cd realesrgan-gui
# 2. Run the shell script to start building. Since the latest commit of this project requires tk version 8.6, while Python 3.10 bundles tk version 8.5, the local packaging must be done in Python 3.11 environment. Before packaging, enter python3 -V in the terminal to confirm the current version is 3.11.
# Password is required for "sudo pyinstaller realesrgan-gui-macOS-arm64.spec"
sh Build-macOS-arm64.sh
# 3. The built application is saved in "./realesrgan-gui/dist/Real-ESRGAN GUI.app".
Warning
Since I don't have any device running macOS, I may not be able to handle macOS-related issues.
- Use Real-ESRGAN on Android: tumuyan/RealSR-NCNN-Android
- Upscale video with Real-ESRGAN and Vapoursynth: HolyWu/vs-realesrgan
In addition to the features supported by Real-ESRGAN-ncnn-vulkan, Real-ESRGAN GUI also supports these additional features:
- Upscale to arbitrary size
- Real-ESRGAN-ncnn-vulkan can only upscale the input image at a fixed 2-4x ratio (depending on the model chosen).
- Real-ESRGAN GUI uses Real-ESRGAN-ncnn-vulkan to upscale the input image multiple times, then downsamples the output image to the desired size with general image scaling algorithms.
- For example, to upscale a 640x360 image to 1600 in width with a 2x model, it will be upscaled twice to 1280x720 and 2560x1440 then downsampled to 1600x900.
- Lanczos is used by default to downsample the image. Other algorithms are also available.
- Upscale GIF images
- Splits an animated GIF into frames and reads their durations, upscales the frames one by one, then merges them into an upscaled animated GIF.
- Drag and drop support
- Drag and drop image files or directories onto the GUI and the input and output path will be set automatically.
- The output path will contain a suffix like x4, w1280, h1080 based on the chosen resize mode.
- Dark theme
- Choose to use light or dark theme according to system settings.
- The detection is done using darkdetect.
- Not available on macOS?
- Multi-language support
- Simplified and traditional Chinese and English are currently supported.
- Uses `locale.getdefaultlocale` for language detection.
- Fallback to English by default if translated text is missing.
- You can add or improve translations by editing `i18n.ini`. Contributions are very welcome!
- After adding your language to `i18n.ini`, run the `generate_locales_map.py` file and copy the `locales_map` variable from the output until the end. Then, replace the variable in `i18n.py` with the one you copied. If you encounter any issues running `generate_locales_map.py`, try running `pip install -r requirements.txt` in the command line to install the necessary dependencies, and then try again.
- If you don't want to deal with `generate_locales_map.py`, you can directly add your language code and the visible name of your language to the `locales_map` variable in the `i18n.py` file.
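As a rough illustration of the "Upscale to arbitrary size" feature in the list above, the following sketch computes how many passes a fixed-ratio model needs and the final size after downsampling. The function name `plan_upscale` and its parameters are hypothetical and are not taken from Real-ESRGAN GUI's source. Treating a target smaller than the input as "zero passes, downsample only" is also an assumption.

```python
def plan_upscale(width, height, target_width, model_scale):
    """Return (passes, upscaled_size, final_size) for a width-based target.

    Hypothetical helper for illustration; not Real-ESRGAN GUI's actual code.
    """
    if model_scale < 2:
        raise ValueError("model_scale must be at least 2")
    passes = 0
    w, h = width, height
    # Apply the fixed-ratio model until the image is at least as wide as requested.
    while w < target_width:
        w *= model_scale
        h *= model_scale
        passes += 1
    # Downsample to the requested width while keeping the aspect ratio.
    final_size = (target_width, round(height * target_width / width))
    return passes, (w, h), final_size


# The example from the list: 640x360 to a width of 1600 with a 2x model.
print(plan_upscale(640, 360, 1600, 2))
```

With the numbers from the example, the sketch reports two passes, an intermediate size of 2560x1440, and a final size of 1600x900.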
Sample comparisons (images not shown): two sets comparing Nearest Neighbor, Lanczos, waifu2x-caffe, and Real-ESRGAN, and two sets comparing Nearest Neighbor and Real-ESRGAN.
- waifu2x-caffe samples are upscaled using `UpResNet10` and `UpPhoto` models with noise reduction level 3 and TTA enabled.
- Real-ESRGAN samples are upscaled using `realesrgan-x4plus-anime` and `realesrgan-x4plus` models with TTA enabled.
- The original images are upscaled to 4x.
- The displayed GIFs are lossy compressed to reduce the file size.
I recommend `realesrgan-x4plus` for real-life photos and `realesrgan-x4plus-anime` for anime images.
When a model comes in versions with different upscale ratios, it is recommended to choose the version whose ratio is equal to or greater than the ratio by which you want to enlarge the image. For example, if a model has x2 and x4 versions and you want to upscale an image by 3x, you should choose the x4 version.
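The rule above can be written as a small selection function. This is only a sketch: `choose_model_scale` is a hypothetical name, and falling back to the largest version when none is large enough is an assumption rather than documented behavior.

```python
def choose_model_scale(available_scales, desired_ratio):
    """Pick the smallest available model scale that is >= desired_ratio.

    Hypothetical helper for illustration only.
    """
    candidates = [s for s in sorted(available_scales) if s >= desired_ratio]
    # Assumption: if no version is large enough, use the largest one available.
    return candidates[0] if candidates else max(available_scales)


print(choose_model_scale([2, 4], 3))  # the x4 version is chosen for a 3x upscale
```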
Models with `animevideo` in the filename are designed for anime videos. These models are small and process images faster (about 1.5-3x compared to `realesrgan-x4plus-anime`). However, Real-ESRGAN GUI will not consider adding video processing related features.
You can download additional models from here and place the `bin` and `param` files in the `models` directory to install them. These models may produce better (or worse) results than the official models for some images, especially for real-life photos.
The tile size setting corresponds to Real-ESRGAN-ncnn-vulkan's `-t tile-size` param. You can choose "auto" in most cases, or use a larger value if you have enough VRAM. A larger tile size can slightly increase processing speed and the upscaled image's quality, although the difference may not be obvious.
You can check the difference between two images upscaled to 4x with tile sizes 32 and 256 from the 256x256 test image that comes with Real-ESRGAN-ncnn-vulkan.
See #32 (in Chinese) for more details on this.
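To make the mapping between the tile size setting and the `-t` parameter concrete, here is a minimal sketch that builds an argument list for Real-ESRGAN-ncnn-vulkan. The function `build_command` is hypothetical and does not reflect how Real-ESRGAN GUI actually invokes the tool. The `-i`, `-o`, `-n` and `-s` flags are recalled from the tool's command-line help, where a tile size of `0` means "auto"; check the help output of your binary before relying on them.

```python
def build_command(input_path, output_path, model_name, scale, tile_size=0):
    """Build an argument list for realesrgan-ncnn-vulkan (illustrative only)."""
    return [
        "realesrgan-ncnn-vulkan",
        "-i", input_path,       # input image
        "-o", output_path,      # output image
        "-n", model_name,       # model name, e.g. realesrgan-x4plus
        "-s", str(scale),       # upscale ratio of the chosen model
        "-t", str(tile_size),   # tile size; 0 lets the tool choose automatically
    ]


command = build_command("input.png", "output.png", "realesrgan-x4plus", 4, tile_size=256)
print(" ".join(command))
# To actually run it: subprocess.run(command, check=True)
```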
TTA mode slightly improves the quality of the upscaled image, but the effect is very small. Processing becomes extremely slow when TTA mode is enabled, so enabling it is not recommended.
I downloaded some anime images larger than 1200px to conduct an experiment: downsample each image to 1/4 of its size, upscale it with the `realesrgan-x4plus-anime` model, and measure upscaling quality by SSIM against the original image. The SSIM of the TTA-enabled result is only about 0.002 higher than that of the result without TTA. The difference is difficult to see with the naked eye.
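For readers unfamiliar with SSIM, the sketch below computes a simplified, single-window version of it over two flat sequences of pixel values. This is not the code used for the experiment above, and the README does not say which SSIM implementation was used. Standard SSIM averages the same formula over local windows, so real measurements will differ from this global variant. The name `global_ssim` is hypothetical.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM over two equal-length sequences of pixel values."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Stabilizing constants with the commonly used K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


original = [0, 64, 128, 255]
upscaled = [10, 50, 140, 240]
print(global_ssim(original, original))  # identical inputs score 1 (up to rounding)
print(global_ssim(original, upscaled))  # any difference lowers the score
```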
GIFs only support a palette of up to 256 RGB colors and can optionally mark one of them as transparent, which means that there is no translucency. For GIFs with transparent parts, this raises two problems.
- The Alpha channel of the image has only two values, 0 and 255, and can be represented by an image with only black and white colors, with severe jaggies.
- The color of the transparent part on the RGB channel becomes unpredictable after each frame of a GIF is split out and saved as a PNG, WebP, etc. For example, the color set as transparent in a GIF is originally #FFFFFF, but after saving the frame it may become #000000, although it makes no difference if you just look at the image.
When upscaling GIF images using Real-ESRGAN directly (Example), the two problems above have the following impact:
- The upscaled alpha channel's quality is very poor, resulting in a jagged ring around the scaled frame.
- The color of the jagged ring is unpredictable. In some cases it is black, which looks very ugly.
This option was added to resolve these issues. It adds the following actions:
- Force the color of the transparent part to be white when splitting out each frame of the GIF.
- Add a 3px Gaussian blur and apply a contrast curve to smooth out the jagged rings in the alpha channel. Then dither the alpha channel to a black and white image with only 0 and 255 values.
This option is experimental and it is recommended to enable it only when upscaling GIFs with transparency.
If lossy compression is enabled and the output format is JPEG or WebP, the output image is compressed at the quality you set. If the input is a directory, this option also affects the output compression quality when upscaling JPEG or WebP images in the directory. The compression is done using Pillow.
If this option is not turned on, lossless compression is used when the output is in WebP format.
If a custom compression/post-processing command is set, Pillow's compression will not be performed. You can set a command to compress the upscaled image or do other processing with it.
- `{input}` represents the path of the input file.
- `{output}` represents the path of the output file.
- `{output:ext}` represents the path of the output file with the extension `ext`.
- Cookbook:
- Use avifenc (libavif) to convert to AVIF:
  `avifenc --speed 6 --jobs all --depth 8 --yuv 420 --min 0 --max 63 -a end-usage=q -a cq-level=30 -a enable-chroma-deltaq=1 --autotiling --ignore-icc --ignore-xmp --ignore-exif {input} {output:avif}`
- Use cjxl (libjxl) to convert to JPEG XL:
  `cjxl {input} {output:jxl} --quality=80 --effort=9 --progressive --verbose`
- Use gif2webp (libwebp) to convert the output GIF to WebP:
  `gif2webp -lossy -q 80 -m 6 -min_size -mt -v {input} -o {output:webp}`
- Use ImageMagick to add a text watermark in the lower-right corner and then convert to AVIF:
  `magick convert -fill white -pointsize 24 -gravity SouthEast -draw "text 16 16 'https://github.com/TransparentLC/realesrgan-gui'" -quality 80 {input} {output:avif}`
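The placeholder substitution described above might look roughly like the following sketch. `expand_command` is a hypothetical function, not Real-ESRGAN GUI's implementation. It assumes that `{output:ext}` replaces the existing extension of the output path, and it does not quote paths, which a real implementation would need to handle.

```python
import os
import re


def expand_command(template, input_path, output_path):
    """Substitute {input}, {output} and {output:ext} in a command template."""

    def replace(match):
        token = match.group(1)
        if token == "input":
            return input_path
        if token == "output":
            return output_path
        # {output:ext}: keep the output path but swap its extension for `ext`.
        ext = token.split(":", 1)[1]
        return os.path.splitext(output_path)[0] + "." + ext

    return re.sub(r"\{(input|output(?::\w+)?)\}", replace, template)


print(expand_command("cjxl {input} {output:jxl} --quality=80", "in.png", "out.png"))
```

For the template shown, the sketch yields `cjxl in.png out.jxl --quality=80`.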
The configuration is stored in `config.ini` in the repository's directory or in the directory where Real-ESRGAN GUI's executable is located. Without this file, the default configuration is used.
The configuration will be saved automatically when exiting the program.
You can download additional models from Upscale Wiki and use them in Real-ESRGAN GUI. These models may produce better (or worse) results than the official models for some images.
These models use PyTorch's `pth` format, but Real-ESRGAN GUI (Real-ESRGAN-ncnn-vulkan) needs NCNN's `bin` and `param` format. You can follow this guide (written by RealSR-NCNN-Android's author in Chinese) to convert them with cupscale's `pth2ncnn` utility. The model's filename must contain its upscale factor, like `x4` or `4x`.
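One plausible way to read the upscale factor from a model filename is a regular expression that accepts both the `x4` and `4x` forms. The sketch below is a hypothetical heuristic, not Real-ESRGAN GUI's actual parsing code, and the second filename in the example is made up.

```python
import re


def scale_from_filename(filename):
    """Return the upscale factor found in a model filename, or None."""
    # Accept either "x4" or "4x"; the first match in the name wins.
    match = re.search(r"x(\d+)|(\d+)x", filename, re.IGNORECASE)
    if match is None:
        return None
    return int(match.group(1) or match.group(2))


for name in ("realesrgan-x4plus-anime.bin", "4x-custom-model.param", "model.bin"):
    print(name, scale_from_filename(name))
```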
You can download some converted additional models from here.
Of course, there is more than one GUI for Real-ESRGAN. Here is a list of some, with reasons why I didn't use them and decided to build my own.
This is an all-in-one toolbox that integrates tons of tools, including waifu2x, Anime4k, Real-SR, SRMD, Real-ESRGAN, Real-CUGAN ... for image upscaling and CAIN, DAIN, RIFE ... for video frame interpolation and other utils including ffmpeg, ImageMagick, gifsicle, nircmd, wget and more. Only supported on Windows.
The rich feature set leads to a complex UI and configuration, but I only need a small number of its features. I used to be a user when it was open source, but the author modified the LICENSE and switched to closed source starting with v3.41.01 in May 2021. Moreover, an advertisement for the premium version appears every time it starts up and finishes processing.
Although I don't rely on those premium-only features, the changes still encouraged me to write another lightweight GUI that meets my needs.
Built with Electron, so it is also cross-platform. Benefiting from the power of front-end technologies, the UI and interaction are excellent and there is even a comparison slider between the original image and the upscaled image. The documentation is also very detailed.
However, it still lacks some features such as handling GIFs, customizing post-processing commands, and localization.
Since it is an Electron application, the users will have to install yet another Chromium browser😂 The size of Upscayl is about 400 MB while Real-ESRGAN GUI is only about 10 MB (Windows version, excluding Real-ESRGAN-ncnn-vulkan's executable and models).
These GUIs are simple wrappers for the CLI parameters without any extra features.
However, I like the Material Design used by tsukumijima/Real-ESRGAN-GUI.
Thanks to @blacklein and @hyrulelinks for offering help with using and bundling this application on macOS.
And other contributors!