
RealESRGANv2

WolframRhodium edited this page Dec 5, 2022 · 22 revisions

RealESRGANv2 is a practical image super-resolution neural network that removes common image artifacts.

Link:

Models

Includes small models optimized for anime videos:

  • RealESRGANv2: requires RGB input
    • animevideo-xsx2, animevideo-xsx4 (2x / 4x upscaling)

vsmlrt.py Wrapper Usage

To simplify usage, we provide a Python wrapper module, vsmlrt:

import vapoursynth as vs
from vapoursynth import core
from vsmlrt import RealESRGANv2, RealESRGANv2Model, Backend

# the model requires RGB input
src = core.std.BlankClip(format=vs.RGBS)

# backend can be one of:
#  - CPU Backend.OV_CPU(): the recommended CPU backend; generally faster than ORT-CPU.
#  - CPU Backend.ORT_CPU(num_streams=1, verbosity=2): vs-ort CPU backend.
#  - GPU Backend.ORT_CUDA(device_id=0, cudnn_benchmark=True, num_streams=1, verbosity=2)
#     - use device_id to select the device
#     - set cudnn_benchmark=False to reduce script reload latency when debugging, at the cost of a slight throughput penalty.
#  - GPU Backend.TRT(fp16=True, device_id=0, num_streams=1): TensorRT runtime, the fastest NVIDIA GPU runtime.
flt = RealESRGANv2(src, model=RealESRGANv2Model.animevideo_xsx2, backend=Backend.ORT_CUDA())

Raw Model Usage

src = core.std.BlankClip(width=1920, height=1080, format=vs.RGBS)
flt = core.ov.Model(src, "RealESRGANv2/RealESRGANv2-animevideo-xsx2")

Benchmarking

Measurements: FPS / Device Memory (MB)

Device memory:

  • CPU: private memory including VapourSynth
  • GPU: device memory including context
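Each benchmark cell on this page pairs throughput with memory use as "FPS / MB". The following is a minimal sketch of how such a cell can be read and turned into a rough processing-time estimate. The helper names parse_cell and processing_seconds are hypothetical and not part of vsmlrt, and the estimate assumes throughput stays constant and ignores decoding and encoding overhead.

```python
def parse_cell(cell: str) -> tuple[float, int]:
    """Split an 'FPS / MB' benchmark cell into (fps, memory_mb)."""
    fps_text, mem_text = cell.split("/")
    return float(fps_text), int(mem_text)


def processing_seconds(num_frames: int, fps: float) -> float:
    """Estimate wall-clock seconds needed to filter num_frames at a constant fps."""
    return num_frames / fps


# Example cell taken from the RTX 3090 FP32 results ([1] ort-cuda).
fps, mem_mb = parse_cell("5.12 / 2594")
seconds = processing_seconds(1000, fps)
print(f"{fps} FPS, {mem_mb} MB, about {seconds:.0f} s per 1000 frames")
```

Because the figures were measured on 1920x1080 input, an estimate like this only carries over to clips of the same size.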

RTX 3090

Software: VapourSynth R57, Windows 10 LTSC 2021, Graphics Driver 511.23.

Input size: 1920x1080

Backends

  1. vs-mlrt v6
  2. vs-realesrgan v2.0.0, PyTorch 1.10.1+cu113
  3. vs-mlrt v8 (driver 511.79)

Performance

FP32

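| Model | [1] ort-cuda | [1] trt     | [2] cuda    | [3] ort-cuda | [3] trt     | [3] trt (no tf32) |
|-------|--------------|-------------|-------------|--------------|-------------|-------------------|
| xsx2  | 5.12 / 2594  | 5.35 / 2953 | 3.34 / 7251 | 5.92 / 2786  | 6.70 / 1760 | 6.65 / 1759       |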

FP16

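| Model | [1] ort-cuda | [1] trt     | [1] trt (2 streams) | [2] cuda    | [3] ort-cuda | [3] trt     | [3] trt (2 streams) |
|-------|--------------|-------------|---------------------|-------------|--------------|-------------|---------------------|
| xsx2  | 10.6 / 2042  | 11.7 / 2041 | 21.6 / 2926         | 3.43 / 4753 | 8.12 / 3552  | 13.1 / 1624 | 22.3 / 2451         |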
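To put the RTX 3090 figures above in perspective, relative speedups can be computed directly from the FPS values. This is a small sketch; the dictionary layout and the speedup helper are illustrative choices rather than anything provided by vs-mlrt, and only numbers copied from the tables are used.

```python
# FPS figures copied from the RTX 3090 tables above (xsx2, 1920x1080 input).
# Keys are (backend set, runtime, precision); this layout is illustrative only.
fps = {
    ("[2]", "cuda", "fp32"): 3.34,
    ("[2]", "cuda", "fp16"): 3.43,
    ("[3]", "trt", "fp32"): 6.70,
    ("[3]", "trt", "fp16"): 13.1,
    ("[3]", "trt (2 streams)", "fp16"): 22.3,
}


def speedup(new, old):
    """Return how many times faster `new` is than `old` (ratio of FPS)."""
    return fps[new] / fps[old]


fp16_gain = speedup(("[3]", "trt", "fp16"), ("[3]", "trt", "fp32"))
vs_pytorch = speedup(("[3]", "trt (2 streams)", "fp16"), ("[2]", "cuda", "fp16"))
print(f"TensorRT FP16 vs FP32: {fp16_gain:.2f}x")
print(f"TensorRT FP16 (2 streams) vs PyTorch FP16: {vs_pytorch:.2f}x")
```

On these numbers, switching TensorRT from FP32 to FP16 roughly doubles throughput, and two TensorRT streams in FP16 run about 6.5 times faster than the PyTorch backend [2] in FP16.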

RTX 2080 Ti

Software: VapourSynth R57, Windows 10 LTSC 2021, Graphics Driver 511.23.

Input size: 1920x1080

Backends

  1. vs-mlrt v6
  2. vs-realesrgan v2.0.0, PyTorch 1.10.1+cu113
  3. vs-mlrt v8 (driver 511.79)

Performance

FP32

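| Model | [1] ort-cuda | [1] trt     | [2] cuda    | [3] ort-cuda | [3] trt     |
|-------|--------------|-------------|-------------|--------------|-------------|
| xsx2  | 3.24 / 2302  | 3.91 / 1816 | 2.63 / 6804 | 3.08 / 2215  | 3.89 / 1603 |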

FP16

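| Model | [1] ort-cuda | [1] trt     | [1] trt (2 streams) | [2] cuda    | [3] ort-cuda | [3] trt     | [3] trt (2 streams) |
|-------|--------------|-------------|---------------------|-------------|--------------|-------------|---------------------|
| xsx2  | 4.84 / 2558  | 7.50 / 1299 | 11.8 / 1947         | 3.16 / 4296 | 5.12 / 2983  | 7.65 / 1450 | 14.0 / 2265         |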

Tesla V100

Software: VapourSynth R57, Windows Server 2019, Graphics Driver 511.23.

Input size: 1920x1080

Backends

  1. vs-mlrt v6
  2. vs-realesrgan v2.0.0, PyTorch 1.10.1+cu113

Performance

FP32

| Model | [1] ort-cuda | [1] trt     | [1] trt (2 streams) | [2] cuda    |
|-------|--------------|-------------|---------------------|-------------|
| xsx2  | 5.27 / 2213  | 6.07 / 1835 | 6.86 / 2999         | 3.64 / 6799 |

FP16

| Model | [1] ort-cuda | [1] trt     | [1] trt (2 streams) | [2] cuda    |
|-------|--------------|-------------|---------------------|-------------|
| xsx2  | 8.90 / 2981  | 11.8 / 1637 | 15.5 / 2539         | 4.72 / 4291 |

Tesla A10

Software: VapourSynth R57, Windows Server 2019, Graphics Driver 511.23. GPU clocks locked at maximum frequency.

Input size: 1920x1080

Backends

  1. vs-mlrt v6
  2. vs-realesrgan v2.0.0, PyTorch 1.10.1+cu113

Performance

FP32

| Model | [1] ort-cuda | [1] trt     | [1] trt (2 streams) | [2] cuda    |
|-------|--------------|-------------|---------------------|-------------|
| xsx2  | 4.15 / 2817  | 4.97 / 1965 | 5.26 / 3125         | 3.47 / 7075 |

FP16

| Model | [1] ort-cuda | [1] trt     | [1] trt (2 streams) | [2] cuda    |
|-------|--------------|-------------|---------------------|-------------|
| xsx2  | 7.16 / 3585  | 11.5 / 1881 | 12.1 / 2769         | 4.90 / 4585 |

Tesla A10G

Software: VapourSynth R58, Windows Server 2022, Graphics Driver 511.65. GPU clocks locked at maximum frequency.

Input size: 1920x1080

Backends

  1. vs-mlrt v8

Performance

FP32

| Model | [1] trt     |
|-------|-------------|
| xsx2  | 5.88 / 1584 |

FP16

| Model | [1] trt     | [1] trt (2 streams) |
|-------|-------------|---------------------|
| xsx2  | 13.0 / 1525 | 14.8 / 2401         |

Tesla A100 (PCIe, 40 GB)

Software: VapourSynth R57, Windows Server 2019, Graphics Driver 511.23.

Input size: 1920x1080

Backends

  1. vs-mlrt v6
  2. vs-realesrgan v2.0.0, PyTorch 1.10.1+cu113

Performance

FP32

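| Model | [1] ort-cuda | [1] trt     | [1] trt (2 streams) | [1] trt (3 streams) | [2] cuda    |
|-------|--------------|-------------|---------------------|---------------------|-------------|
| xsx2  | 11.4 / 3903  | 16.7 / 2167 | 20.3 / 3315         | 21.5 / 4463         | 7.00 / 7229 |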

FP16

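| Model | [1] ort-cuda | [1] trt     | [1] trt (2 streams) | [1] trt (3 streams) | [2] cuda    |
|-------|--------------|-------------|---------------------|---------------------|-------------|
| xsx2  | 16.1 / 3647  | 28.8 / 1655 | 41.0 / 2295         | 46.0 / 2953         | 6.89 / 4701 |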
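The multi-stream TensorRT results above show throughput rising with the number of streams while device memory grows with it. The sketch below quantifies that trade-off for the A100 (PCIe) FP16 figures. The scaling helper and the tuple layout are hypothetical, chosen only for this illustration.

```python
# (streams, FPS, device memory in MB) for [1] trt, FP16,
# copied from the A100 (PCIe, 40 GB) results above.
runs = [(1, 28.8, 1655), (2, 41.0, 2295), (3, 46.0, 2953)]


def scaling(runs):
    """Yield (streams, FPS relative to the first run, extra MB vs the previous run)."""
    base_fps = runs[0][1]
    prev_mem = runs[0][2]
    for streams, fps, mem in runs:
        yield streams, fps / base_fps, mem - prev_mem
        prev_mem = mem


report = list(scaling(runs))
for streams, rel, extra in report:
    print(f"{streams} stream(s): {rel:.2f}x throughput, +{extra} MB")
```

On these figures the second stream adds about 42% throughput for 640 MB of extra device memory, while the third adds comparatively little throughput for a further 658 MB, so returns diminish as streams are added.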

Tesla A100 (SXM4, 80 GB)

Software: VapourSynth R57-A4, Windows Server 2022, Graphics Driver 516.94.

Input size: 1920x1080

Backends

  1. vs-mlrt v9

Performance

FP16

| Model | [1] trt     | [1] trt (2 streams) | [1] trt (3 streams) |
|-------|-------------|---------------------|---------------------|
| xsx2  | 25.9 / 1987 | 34.9 / 1915         | 44.4 / 2530         |

Icelake Server

Hardware: Xeon Icelake Server 32C64T @2.90 GHz

Software: VapourSynth R57, Windows Server 2019

Input size: 1920x1080

Backends

  1. vs-mlrt v6

Performance

FP32

| Model | [1] ov-cpu   |
|-------|--------------|
| xsx2  | 1.49 / 16751 |

EPYC Milan

Hardware: EPYC Milan 32C64T @2.55 GHz

Software: VapourSynth R57, Windows Server 2019

Input size: 1920x1080

Backends

  1. vs-mlrt v6

Performance

FP32

| Model | [1] ov-cpu   |
|-------|--------------|
| xsx2  | 0.53 / 16750 |