iVSR facilitates AI media processing with exceptional quality and performance on Intel hardware.
iVSR offers a patch-based, heterogeneous, multi-GPU, and multi-algorithm solution, harnessing the full capabilities of Intel CPUs and GPUs. It is adaptable for deployment on a single device, a distributed system, cloud infrastructure, edge cloud, or K8S environment.
- Simple APIs ensure that any changes to the OpenVINO API remain hidden.
- A patch-based solution facilitates inference on hardware with limited memory capacity, particularly useful for super-resolution of high-resolution input videos, such as 4K.
- The iVSR SDK includes features to safeguard AI models created by Intel, which contain Intel IP.
- The iVSR SDK is versatile and supports a wide range of AI media processing algorithms.
- For specific algorithms, performance optimization can be executed to better align with customer requirements.
This repository or package includes the following major components:
The iVSR SDK is a middleware library that supports various AI video processing filters. It is designed to accommodate different AI inference backends, although currently, it only supports OpenVINO.
For a detailed introduction to the iVSR SDK API, please refer to this introduction.
We've also included a `vsr_sample` as a demonstration of its usage.
To support the widely used media processing solution FFmpeg, we provide an iVSR SDK plugin to simplify integration. This plugin is integrated into FFmpeg's `dnn_processing` filter in the libavfilter library, serving as a new `ivsr` backend to this filter. Please note that the patches provided in this project are specifically for FFmpeg n6.1.
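A typical invocation looks like the following sketch. The backend name `ivsr` comes from this project; the remaining option names (`model`, `input`, `output`) are assumptions modeled on FFmpeg's standard `dnn_processing` filter options, and the file names are placeholders — consult the FFmpeg command line samples in this repository for the exact options of your build.

```shell
# Hypothetical example: upscale a clip through the ivsr backend of dnn_processing.
# Model path and tensor names are placeholders, not shipped defaults.
ffmpeg -i input.mp4 \
  -vf "dnn_processing=dnn_backend=ivsr:model=./enhanced_basicvsr.xml:input=input:output=output" \
  output.mp4
```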
In this folder, you'll find patches for OpenVINO that enable the Enhanced BasicVSR model. These patches use OpenVINO's Custom Operations feature, which allows users to run models containing custom operations not inherently supported by OpenVINO.
These patches are specifically for OpenVINO 2022.3, meaning the Enhanced BasicVSR model will only work on OpenVINO 2022.3 with these patches applied.
Currently, iVSR offers two AI media processing functionalities: Video Super Resolution (VSR) and Smart Video Processing (SVP) for bandwidth optimization. Both functionalities can run on Intel CPUs and Intel GPUs (including Flex 170 and Arc A770) via OpenVINO and FFmpeg.
Video Super Resolution (VSR) is a technique extensively employed in the AI media enhancement domain to upscale low-resolution videos to high resolution. iVSR supports `Enhanced BasicVSR`, `Enhanced EDSR`, and `TSENet`, and can be extended to support additional models.
- `BasicVSR` is a publicly available AI-based VSR algorithm. For more details on the public `BasicVSR`, please refer to this paper.
  We have improved the public model to attain superior visual quality and reduced computational complexity. This improved model is named `Enhanced BasicVSR`. The inference performance of the `Enhanced BasicVSR` model has also been optimized for Intel GPUs. Please note that this optimization is specific to OpenVINO 2022.3, so the Enhanced BasicVSR model only works with OpenVINO 2022.3 with the patches applied.
  The input and output shapes of this model are:
  - Input shape: `[1, (channels)3, (frames)3, H, W]`
  - Output shape: `[1, (channels)3, (frames)3, 2xH, 2xW]`
- `EDSR` is another publicly available AI-based single-image SR algorithm. For more details on the public EDSR, please refer to this paper.
  We have improved the public `EDSR` model to reduce computational complexity by over 79% compared to Enhanced BasicVSR while maintaining similar visual quality. This improved model is named `Enhanced EDSR`.
  The input and output shapes of this model are:
  - Input shape: `[1, (channels)3, H, W]`
  - Output shape: `[1, (channels)3, 2xH, 2xW]`
- `TSENet` is a multi-frame SR algorithm derived from ETDS. We provide a preview version of the feature to support this model in the SDK and its plugin. Please contact your Intel representative to obtain the model package.
  The input and output shapes of this model are:
  - Input shape: `[1, (channels*frames)9, H, W]`
  - Output shape: `[1, (channels)3, 2xH, 2xW]`
  For each inference, the input is the `(n-1)th`, `(n)th`, and `(n+1)th` frames combined, and the output is the `(n)th` frame. For the first frame, the input is the `1st`, `1st`, and `2nd` frames combined; for the last frame, it is the `(n-1)th`, `(n)th`, and `(n)th` frames combined.
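This boundary handling amounts to clamping the previous/next frame indices at the ends of the sequence. A minimal sketch, using a hypothetical helper (1-based frame numbers, as in the description above):

```shell
# Print the three input frame numbers fed to the model when producing
# frame n of a clip with `total` frames (1-based), clamping at both ends.
tsenet_inputs() {
  local n=$1 total=$2
  local prev=$(( n > 1 ? n - 1 : 1 ))
  local next=$(( n < total ? n + 1 : total ))
  echo "$prev $n $next"
}

tsenet_inputs 1 10    # first frame: 1 1 2
tsenet_inputs 5 10    # middle:      4 5 6
tsenet_inputs 10 10   # last frame:  9 10 10
```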
`SVP` is an AI-based video prefilter that enhances perceptual rate-distortion in video encoding. With `SVP`, encoded video streams maintain the same visual quality while reducing bandwidth usage.
Two SVP model variants are provided:
- SVP-Basic: This model is designed for efficiency, preserving fidelity while reducing the encoded bitrate. Modifications made by SVP-Basic are imperceptible to the human eye but can be measured as minor BD-rate degradation when evaluated using SSIM or MS-SSIM metrics. SVP-Basic is adaptable to various video scenarios, including live sports, gaming, livestream sales, VOD, video conferencing, video surveillance, and 5G video streaming.
- SVP-SE: This model focuses on subjective video quality preservation, achieving up to 50% bitrate savings. It enhances visuals by reducing complex details and noise that are less perceptible to human eyes. As a result, it cannot be evaluated by traditional full-reference visual quality metrics like PSNR, SSIM, or VMAF. SVP-SE improves the visibility and quality of visuals, making them more vivid and appealing, which is beneficial in industries such as entertainment, media, and advertising.
The input and output shapes are:
- RGB-based model:
  - Input shape: `[1, (channels)3, H, W]`
  - Output shape: `[1, (channels)3, H, W]`
- Y-based model:
  - Input shape: `[1, (channels)1, H, W]`
  - Output shape: `[1, (channels)1, H, W]`
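As an illustrative sketch only, an SVP model can sit in front of the encoder in an FFmpeg transcoding pipeline. The option names below are assumptions modeled on FFmpeg's standard `dnn_processing` filter, and the model path is a placeholder; refer to the FFmpeg command line samples in this repository for the supported options.

```shell
# Hypothetical example: run an SVP model as a prefilter before x264 encoding.
# Model path and tensor names are placeholders.
ffmpeg -i input.mp4 \
  -vf "dnn_processing=dnn_backend=ivsr:model=./svp_basic.xml:input=input:output=output" \
  -c:v libx264 -crf 23 output.mp4
```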
The software was validated on:
- Intel Xeon hardware platform
- (Optional) Intel® Data Center GPU Flex 170 (aka ATS-M1 150W)
- Host OS: Linux-based OS (Ubuntu 22.04 or Rocky Linux 9.3)
- Docker-based OS: Ubuntu 22.04 or Rocky Linux 9.3
- OpenVINO: 2022.3, 2023.2, or 2024.5
- FFmpeg: n6.1
Building iVSR requires the installation of OpenCV, OpenVINO, FFmpeg, and (optionally, for GPU inference) the GPU driver.
We provide three ways to install requirements and build iVSR SDK & iVSR FFmpeg plugin:
- Install dependencies and build iVSR manually
- Install dependencies and build iVSR by scripts
- Install dependencies and build iVSR by Dockerfile
Note that to run inference on a GPU, it is necessary to have kernel packages installed on the bare metal system beforehand. See Install GPU kernel packages for details.
Refer to this instruction for the installation guide on Ubuntu. The GPU runtime driver/packages are also installed by the provided script and Dockerfile.
Here are two guides for your reference:
- Generic Manual Building Guide: If you are familiar with Intel® devices and have experience with Intel® developed software, follow the official steps to build OpenCV and OpenVINO from source code. Refer to the Generic manual building guide.
- Quick Manual Building Guide: For absolute beginners, this tutorial provides step-by-step instructions to build the project on a clean Ubuntu OS. Refer to the Quick manual building guide.
We provide a `build.sh` script to facilitate building the entire project from source on a clean Ubuntu 22.04-based Linux machine.
chmod a+x ./build.sh
./build.sh --ov_version [2022.3|2023.2|2024.5]
The script accepts the following parameter:
- `ov_version`: Specifies the OpenVINO version. iVSR supports `2022.3`, `2023.2`, and `2024.5`. Note that running the Enhanced BasicVSR model requires `2022.3`.
After the build is complete, set the environment variables. For OpenVINO 2022.3:
source <workspace>/ivsr_ov/based_on_openvino_2022.3/openvino/install/setupvars.sh
For other OpenVINO versions installed via official packages, manual environment setup is not required.
Once the build is successfully completed, refer to section 3.2 for instructions on using the FFmpeg command line to run the pipelines. Feel free to modify and update these scripts as needed. For newly released OpenVINO versions, please follow the manual build guide.
To simplify the environment setup, Dockerfiles are provided. Follow the Docker image build guide to build the Docker image and run the application in Docker containers.
You can run inference on the iVSR SDK using either the `vsr_sample` or `ffmpeg` with the integrated iVSR plugin. Before running them, set up the environment with the following commands:
source <OpenVINO installation dir>/install/setupvars.sh
export LD_LIBRARY_PATH=<Package dir>/ivsr_sdk/lib:<OpenCV installation folder>/install/lib:$LD_LIBRARY_PATH
Note that the current solution is of pre-production quality.
The `vsr_sample` is developed using the iVSR SDK and OpenCV. For detailed instructions on running inference with it, refer to this section.
After applying the FFmpeg plugin patches and building FFmpeg, refer to the FFmpeg command line samples for instructions on running inference with FFmpeg.
iVSR supports only models in OpenVINO IR format. Contact your Intel representative to obtain the model files, as they are not included in the repo.
iVSR is licensed under the BSD 3-clause license. See LICENSE for details.