[CVPR 2023] Real-time 6K Image Rescaling with Rate-distortion Optimization

Chenyang Qi*, Xin Yang*, Ka Leong Cheng, Yingcong Chen, and Qifeng Chen

The application of 6K image rescaling in the context of cloud photo storage on smartphones (e.g., iCloud).


As more high-resolution (HR) images are uploaded to cloud storage, cloud service providers (CSPs) face challenges in fulfilling latency-sensitive image reading requests (e.g., zoom-in) over the internet. To enable faster transmission while preserving high-quality visual content, our HyperThumbnail framework helps CSPs encode an HR image into an LR JPEG thumbnail, which users can cache locally. When the internet connection is unstable or unavailable, our method can still reconstruct a high-fidelity HR image from the JPEG thumbnail in real time.

🎏 Abstract

HyperThumbnail is the first real-time 6K framework for rate-distortion-aware image rescaling.


Contemporary image rescaling aims at embedding a high-resolution (HR) image into a low-resolution (LR) thumbnail image that contains embedded information for HR image reconstruction. Unlike traditional image super-resolution, this enables high-fidelity HR image restoration faithful to the original one, given the embedded information in the LR thumbnail. However, state-of-the-art image rescaling methods do not optimize the LR image file size for efficient sharing and fall short of real-time performance for ultra-high-resolution (e.g., 6K) image reconstruction. To address these two challenges, we propose a novel framework (HyperThumbnail) for real-time 6K rate-distortion-aware image rescaling. Our framework first embeds an HR image into a JPEG LR thumbnail by an encoder with our proposed quantization prediction module, which minimizes the file size of the embedding LR JPEG thumbnail while maximizing HR reconstruction quality. Then, an efficient frequency-aware decoder reconstructs a high-fidelity HR image from the LR one in real time. Extensive experiments demonstrate that our framework outperforms previous image rescaling baselines in rate-distortion performance and can perform 6K image reconstruction in real time.
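The trade-off described in the abstract (small thumbnail file size versus high HR reconstruction quality) is commonly written as a Lagrangian rate-distortion objective. The form below is a generic sketch and not necessarily the exact loss used in the paper, which may include additional terms. The symbols $f_\theta$ (encoder), $g_\phi$ (decoder), $d$ (distortion measure), $R$ (rate estimate), and $\lambda$ (trade-off weight) are introduced here for illustration only.

```latex
\min_{\theta,\,\phi}\;
\underbrace{d\bigl(x,\; g_\phi(y)\bigr)}_{\text{HR reconstruction distortion}}
\;+\;
\lambda\,
\underbrace{R(y)}_{\text{thumbnail file size (rate)}},
\qquad
y = \mathrm{JPEG}\bigl(f_\theta(x)\bigr)
```

Here $x$ is the HR image and $y$ is the LR JPEG thumbnail. A larger $\lambda$ favors smaller files, while a smaller $\lambda$ favors reconstruction fidelity.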

🚧 Todo

  • Release the training and inference codes
  • Release the guidance documents for saving and loading the .jpg HyperThumbnail

Preparations

Before running our training and inference code, install the requirements:

pip install -r ./requirements.txt

Then install our modified version of BasicSR (Wang et al.):

cd ./BasicSR
pip install -r ./requirements.txt
python setup.py develop

Datasets and pretrained checkpoints

You can download the example images and our pretrained checkpoint for 4x rescaling with the following bash command:

bash download.sh

Training

You may prepare the DIV2K and Set14 datasets by following the docs. Then start the 4x rescaling training with the config options/train/4x/HyperThumbnail_4x.yml:

python ./run.py -opt ./options/train/4x/HyperThumbnail_4x.yml

Tip: see BasicSR for detailed documentation of the training code.

Inference

We provide a 4x rescaling test config in options/test/4x/HyperThumbnail_4x_test.yml. You can start testing with:

python ./run_test.py -opt ./options/test/4x/HyperThumbnail_4x_test.yml

Export the JPEG HyperThumbnail

To export a .jpg HyperThumbnail, install the TorchJPEG (Ehrlich et al.) package.
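Once a .jpg thumbnail is on disk, its rate can be checked from the file size alone. The following is a minimal sketch using only the Python standard library. The helper names `thumbnail_size` and `bits_per_pixel`, the example path `thumbnail.jpg`, and the 6144x3072 frame size are all hypothetical, and normalizing the bit count by the number of HR pixels is an assumption rather than something this README specifies.

```python
import os
from typing import Tuple


def thumbnail_size(hr_width: int, hr_height: int, scale: int = 4) -> Tuple[int, int]:
    """Return (width, height) of the LR thumbnail for an integer rescaling factor."""
    if hr_width % scale or hr_height % scale:
        raise ValueError("HR dimensions are assumed to be divisible by the scale")
    return hr_width // scale, hr_height // scale


def bits_per_pixel(jpeg_path: str, hr_width: int, hr_height: int) -> float:
    """File size in bits divided by the HR pixel count (assumed normalization)."""
    return os.path.getsize(jpeg_path) * 8 / (hr_width * hr_height)


if __name__ == "__main__":
    # Hypothetical 6K frame; the README does not state exact dimensions.
    width, height = 6144, 3072
    print(thumbnail_size(width, height, scale=4))  # (1536, 768)

    # Hypothetical output path; only measured if the file actually exists.
    path = "thumbnail.jpg"
    if os.path.exists(path):
        print(f"{bits_per_pixel(path, width, height):.4f} bpp")
```

Comparing this number across checkpoints or settings gives a quick view of the rate side of the rate-distortion trade-off; distortion still has to be measured separately on the reconstructed HR image.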

📍 Citation

@inproceedings{qi2023hyperthumbnail,
    author    = {Qi, Chenyang and Yang, Xin and Cheng, Ka Leong and Chen, Ying-Cong and Chen, Qifeng},
    title     = {Real-Time 6K Image Rescaling With Rate-Distortion Optimization},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2023},
}

Acknowledgement

  • We build our training and testing code base on the BasicSR toolbox. We are truly grateful for their outstanding work and contributions to the field.
  • We thank TorchJPEG for the JPEG extension for PyTorch that interfaces with libjpeg to allow for manipulation of low-level JPEG data.
  • We thank CompressAI for the implementation of the entropy model.