
[ECCV 2022] Ghost-free High Dynamic Range Imaging with Context-aware Transformer

By Zhen Liu¹, Yinglong Wang², Bing Zeng³ and Shuaicheng Liu³,¹*

¹Megvii Technology, ²Noah’s Ark Lab, Huawei Technologies, ³University of Electronic Science and Technology of China

This is the official PyTorch implementation of our ECCV2022 paper: Ghost-free High Dynamic Range Imaging with Context-aware Transformer (HDR-Transformer). The MegEngine version is available at HDR-Transformer-MegEngine.

News

  • 2022.08.26 The PyTorch implementation of our paper is now available.
  • 2022.07.04 Our paper has been accepted by ECCV 2022.

Abstract

High dynamic range (HDR) deghosting algorithms aim to generate ghost-free HDR images with realistic details. Restricted by the locality of the receptive field, existing CNN-based methods are typically prone to producing ghosting artifacts and intensity distortions in the presence of large motion and severe saturation. In this paper, we propose a novel Context-Aware Vision Transformer (CA-ViT) for ghost-free high dynamic range imaging. The CA-ViT is designed as a dual-branch architecture, which can jointly capture both global and local dependencies. Specifically, the global branch employs a window-based Transformer encoder to model long-range object movements and intensity variations to solve ghosting. For the local branch, we design a local context extractor (LCE) to capture short-range image features and use the channel attention mechanism to select informative local details across the extracted features to complement the global branch. By incorporating the CA-ViT as basic components, we further build the HDR-Transformer, a hierarchical network to reconstruct high-quality ghost-free HDR images. Extensive experiments on three benchmark datasets show that our approach outperforms state-of-the-art methods qualitatively and quantitatively with considerably reduced computational budgets.

Pipeline

[Pipeline figure] Illustration of the proposed CA-ViT. As shown in Fig. (a), the CA-ViT is designed as a dual-branch architecture where the global branch models long-range dependency among image contexts through a multi-head Transformer encoder, and the local branch explores both intra-frame local details and inner-frame feature relationship through a local context extractor. Fig. (b) depicts the key insight of our HDR deghosting approach with CA-ViT. To remove the residual ghosting artifacts caused by large motions of the hand (marked with blue), long-range contexts (marked with red), which are required to hallucinate reasonable content in the ghosting area, are modeled by the self-attention in the global branch. Meanwhile, the well-exposed non-occluded local regions (marked with green) can be effectively extracted with convolutional layers and fused by the channel attention in the local branch.
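To make the dual-branch design concrete, here is a minimal PyTorch sketch of a CA-ViT-style block, provided for illustration only: the class names (CAViTBlock, LocalContextExtractor), the plain multi-head attention (the paper uses window-based attention), and the hyper-parameters are assumptions, not the exact modules in this repository.

import torch
import torch.nn as nn

class LocalContextExtractor(nn.Module):
    """Local branch: convolutions for short-range features + channel attention."""
    def __init__(self, dim, reduction=8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, padding=1),
        )
        # Squeeze-and-excitation style channel attention selects informative local details.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(dim, dim // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim // reduction, dim, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):  # x: (B, C, H, W)
        feat = self.conv(x)
        return feat * self.attn(feat)

class CAViTBlock(nn.Module):
    """Dual-branch block: global self-attention plus local context extraction."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.global_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.local = LocalContextExtractor(dim)

    def forward(self, x, h, w):  # x: (B, N, C) token sequence with N = h * w
        # Global branch: long-range dependencies via self-attention.
        g = self.norm(x)
        g, _ = self.global_attn(g, g, g)
        # Local branch: reshape the same tokens back to a feature map and run the LCE.
        l = x.transpose(1, 2).reshape(x.size(0), -1, h, w)
        l = self.local(l).flatten(2).transpose(1, 2)
        # Fuse both branches with a residual connection.
        return x + g + l

The design choice mirrored here is that both branches see the same tokens and their outputs are fused additively, so the local details complement rather than replace the globally attended context.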

Usage

Requirements

  • Python 3.7.13
  • PyTorch 1.9.0
  • Torchvision 0.10.0
  • CUDA 10.2 on Ubuntu 18.04

Install the required dependencies:

conda create -n hdr_transformer_pytorch python=3.7
conda activate hdr_transformer_pytorch
pip install -r requirements.txt

Dataset

  1. Download the dataset (including the training and test sets) from Kalantari17's dataset
  2. Move the dataset to ./data and reorganize the directories as follows:
./data/Training
|--001
|  |--262A0898.tif
|  |--262A0899.tif
|  |--262A0900.tif
|  |--exposure.txt
|  |--HDRImg.hdr
|--002
...
./data/Test (includes 15 scenes from `EXTRA` and `PAPER`)
|--001
|  |--262A2615.tif
|  |--262A2616.tif
|  |--262A2617.tif
|  |--exposure.txt
|  |--HDRImg.hdr
...
|--BarbequeDay
|  |--262A2943.tif
|  |--262A2944.tif
|  |--262A2945.tif
|  |--exposure.txt
|  |--HDRImg.hdr
...
  3. Prepare the cropped training set by running:
cd ./dataset
python gen_crop_data.py
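Given the layout above, each scene consists of three LDR exposures, an exposure.txt file, and an HDR ground truth. The snippet below is a hedged sketch of this loading step; the exact preprocessing lives in the repository's dataset code, and the gamma value, 16-bit normalization, and the helper name read_scene are assumptions for illustration.

import glob
import os
import cv2
import numpy as np

def read_scene(scene_dir, gamma=2.2):
    """Load the three LDR exposures and the HDR ground truth of one scene."""
    ldr_paths = sorted(glob.glob(os.path.join(scene_dir, "*.tif")))
    exposures = np.loadtxt(os.path.join(scene_dir, "exposure.txt"))  # EV values

    ldrs, mapped = [], []
    for path, ev in zip(ldr_paths, exposures):
        # 16-bit TIFF -> float32 in [0, 1], BGR -> RGB.
        img = (cv2.imread(path, cv2.IMREAD_UNCHANGED)[:, :, ::-1] / 65535.0).astype(np.float32)
        ldrs.append(img)
        # Map each LDR into the linear HDR domain: (L ** gamma) / exposure time.
        mapped.append((img ** gamma) / (2.0 ** ev))

    hdr = cv2.imread(os.path.join(scene_dir, "HDRImg.hdr"),
                     cv2.IMREAD_UNCHANGED)[:, :, ::-1].astype(np.float32)
    return ldrs, mapped, hdr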

Training & Evaluation

To train the model, run:

python train.py

To evaluate, we provide a script for testing with limited GPU memory, which splits each full-size image into several patches and then merges the outputs into the final result.

python test.py --pretrained_model ./checkpoints/pretrained_model.pth  --save_results --save_dir ./results/hdr_transformer
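The idea behind the limited-memory testing is simple tiling: run the network patch by patch and stitch the outputs. Below is a simplified sketch of that idea; the actual test.py handles overlapping patches and boundary blending, and the single stacked input tensor, patch size, and function name tiled_inference are assumptions for illustration.

import torch

@torch.no_grad()
def tiled_inference(model, inputs, patch=256):
    """Run `model` over non-overlapping patches of `inputs` (B, C, H, W)."""
    _, _, h, w = inputs.shape
    # Assume the network outputs a 3-channel HDR image of the same spatial size.
    output = torch.zeros(inputs.size(0), 3, h, w, device=inputs.device)
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            bottom, right = min(top + patch, h), min(left + patch, w)
            tile = inputs[:, :, top:bottom, left:right]
            output[:, :, top:bottom, left:right] = model(tile)
    return output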

Note: The pretrained weights were obtained with the reorganized code; the resulting PSNR values are slightly lower, and the SSIM values slightly higher, than those reported in our paper. Feel free to use either for comparison.

Results

[Results figure]

Acknowledgement

Our work is inspired by the following works and uses parts of their official implementations:

We thank the respective authors for open sourcing their methods.

Citation

@inproceedings{liu2022ghost,
  title={Ghost-free High Dynamic Range Imaging with Context-aware Transformer},
  author={Liu, Zhen and Wang, Yinglong and Zeng, Bing and Liu, Shuaicheng},
  booktitle={European Conference on Computer Vision},
  pages={344--360},
  year={2022},
  organization={Springer}
}

Contact

If you have any questions, feel free to contact Zhen Liu at liuzhen03@megvii.com.