
[ECCV 2022] Ghost-free High Dynamic Range Imaging with Context-aware Transformer

By Zhen Liu¹, Yinglong Wang², Bing Zeng³ and Shuaicheng Liu³,¹*

¹Megvii Technology, ²Noah’s Ark Lab, Huawei Technologies, ³University of Electronic Science and Technology of China

This is the official PyTorch implementation of our ECCV2022 paper: Ghost-free High Dynamic Range Imaging with Context-aware Transformer (HDR-Transformer). The MegEngine version is available at HDR-Transformer-MegEngine.

News

  • 2022.08.26 The PyTorch implementation of our paper is now available.
  • 2022.07.04 Our paper has been accepted by ECCV 2022.

Abstract

High dynamic range (HDR) deghosting algorithms aim to generate ghost-free HDR images with realistic details. Restricted by the locality of the receptive field, existing CNN-based methods are typically prone to producing ghosting artifacts and intensity distortions in the presence of large motion and severe saturation. In this paper, we propose a novel Context-Aware Vision Transformer (CA-ViT) for ghost-free high dynamic range imaging. The CA-ViT is designed as a dual-branch architecture, which can jointly capture both global and local dependencies. Specifically, the global branch employs a window-based Transformer encoder to model long-range object movements and intensity variations to solve ghosting. For the local branch, we design a local context extractor (LCE) to capture short-range image features and use the channel attention mechanism to select informative local details across the extracted features to complement the global branch. By incorporating the CA-ViT as basic components, we further build the HDR-Transformer, a hierarchical network to reconstruct high-quality ghost-free HDR images. Extensive experiments on three benchmark datasets show that our approach outperforms state-of-the-art methods qualitatively and quantitatively with considerably reduced computational budgets.

Pipeline

[Pipeline figure] Illustration of the proposed CA-ViT. As shown in Fig. (a), the CA-ViT is designed as a dual-branch architecture where the global branch models long-range dependency among image contexts through a multi-head Transformer encoder, and the local branch explores both intra-frame local details and inner-frame feature relationship through a local context extractor. Fig. (b) depicts the key insight of our HDR deghosting approach with CA-ViT. To remove the residual ghosting artifacts caused by large motions of the hand (marked with blue), long-range contexts (marked with red), which are required to hallucinate reasonable content in the ghosting area, are modeled by the self-attention in the global branch. Meanwhile, the well-exposed non-occluded local regions (marked with green) can be effectively extracted with convolutional layers and fused by the channel attention in the local branch.
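To make the dual-branch design concrete, here is a minimal PyTorch sketch of a CA-ViT-style block, provided for illustration only: the class names (CAViTBlock, LocalContextExtractor), the plain multi-head attention (the paper uses window-based attention), and the hyper-parameters are assumptions, not the exact modules in this repository.

import torch
import torch.nn as nn

class LocalContextExtractor(nn.Module):
    """Local branch: convolutions for short-range features + channel attention."""
    def __init__(self, dim, reduction=8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, padding=1),
        )
        # Squeeze-and-excitation style channel attention selects informative local details.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(dim, dim // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim // reduction, dim, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):  # x: (B, C, H, W)
        feat = self.conv(x)
        return feat * self.attn(feat)

class CAViTBlock(nn.Module):
    """Dual-branch block: global self-attention plus local context extraction."""
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.global_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.local = LocalContextExtractor(dim)

    def forward(self, x, h, w):  # x: (B, N, C) token sequence with N = h * w
        # Global branch: long-range dependencies via self-attention.
        g = self.norm(x)
        g, _ = self.global_attn(g, g, g)
        # Local branch: reshape the same tokens back to a feature map and run the LCE.
        l = x.transpose(1, 2).reshape(x.size(0), -1, h, w)
        l = self.local(l).flatten(2).transpose(1, 2)
        # Fuse both branches with a residual connection.
        return x + g + l

The design choice mirrored here is that both branches see the same tokens and their outputs are fused additively, so the local details complement rather than replace the globally attended context.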

Usage

Requirements

  • Python 3.7.13
  • PyTorch 1.9.0
  • Torchvision 0.10.0
  • CUDA 10.2 on Ubuntu 18.04

Install the required dependencies:

conda create -n hdr_transformer_pytorch python=3.7
conda activate hdr_transformer_pytorch
pip install -r requirements.txt

Dataset

  1. Download the dataset (including the training and test sets) from Kalantari17's dataset
  2. Move the dataset to ./data and reorganize the directories as follows:
./data/Training
|--001
|  |--262A0898.tif
|  |--262A0899.tif
|  |--262A0900.tif
|  |--exposure.txt
|  |--HDRImg.hdr
|--002
...
./data/Test (includes 15 scenes from `EXTRA` and `PAPER`)
|--001
|  |--262A2615.tif
|  |--262A2616.tif
|  |--262A2617.tif
|  |--exposure.txt
|  |--HDRImg.hdr
...
|--BarbequeDay
|  |--262A2943.tif
|  |--262A2944.tif
|  |--262A2945.tif
|  |--exposure.txt
|  |--HDRImg.hdr
...
  3. Prepare the cropped training set by running:
cd ./dataset
python gen_crop_data.py
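Given the layout above, each scene consists of three LDR exposures, an exposure.txt file, and an HDR ground truth. The snippet below is a hedged sketch of this loading step; the exact preprocessing lives in the repository's dataset code, and the gamma value, 16-bit normalization, and the helper name read_scene are assumptions for illustration.

import glob
import os
import cv2
import numpy as np

def read_scene(scene_dir, gamma=2.2):
    """Load the three LDR exposures and the HDR ground truth of one scene."""
    ldr_paths = sorted(glob.glob(os.path.join(scene_dir, "*.tif")))
    exposures = np.loadtxt(os.path.join(scene_dir, "exposure.txt"))  # EV values

    ldrs, mapped = [], []
    for path, ev in zip(ldr_paths, exposures):
        # 16-bit TIFF -> float32 in [0, 1], BGR -> RGB.
        img = (cv2.imread(path, cv2.IMREAD_UNCHANGED)[:, :, ::-1] / 65535.0).astype(np.float32)
        ldrs.append(img)
        # Map each LDR into the linear HDR domain: (L ** gamma) / exposure time.
        mapped.append((img ** gamma) / (2.0 ** ev))

    hdr = cv2.imread(os.path.join(scene_dir, "HDRImg.hdr"),
                     cv2.IMREAD_UNCHANGED)[:, :, ::-1].astype(np.float32)
    return ldrs, mapped, hdr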

Training & Evaluation

To train the model, run:

python train.py

To evaluate, we provide a script for testing with limited GPU memory, which splits each full-size image into several patches and then merges the outputs into the final result.

python test.py --pretrained_model ./checkpoints/pretrained_model.pth  --save_results --save_dir ./results/hdr_transformer
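The idea behind the limited-memory testing is simple tiling: run the network patch by patch and stitch the outputs. Below is a simplified sketch of that idea; the actual test.py handles overlapping patches and boundary blending, and the single stacked input tensor, patch size, and function name tiled_inference are assumptions for illustration.

import torch

@torch.no_grad()
def tiled_inference(model, inputs, patch=256):
    """Run `model` over non-overlapping patches of `inputs` (B, C, H, W)."""
    _, _, h, w = inputs.shape
    # Assume the network outputs a 3-channel HDR image of the same spatial size.
    output = torch.zeros(inputs.size(0), 3, h, w, device=inputs.device)
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            bottom, right = min(top + patch, h), min(left + patch, w)
            tile = inputs[:, :, top:bottom, left:right]
            output[:, :, top:bottom, left:right] = model(tile)
    return output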

Note: The pretrained weights were obtained with the reorganized code; the resulting PSNR values are slightly lower, and the SSIM values slightly higher, than those reported in our paper. Feel free to use either for comparison.

Results

[Results figure]

Acknowledgement

Our work is inspired by the following works and uses parts of their official implementations:

We thank the respective authors for open sourcing their methods.

Citation

@inproceedings{liu2022ghost,
  title={Ghost-free High Dynamic Range Imaging with Context-aware Transformer},
  author={Liu, Zhen and Wang, Yinglong and Zeng, Bing and Liu, Shuaicheng},
  booktitle={European Conference on Computer Vision},
  pages={344--360},
  year={2022},
  organization={Springer}
}

Contact

If you have any questions, feel free to contact Zhen Liu at liuzhen03@megvii.com.