Wanyan Xu, Xingbo Dong, Lan Ma, Andrew Beng Jin Teoh, and Zhixian Lin
Abstract: Low-light image enhancement plays a central role in various downstream computer vision tasks. Vision Transformers (ViTs) have recently been adapted for low-level image processing and have achieved promising performance. However, ViTs process images in a window- or patch-based manner, which compromises their computational efficiency and long-range dependency modeling. Additionally, existing ViTs process RGB images rather than RAW sensor data, which is sub-optimal for exploiting the rich information in RAW data. We propose a fully end-to-end Conv-Transformer-based model, RawFormer, that directly utilizes RAW data for low-light image enhancement. RawFormer has a structure similar to that of U-Net, but it is integrated with a thoughtfully designed Conv-Transformer Fusing (CTF) block. The CTF block combines local attention and transposed self-attention mechanisms in one module, reducing computational overhead by adopting a transposed self-attention operation. Experiments demonstrate that RawFormer outperforms state-of-the-art models by a significant margin on low-light RAW image enhancement tasks.
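To illustrate why transposed self-attention reduces computational overhead, here is a minimal NumPy sketch of the general idea (attention computed across channels instead of spatial positions, as popularized by Restormer). This is a simplified illustration, not the CTF block from this repository: the single-head form, the L2 normalization, and the absence of learned projections are all assumptions made for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def transposed_self_attention(x):
    """Channel-wise ("transposed") self-attention sketch.

    x: array of shape (C, N), a feature map with C channels
       flattened over N = H * W spatial positions.

    Standard spatial self-attention builds an (N, N) attention map,
    so its cost grows quadratically with image resolution. Attending
    across channels instead yields a (C, C) map, so the cost scales
    linearly with N -- the key efficiency gain.
    """
    # For illustration, x serves as query, key, and value directly;
    # a real block would first apply learned 1x1 / depthwise convs.
    q = x / np.linalg.norm(x, axis=1, keepdims=True)
    k = x / np.linalg.norm(x, axis=1, keepdims=True)
    attn = softmax(q @ k.T, axis=-1)   # (C, C) channel attention map
    return attn @ x                    # (C, N) re-weighted features
```

For a 256x256 feature map with 64 channels, the attention map here is only 64x64 instead of 65536x65536, which is what makes this formulation practical at full image resolution.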
This project is built with Python 3.8, PyTorch 1.12, and CUDA 11.6. For the other Python package dependencies, run:
pip install -r requirements.txt
Download the pretrained models into the corresponding weights folders under the result folder.
Download the SID (Sony part) and MCR datasets, and place them under the RawFormer directory.
The folders should be organized as:
- RawFormer
  - Sony (SID dataset, Sony part)
  - Mono_Colored_RAW_Paired_DATASET (MCR dataset)
  - result
    - MCR
    - SID
  - figs
  - ......
To train and evaluate RawFormer on SID or MCR, set the options in train.py or test.py, and run:
python train.py
python test.py
The PSNR-FLOPs-Params comparison on the SID dataset:
If you find this project useful in your research, please consider citing:
@article{xu2022rawformer,
title={RawFormer: An Efficient Vision Transformer for Low-Light RAW Image Enhancement},
author={Xu, Wanyan and Dong, Xingbo and Ma, Lan and Teoh, Andrew Beng Jin and Lin, Zhixian},
journal={IEEE Signal Processing Letters},
volume={29},
pages={2677--2681},
year={2022},
publisher={IEEE}
}
Acknowledgment: This code is based on Uformer and Restormer.
If you have any questions or suggestions, please contact xuwanyan98@163.com or xingbod@gmail.com.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.