This project is the official implementation of our paper POSTER: A Pyramid Cross-Fusion Transformer Network for Facial Expression Recognition.
- Create a conda environment (we provide requirements.txt).
- Data Preparation
  Download the RAF-DB dataset and make sure it has the following structure:
  - data/raf-basic/
       EmoLabel/
           list_patition_label.txt
       Image/aligned/
           train_00001_aligned.jpg
           test_0001_aligned.jpg
           ...
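To confirm the layout before training, a small check along these lines may help. It is a sketch, not part of the repository: the function name `check_rafdb_layout` is hypothetical, and it assumes each line of list_patition_label.txt holds an image name and a label separated by whitespace (for example `train_00001.jpg 5`), with the aligned copy stored as `<stem>_aligned.jpg`.

```python
from pathlib import Path


def check_rafdb_layout(root="data/raf-basic"):
    """Return aligned image paths named in the label file but missing on disk.

    Assumptions (not confirmed by the repository): each non-empty line is
    '<image name> <label>', and the aligned copy of 'train_00001.jpg' is
    'Image/aligned/train_00001_aligned.jpg'.
    """
    root = Path(root)
    label_file = root / "EmoLabel" / "list_patition_label.txt"
    aligned_dir = root / "Image" / "aligned"
    if not label_file.is_file():
        raise FileNotFoundError(f"label file not found: {label_file}")

    missing = []
    for line in label_file.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        name = line.split()[0]
        aligned = aligned_dir / f"{Path(name).stem}_aligned.jpg"
        if not aligned.is_file():
            missing.append(aligned)
    return missing
```

Called from the repository root, `check_rafdb_layout()` returns an empty list when every listed image is present.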
- Pretrained model weights
  Download the pretrained weights (image backbone and landmark backbone) from here. Put the entire pretrain folder under the models folder:
  - models/pretrain/
       ir50.pth
       mobilefacenet_model_best.pth.tar
       ...
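A quick way to verify that the weights landed in the right place is sketched below. The helper name `missing_pretrain_weights` is hypothetical; the two file names are the ones listed for models/pretrain/, and since that listing ends with "...", other files may be needed as well.

```python
from pathlib import Path

# File names taken from the models/pretrain/ listing; the listing is
# truncated with "...", so this tuple may be incomplete.
EXPECTED_WEIGHTS = ("ir50.pth", "mobilefacenet_model_best.pth.tar")


def missing_pretrain_weights(pretrain_dir="models/pretrain"):
    """Return the expected weight files that are absent from pretrain_dir."""
    pretrain_dir = Path(pretrain_dir)
    return [name for name in EXPECTED_WEIGHTS
            if not (pretrain_dir / name).is_file()]
```

An empty list from `missing_pretrain_weights()` means both listed files were found.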
Our best model can be downloaded from here; put it under the checkpoint folder. You can evaluate our model on the RAF-DB dataset by running:
python test.py --checkpoint checkpoint/rafdb_best.pth -p
Train on RAF-DB dataset:
python train.py --gpu 0,1 --batch_size 200
You may adjust batch_size based on the number of GPUs you have. A larger batch size usually gives higher performance. We provide the training log in the log folder. You may need to run training several times to get the best results.
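As a rough illustration of adjusting batch_size to the GPU count: the training command above uses 2 GPUs with a batch size of 200. Assuming the batch is divided evenly across the listed GPUs (as it would be under `torch.nn.DataParallel`; whether train.py does this is an assumption), that is 100 samples per GPU. The hypothetical helper below keeps that per-GPU load constant and prints a matching command.

```python
def scaled_batch_size(gpu_ids, ref_gpus=2, ref_batch_size=200):
    """Scale batch size linearly with GPU count, keeping per-GPU load fixed.

    The reference point (2 GPUs, batch size 200) comes from the documented
    training command; the even per-GPU split is an assumption.
    """
    n_gpus = len([g for g in gpu_ids.split(",") if g.strip()])
    if n_gpus == 0:
        raise ValueError("at least one GPU id is required")
    per_gpu = ref_batch_size // ref_gpus
    return per_gpu * n_gpus


gpus = "0,1,2,3"
print(f"python train.py --gpu {gpus} --batch_size {scaled_batch_size(gpus)}")
```

With fewer GPUs this yields a smaller batch size, which, as noted, may cost some accuracy; treat the result as a starting point that fits in memory rather than a tuned setting.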
Our research code is released under the MIT license. See LICENSE for details.
If you find our work useful in your research, please consider citing:
@article{zheng2022poster,
title={Poster: A pyramid cross-fusion transformer network for facial expression recognition},
author={Zheng, Ce and Mendieta, Matias and Chen, Chen},
journal={arXiv preprint arXiv:2204.04083},
year={2022}
}
Our implementation and experiments are built on top of open-source GitHub repositories. We thank all the authors who made their code public, which tremendously accelerated our progress. If you find these works helpful, please consider citing them as well.