Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation (WACV 2023)

This repository contains the official source code and data for our AFFT paper. If you find our code or paper useful, please consider citing:

Z. Zhong, D. Schneider, M. Voit, R. Stiefelhagen and J. Beyerer. Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation. In WACV, 2023.

@InProceedings{Zhong_2023_WACV,
    author    = {Zhong, Zeyun and Schneider, David and Voit, Michael and Stiefelhagen, Rainer and Beyerer, J\"urgen},
    title     = {Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2023},
    pages     = {6068-6077}
}

Installation

First clone the repo and set up the required packages in a conda environment (Python 3.7).

$ git clone https://github.com/zeyun-zhong/AFFT.git
$ cd AFFT
$ conda env create -f environment.yml
$ conda activate afft

Download Data

Dataset features

AFFT works on pre-extracted features, so you need to download them first. TSN features for EpicKitchens-100 (EK100) and for EGTEA Gaze+ can be downloaded from RULSTM. The RGB-Swin features are available here and audio features are available here.

Please make sure that your data structure follows the structure shown below. Note that dataset_root_dir in config.yaml should be changed to your specific data path.

Dataset root path (e.g., /home/user/datasets)
├── epickitchens100
│   └── features
│       ├── rgb
│       │   └── data.mdb
│       ├── rgb_omnivore
│       │   └── data.mdb
│       ├── obj
│       │   └── data.mdb
│       ├── audio
│       │   └── data.mdb
│       └── flow
│           └── data.mdb
└── egtea
    └── features
        ├── TSN-C_3_egtea_action_CE_s1_rgb_model_best_fcfull_hd
        │   └── data.mdb
        ├── TSN-C_3_egtea_action_CE_s1_flow_model_best_fcfull_hd
        │   └── data.mdb
        ├── TSN-C_3_egtea_action_CE_s2_rgb_model_best_fcfull_hd
        │   └── data.mdb
        ├── TSN-C_3_egtea_action_CE_s2_flow_model_best_fcfull_hd
        │   └── data.mdb
        ├── TSN-C_3_egtea_action_CE_s3_rgb_model_best_fcfull_hd
        │   └── data.mdb
        └── TSN-C_3_egtea_action_CE_s3_flow_model_best_fcfull_hd
            └── data.mdb

If you organize the data differently, you will need to edit rulstm_feats_dir in EK100-common and EGTEA-common accordingly.
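
Before training, it can help to confirm that the feature files are where the tree above places them. The following is a minimal sketch, not part of this repository: missing_feature_files is a hypothetical helper, and the directory names are simply copied from the tree. It only checks that each data.mdb file exists and does not open the LMDB databases.

```python
from pathlib import Path

# Feature directory names copied from the layout shown above.
EK100_FEATURES = ["rgb", "rgb_omnivore", "obj", "audio", "flow"]
EGTEA_FEATURES = [
    f"TSN-C_3_egtea_action_CE_s{split}_{modality}_model_best_fcfull_hd"
    for split in (1, 2, 3)
    for modality in ("rgb", "flow")
]


def missing_feature_files(dataset_root):
    """Return the expected data.mdb paths that are absent under dataset_root.

    Hypothetical helper for illustration; it assumes the default layout.
    """
    root = Path(dataset_root)
    expected = [
        root / "epickitchens100" / "features" / name / "data.mdb"
        for name in EK100_FEATURES
    ]
    expected += [
        root / "egtea" / "features" / name / "data.mdb"
        for name in EGTEA_FEATURES
    ]
    return [path for path in expected if not path.is_file()]


# Example: list whatever is missing under the example root path.
for path in missing_feature_files("/home/user/datasets"):
    print(f"missing: {path}")
```

An empty result means every file in the default layout was found. If you only use one dataset or a subset of modalities, the corresponding entries can be dropped from the two lists.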

Model Zoo

Dataset	Modalities	Performance (Actions)	Config	Model
EK100	R-Swin, O, AU, F	18.5 (MT5R)	`expts/01_SA-Fuser_ek100_val_Swin.txt`	link
EK100	R-TSN, O, AU, F	17.0 (MT5R)	`expts/01_SA-Fuser_ek100_val_TSN.txt`	link
EK100	R-TSN, O, F	16.4 (MT5R)	`expts/01_SA-Fuser_ek100_val_TSN_wo_audio.txt`	link
EGTEA	RGB-TSN, Flow	42.5 (Top-1)	`expts/02_ek100_avt_tsn.txt`	link
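
The EK100 numbers are reported as MT5R, read here as mean top-5 recall: top-5 recall is computed separately for each action class and then averaged over classes, so rare classes count as much as frequent ones. The sketch below illustrates that computation in plain Python. The function name mean_topk_recall is hypothetical, and the official EK100 evaluation may differ in details such as which classes are included.

```python
from collections import defaultdict


def mean_topk_recall(scores, labels, k=5):
    """Class-mean top-k recall (illustrative sketch, not the official scorer).

    scores: list of per-sample score lists, one score per class.
    labels: list of ground-truth class indices.
    Only classes that occur in labels contribute to the mean.
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for sample_scores, label in zip(scores, labels):
        # Indices of the k highest-scoring classes for this sample.
        ranked = sorted(
            range(len(sample_scores)),
            key=lambda c: sample_scores[c],
            reverse=True,
        )
        counts[label] += 1
        hits[label] += int(label in ranked[:k])
    # Average the per-class recalls, not the per-sample hits.
    return sum(hits[c] / counts[c] for c in counts) / len(counts)


scores = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0], [0.1, 0.2, 0.7]]
labels = [0, 1, 2]
print(mean_topk_recall(scores, labels, k=1))  # classes 0 and 2 are hit, class 1 is missed
print(mean_topk_recall(scores, labels, k=2))  # every class is within the top 2
```

Top-1 accuracy, used for EGTEA in the table, is the simpler per-sample measure: the fraction of samples whose highest-scoring class matches the label.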

Training

Recall that dataset_root_dir in config.yaml should be changed to your specific path.

EpicKitchens-100

python run.py -c expts/01_SA-Fuser_ek100_train.txt --mode train --nproc_per_node 2

EGTEA Gaze+

python run.py -c expts/06_SA-Fuser_egtea_train.txt --mode train --nproc_per_node 2

Validation

EpicKitchens-100

python run.py -c expts/01_SA-Fuser_ek100_val_TSN_wo_audio.txt --mode test --nproc_per_node 1

EGTEA Gaze+

python run.py -c expts/06_SA-Fuser_egtea_val.txt --mode test --nproc_per_node 1

Test / Challenge (EK100)

# save logits
python run.py -c expts/01_SA-Fuser_ek100_test_TSN_wo_audio.txt --mode test --nproc_per_node 1

# generate test / challenge file
python challenge.py --prefix_h5 test --models fusion_ek100_tsn_wo_audio_4h_18s --weights 1.
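
The --models and --weights flags take one entry per model, which suggests that saved logits from several models can be combined by a weighted sum before the submission file is written. That reading is an assumption, not a description of what challenge.py does internally. Under that assumption, the fusion step could be sketched as follows; fuse_logits is a hypothetical function and the nested-list format is chosen only for illustration.

```python
def fuse_logits(model_logits, weights):
    """Weighted sum of per-model logits (illustrative sketch).

    model_logits: one entry per model; each entry is a list of per-sample
        logit lists, with samples and classes in the same order across models.
    weights: one float per model, in the same order as model_logits.
    """
    if len(model_logits) != len(weights):
        raise ValueError("need exactly one weight per model")
    fused = []
    for sample_rows in zip(*model_logits):
        # sample_rows holds this sample's logits from every model;
        # zip(*sample_rows) walks over classes, giving one value per model.
        fused.append(
            [sum(w * v for w, v in zip(weights, per_class)) for per_class in zip(*sample_rows)]
        )
    return fused


# Two hypothetical models, one sample, two classes.
model_a = [[1.0, 2.0]]
model_b = [[3.0, 4.0]]
print(fuse_logits([model_a, model_b], [1.0, 0.5]))  # [[2.5, 4.0]]
```

With a single model and a weight of 1., as in the command above, such a fusion leaves the logits unchanged.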

License

This codebase is released under the license terms specified in the LICENSE file. Any imported libraries, datasets, or other code follow the license terms set by their respective authors.

Acknowledgements

Many thanks to Rohit Girdhar and Antonino Furnari for providing their code and data.
