Multimodal Action Recognition on the MECCANO Dataset (ICIAP Competition with Prize!)
This is the official GitHub repository for the MECCANO Dataset.
MECCANO is a multimodal dataset of egocentric videos for studying human behavior understanding in industrial-like settings. It is multimodal in that gaze signals, depth maps and RGB videos were acquired simultaneously with a custom headset. You can download the MECCANO dataset and its annotations from the project web page.
To use the MECCANO Dataset in PySlowFast, please follow the instructions below:
- Install PySlowFast following the official instructions;
- Download the PySlowFast_files folder from this repository;
- Place the files "__init__.py", "meccano.py" and "sampling.py" in your slowfast/datasets/ folder;
- Place the files "__init__.py" and "custom_video_model_builder_MECCANO_gaze.py" in your slowfast/models/ folder (to use the gaze signal).
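The copy steps above can be scripted. The following is a minimal sketch, not part of the repository: the helper `install_files` is hypothetical, and the `datasets/` and `models/` subfolders assumed inside PySlowFast_files are a guess at the layout, so adjust the paths to match your download. Since the provided "__init__.py" files share a name with the ones already in PySlowFast, you may want to back up the existing files first.

```python
import shutil
from pathlib import Path


def install_files(file_map, dry_run=True):
    """Copy each source file into its destination folder.

    file_map maps a source file path to a destination directory.
    Returns the destination paths that were (or, in a dry run, would be) written.
    """
    written = []
    for src, dst_dir in file_map.items():
        src, dst_dir = Path(src), Path(dst_dir)
        target = dst_dir / src.name
        if not dry_run:
            if not src.is_file():
                raise FileNotFoundError(f"missing source file: {src}")
            dst_dir.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, target)
        written.append(target)
    return written


if __name__ == "__main__":
    # ASSUMPTION: both roots and the subfolder names under PySlowFast_files
    # are placeholders; edit them to match your checkout.
    files_root = Path("PySlowFast_files")
    slowfast_root = Path("slowfast")
    file_map = {
        files_root / "datasets" / "__init__.py": slowfast_root / "datasets",
        files_root / "datasets" / "meccano.py": slowfast_root / "datasets",
        files_root / "datasets" / "sampling.py": slowfast_root / "datasets",
        files_root / "models" / "__init__.py": slowfast_root / "models",
        files_root / "models" / "custom_video_model_builder_MECCANO_gaze.py": slowfast_root / "models",
    }
    # dry_run=True only lists the targets; set it to False to actually copy.
    for target in install_files(file_map, dry_run=True):
        print(target)
```

Running the script as is only prints the target paths, which makes it easy to check the mapping before anything is overwritten.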
Now, run the training/test with:
python tools/run_net.py --cfg path_to_your_config_file --[optional flags]
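As an illustration, the command above can be assembled from Python. This is a sketch under stated assumptions: `build_run_command` is a hypothetical helper, the config path is one of those listed in this README, and the trailing `KEY VALUE` overrides (`TRAIN.ENABLE`, `TEST.ENABLE`, `TEST.CHECKPOINT_FILE_PATH`) follow the PySlowFast getting-started guide rather than anything specified here, so check them against your PySlowFast version.

```python
import shlex
import sys


def build_run_command(cfg_path, overrides=None, python=sys.executable):
    """Build the argument list for tools/run_net.py.

    overrides is an optional dict of config keys to values; each pair is
    appended after --cfg as "KEY VALUE", the form PySlowFast uses for
    command-line config overrides.
    """
    cmd = [python, "tools/run_net.py", "--cfg", str(cfg_path)]
    for key, value in (overrides or {}).items():
        cmd += [key, str(value)]
    return cmd


if __name__ == "__main__":
    # ASSUMPTION: the checkpoint path is a placeholder; the override keys
    # come from the PySlowFast documentation, not from this repository.
    cmd = build_run_command(
        "configs/action_recognition/SLOWFAST_8x8_R50_MECCANO.yaml",
        overrides={
            "TRAIN.ENABLE": False,
            "TEST.ENABLE": True,
            "TEST.CHECKPOINT_FILE_PATH": "path_to_your_checkpoint",
        },
    )
    # Print the command for inspection; pass cmd to subprocess.run(cmd, check=True)
    # from the PySlowFast root directory to actually launch it.
    print(shlex.join(cmd))
```

Building the command as a list avoids shell quoting problems when paths contain spaces.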
We provide pre-extracted features of the MECCANO Dataset:
- RGB features extracted with SlowFast: [coming soon]
We provide pretrained models on the MECCANO Dataset for the action recognition task (only for the first version of the dataset):
architecture | depth | model | config |
---|---|---|---|
I3D | R50 | link | configs/action_recognition/I3D_8x8_R50.yaml |
SlowFast | R50 | link | configs/action_recognition/SLOWFAST_8x8_R50.yaml |
We provide pretrained models on the MECCANO Multimodal Dataset for the action recognition task:
architecture | depth | modality | model | config |
---|---|---|---|---|
SlowFast | R50 | RGB | link | configs/action_recognition/SLOWFAST_8x8_R50_MECCANO.yaml |
SlowFast | R50 | Depth | link | configs/action_recognition/SLOWFAST_8x8_R50_MECCANO.yaml |
If you find the MECCANO Dataset useful in your research, please use the following BibTeX entry for citation.
@misc{ragusa2022meccano,
title={MECCANO: A Multimodal Egocentric Dataset for Humans Behavior Understanding in the Industrial-like Domain},
author={Francesco Ragusa and Antonino Furnari and Giovanni Maria Farinella},
year={2022},
eprint={2209.08691},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Additionally, cite the original paper:
@inproceedings{ragusa2021meccano,
title = {The MECCANO Dataset: Understanding Human-Object Interactions from Egocentric Videos in an Industrial-like Domain},
author = {Francesco Ragusa and Antonino Furnari and Salvatore Livatino and Giovanni Maria Farinella},
year = {2021},
eprint = {2010.05654},
booktitle = {IEEE Winter Conference on Applications of Computer Vision (WACV)}
}