The hand detectors are trained on (1) the 100K images and (2) the 100K+ego images from the 100DOH dataset.
| Name | Data | Box AP | Model |
|------|------|--------|-------|
| Faster-RCNN X101-FPN | 100K | 90.32% | Google Drive |
| Faster-RCNN X101-FPN | 100K+ego | 90.46% | Google Drive |
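To run a downloaded checkpoint directly, detectron2's `DefaultPredictor` is the simplest entry point. A minimal sketch, assuming the repo's config file is on disk; the checkpoint and image paths are placeholders:

```python
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

# Build a config from the repo's training yaml and point it at the
# downloaded checkpoint. All paths below are placeholders.
cfg = get_cfg()
cfg.merge_from_file("faster_rcnn_X_101_32x8d_FPN_3x_100DOH.yaml")
cfg.MODEL.WEIGHTS = "path/to/model.pth"
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # keep detections above 0.5 confidence

predictor = DefaultPredictor(cfg)
image = cv2.imread("example.jpg")   # detectron2 expects BGR images, as cv2 loads them
outputs = predictor(image)

instances = outputs["instances"].to("cpu")
print(instances.pred_boxes)         # hand boxes in (x1, y1, x2, y2) format
print(instances.scores)             # per-box confidence scores
```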
- Set up the detectron2 environment as described in install.md. (If you train on your own copy of the data, see the dataset-registration sketch below.)
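detectron2 trains and evaluates on datasets registered under a name. `trainval_net.py` may handle this already; if you need to register the 100DOH annotations yourself (assuming they are in COCO format), a sketch with placeholder names and paths:

```python
from detectron2.data.datasets import register_coco_instances

# Placeholder dataset names, annotation files, and image roots; substitute
# the files from your local copy of 100DOH.
register_coco_instances("100DOH_hand_train", {},
                        "path/to/train_annotations.json", "path/to/train_images")
register_coco_instances("100DOH_hand_val", {},
                        "path/to/val_annotations.json", "path/to/val_images")
```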
Train (adjust `CUDA_VISIBLE_DEVICES` and `--num-gpus` to match your hardware):
```
CUDA_VISIBLE_DEVICES=4,5,6,7 python trainval_net.py --num-gpus 4 --config-file faster_rcnn_X_101_32x8d_FPN_3x_100DOH.yaml
```
Evaluate a trained checkpoint:
```
CUDA_VISIBLE_DEVICES=4,5,6,7 python trainval_net.py --num-gpus 4 --config-file faster_rcnn_X_101_32x8d_FPN_3x_100DOH.yaml --eval-only MODEL.WEIGHTS path/to/model.pth
```
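Evaluation can also be driven from Python with detectron2's `COCOEvaluator`, which produces the Box AP numbers reported above. A minimal sketch, assuming a registered validation split named `100DOH_hand_val` (see the registration sketch above) and placeholder paths:

```python
from detectron2.config import get_cfg
from detectron2.data import build_detection_test_loader
from detectron2.engine import DefaultPredictor
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

cfg = get_cfg()
cfg.merge_from_file("faster_rcnn_X_101_32x8d_FPN_3x_100DOH.yaml")
cfg.MODEL.WEIGHTS = "path/to/model.pth"  # placeholder checkpoint path

predictor = DefaultPredictor(cfg)
evaluator = COCOEvaluator("100DOH_hand_val", output_dir="./eval_out")
val_loader = build_detection_test_loader(cfg, "100DOH_hand_val")

# Runs the model over the split and accumulates COCO-style box AP.
results = inference_on_dataset(predictor.model, val_loader, evaluator)
print(results["bbox"]["AP"])
```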
Run the demo:
```
CUDA_VISIBLE_DEVICES=1 python demo.py
```
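To inspect detections visually outside of demo.py, detectron2's `Visualizer` can draw the predicted boxes. A self-contained sketch with placeholder paths; this is not the repo's actual demo code:

```python
import cv2
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer

cfg = get_cfg()
cfg.merge_from_file("faster_rcnn_X_101_32x8d_FPN_3x_100DOH.yaml")
cfg.MODEL.WEIGHTS = "path/to/model.pth"      # placeholder checkpoint path
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5

predictor = DefaultPredictor(cfg)
image = cv2.imread("example.jpg")            # placeholder input image (BGR)
outputs = predictor(image)

# Visualizer works in RGB, so flip channel order going in and coming out.
# The metadata name is a placeholder; with an unregistered name, boxes are
# labeled by raw class index instead of class name.
viz = Visualizer(image[:, :, ::-1], MetadataCatalog.get("100DOH_hand_train"))
drawn = viz.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2.imwrite("output.jpg", drawn.get_image()[:, :, ::-1])
```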
If this work is helpful in your research, please cite:
```
@INPROCEEDINGS{Shan20,
  author    = {Shan, Dandan and Geng, Jiaqi and Shu, Michelle and Fouhey, David},
  title     = {Understanding Human Hands in Contact at Internet Scale},
  booktitle = {CVPR},
  year      = {2020}
}
```
When you use the model trained on our ego data, make sure to also cite the original datasets (Epic-Kitchens, EGTEA, and CharadesEgo) that we collected from, and agree to their original conditions for using that data.