This work addresses the problem of Brazilian Sign Language (Libras) recognition in video. An overview of the proposed system is shown in the Figure below.
The system employs a two-step method comprising feature-space mapping and classification. First, we segment the body parts of each subject in a video using the DensePose estimation model. Then, we use the Gait Energy Image (GEI) to encode the motion of the body parts in a compact feature space, as illustrated in the Figure below.
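For reference, the sketch below shows one way a GEI can be computed from per-frame silhouette masks. It is only a minimal illustration: the function name `gait_energy_image` and the assumed mask format are not part of the released code.

```python
import numpy as np

def gait_energy_image(masks):
    """Average a sequence of binary silhouettes into one Gait Energy Image.

    `masks` is assumed to have shape (num_frames, height, width) with values
    in {0, 1}, e.g. DensePose body-part segmentations thresholded to a
    foreground silhouette. The GEI is the per-pixel temporal mean.
    """
    masks = np.asarray(masks, dtype=np.float32)
    return masks.mean(axis=0)  # shape (height, width), values in [0, 1]
```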
The pipeline used in the classification step is illustrated in the Figure below. The input is the GEI representation, cropped to the region of interest that contains movement. The cropped samples are then resized to a smaller resolution while keeping the aspect ratio of the original video frames. To mitigate the curse of dimensionality, we apply either dimensionality reduction or data augmentation with SMOTE. The resulting samples are then fed to a classification pipeline.
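As a rough illustration of this step, the sketch below assembles such a pipeline with scikit-learn and imbalanced-learn. The helper `build_pipeline` and the specific components (PCA for dimensionality reduction, a linear SVM as classifier) are placeholders chosen for the example, not necessarily the configuration used in the paper.

```python
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline

def build_pipeline(use_smote=False, n_components=50):
    """Build an illustrative GEI classification pipeline.

    Input samples are flattened, resized GEI crops. Either PCA
    (dimensionality reduction) or SMOTE (data augmentation) is applied
    before the classifier.
    """
    steps = []
    if use_smote:
        steps.append(("smote", SMOTE(random_state=0)))          # synthesize minority-class samples
    else:
        steps.append(("pca", PCA(n_components=n_components)))    # project to a compact subspace
    steps.append(("clf", SVC(kernel="linear")))                   # placeholder classifier
    return Pipeline(steps)

# X_train: (num_samples, height * width) flattened GEI crops; y_train: sign labels
# pipe = build_pipeline(use_smote=True).fit(X_train, y_train)
# predictions = pipe.predict(X_test)
```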
- CEFET/RJ-Libras: available upon request (please contact the corresponding author of the original database paper).
- MINDS-Libras: available through this link. See also their paper.
- LIBRAS-UFOP-ISO: also available upon request through this link (paper).
We perform experiments on three challenging Brazilian Sign Language (Libras) datasets: CEFET/RJ-Libras, MINDS-Libras, and LIBRAS-UFOP.
This work was published in the IEEE Transactions on Circuits and Systems I: Regular Papers (TCAS-I) - Special Issue on Regional Flagship Conferences of the IEEE Circuits and Systems Society.
If you use this code for your research, please consider citing:
@article{passos2021,
  author={Passos, Wesley L. and Araujo, Gabriel M. and Gois, Jonathan N. and de Lima, Amaro A.},
  journal={IEEE Transactions on Circuits and Systems I: Regular Papers},
  title={A Gait Energy Image-Based System for Brazilian Sign Language Recognition},
  year={2021},
  volume={68},
  number={11},
  pages={4761-4771},
  doi={10.1109/TCSI.2021.3091001}}