This is a rough guide to using the suite of applications below to create a Sign Language training dataset, train an SVM-based model, and embed it in a smartphone application to translate gestures in real time.
Thank you for taking the time to read this help guide. If you have any additional questions regarding this project, please get in touch. Have fun!
1. Creating your Training and Testing datasets using Dataset Creator
This phase involves creating a usable dataset consisting of images of Sign Language gestures which will be used to train the SVM kernel for recognition
- Use Git to clone the Dataset Creator repository
- Attach a webcam device to your computer and make sure it's producing output
- Follow the `README` steps to ensure you have Python installed and that you have resolved the external dependencies using `pip`
- Run `python create.py` from the repository root to begin the script
- You can edit the `create.py` script to define config variables to suit your testing needs:
  - `mode` - `auto` or `manual`; `manual` if you want to be in control of when pictures are taken using the spacebar, or `auto` if you want images taken in succession
  - `images` - the number of images to capture per class
  - `delay` - the delay (seconds) before capture starts, useful if you're performing the signs yourself so you can get into position and let the camera's aperture adjust to the scene lighting
  - `delaybetween` - the delay (seconds) between each image being taken
  - `path` - the folder path to store the images
- Whether running in `manual` or `auto` mode, you will be prompted to enter which class you're performing, which will automatically be prepended to the output files.
  - For example, entering "A" as your input class will result in images being saved as `A_0.png`, `A_1.png` and so on.
Try to capture at least 100 images of each gesture you require. When complete, split your data into training and testing subsets. If you're unsure what sort of split you require, aim for 80:20 in favour of training data - if you capture 100 images for the class "A", for example, set aside 20 of these as test data so they can be used to verify the model trained on the remaining 80 images (a small helper sketch for this is included at the end of this section).
Your resulting image folder should look something like this. For use with the Offline Trainer, the images can be nested within subdirectories of the `train` and `test` folders; only the file name matters, as the first letter of the image file name is used to classify it.
/dataset-creator/
|----train
|---A_0.jpg
|---A_1.jpg
|---A_2.jpg
|---...
|---Y_9.jpg
|----test
|---A_81.jpg
|---A_82.jpg
|---...
|---Y_81.jpg
If you are unable to create a dataset of your own in this manner, consider downloading the Raw MNIST Sign Language Dataset from Mon95 on GitHub and using that instead.
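If you haven't already split your images into training and testing subsets, one possible approach is shown in the minimal sketch below. It assumes a flat `train` folder as shown above; the paths are placeholders, and it simply moves every fifth image of each class (grouped by the first letter of the file name) into `test` to produce an 80:20 split:

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.*;
import java.util.stream.*;

// Hypothetical helper: move every 5th image per class from train/ to test/ (80:20 split).
// The folder paths are placeholders - point them at your own dataset-creator output.
public class SplitDataset {
    public static void main(String[] args) throws IOException {
        Path train = Paths.get("dataset-creator", "train");
        Path test = Paths.get("dataset-creator", "test");
        Files.createDirectories(test);

        // Group files by class, i.e. the first letter of the file name (A_0.png -> "A")
        Map<String, List<Path>> byClass;
        try (Stream<Path> files = Files.list(train)) {
            byClass = files.filter(Files::isRegularFile)
                    .collect(Collectors.groupingBy(p -> p.getFileName().toString().substring(0, 1)));
        }

        for (List<Path> images : byClass.values()) {
            Collections.sort(images);
            for (int i = 0; i < images.size(); i++) {
                if (i % 5 == 4) { // every 5th image becomes test data (20%)
                    Files.move(images.get(i), test.resolve(images.get(i).getFileName()));
                }
            }
        }
    }
}
```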
2. Training the SVM recognition kernel using the Offline Trainer
This phase involves ingesting your training and testing data to train an SVM kernel and produce an output file which can be used with the Sign Language App in order to make predictions in real time. Performance and accuracy data on any available test images will also be produced during this phase.
- Use Git to clone the Offline Trainer repository.
- Follow the `README` steps to ensure you have configured IntelliJ IDEA correctly and that you have resolved the project dependencies
- Edit the `Main` class in the `src` folder to configure some testing parameters:
  - `DATASETS_FOLDER` - the folder created during phase 1 above containing your training datasets. When using multiple datasets, an example folder structure is shown below:
/hand-datasets/
|----dataset-one
|----train
|---A_0.jpg
|---A_1.jpg
|---A_2.jpg
|---...
|---Y_9.jpg
|----test
|---A_81.jpg
|---...
|---Y_81.jpg
|----dataset-two
|----train
|---A_0.jpg
|---...
|----test
|---A_81.jpg
|---...
Note: the dataset folder names are not important, but the `train` and `test` folder names and the image file names are - see phase 1 above for more info.
  - `RESULTS_FOLDER` - the folder to which your testing result files will be saved
  - `detectionMethods` - a list of image preprocessing methods to use on your data. Remove or comment out those you don't want to use. Remember, if you opt to use Canny edges image processing, for example, then the Sign Language App must also use this image processing technique - see section 3 for more info
  - `kernels` - the types of SVM kernel which will be used when training; remove or comment out those you don't wish to use
  - `dimReduceType` - whether dimensionality reduction should be used or not; remove or comment out those you don't wish to use
  - `datasets` - a list of strings containing the folder names of the datasets contained within the `DATASETS_FOLDER` folder. Following the example above, this list would contain `"dataset-one"` and `"dataset-two"`
Note: the trainer will use every combination of the above parameters! This can be thought of as `detection methods * kernels * dimensionality reduction types * datasets`. If you want to perform a single training run, there should be one entry in each list. For example, if you had 2 detection methods, 2 SVM kernel types, 1 type of dimensionality reduction and 4 datasets, the Offline Trainer would perform 16 training runs (`2 * 2 * 1 * 4`).
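For orientation, the configuration block in `Main` might look roughly like the sketch below. The field names come from the list above, but the types, values and initialisation style are assumptions (the real class may use enums rather than strings), so defer to the actual source:

```java
import java.util.Arrays;
import java.util.List;

// Rough sketch only - the field names are from this guide; the real Main class may use
// enums rather than strings and a different initialisation style.
public class MainConfigSketch {
    static final String DATASETS_FOLDER = "F:\\hand-datasets";            // folder from phase 1
    static final String RESULTS_FOLDER  = "F:\\offline-trainer-results";  // where result files go

    // Every combination of the lists below is trained:
    // detection methods * kernels * dimensionality reduction types * datasets
    static final List<String> detectionMethods = Arrays.asList("none", "canny_edges");
    static final List<String> kernels = Arrays.asList("linear", "rbf");
    static final List<String> dimReduceType = Arrays.asList("none");
    static final List<String> datasets = Arrays.asList("dataset-one", "dataset-two");
}
```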
- (Optional) Configure the `EmailNotifier` class with valid SMTP details in order to receive email updates about your training progress; useful when performing large-scale training runs
- Run the project using either method 1 or method 2 below
  - Method 1 - Manual classpath
    - Add the following to your `PATH` under environment variables in Windows: `F:\Repos\offline-trainer\new`, substituting the path to which you have cloned the repo on your computer
    - Load the project in IntelliJ IDEA and hit run
  - Method 2 - Build artifact and command line
    - If you want to run the project from the command line, you can elect to `Build->Build Artifact`, which will produce an executable `.jar` in the `offline-trainer\out\artifacts\` folder
    - Run this `.jar` file from the command line using `java -Djava.library.path=F:\Repos\offline-trainer\new\ -jar offline-trainer.jar`, substituting the file path to your cloned repo in the `java.library.path` string
- The testing results will be placed in the `RESULTS_FOLDER` specified in step 3, with file names corresponding to the parameters used in step 3.
- The actual trained SVM models for use with the Sign Language App will be output to the project root in the form `trained_1.xml`, where `1` indicates the sequential run number. This file can then be loaded into the app as shown in phase 3.
The test result files created by the Offline Trainer contain some useful performance and accuracy statistics; here is a high-level overview:
RUN 9
Features: none
SVM Kernel: linear
Dimensionality Reduction: pca
Dataset: mon95
This shows the parameters configured in step 3 above and the sequential number of this run.
Train Sample 1: A1_1_7.jpg
Train Sample 2: A2_1_7.jpg
Train Sample 3: A3_1_7.jpg
...
This is a list of every training sample fed into this run, which can be useful for finding examples of your false positives. Note: these are numbered according to the order in which they were processed.
Total Image Processing Time: 00:00:15.680
Average Image Processing Time: 00:00:00.11
Actual SVM Training Time: 00:00:12.234
Actual PCA Time: 00:01:09.250
This shows the time taken to perform the training phase alone in HH:mm:ss.mmm
Test Sample 1: A0_1.jpg
Test Sample 2: A0_1_2.jpg
Test Sample 3: A0_1_3.jpg
...
Again, this shows the testing images as they were processed. Note: these are numbered according to the order in which they were processed.
Total Image Processing Time: 00:00:05.141
Average Image Processing Time: 00:00:00.15
Actual Testing Time: 00:00:00.15
This shows the time taken to perform the testing phase alone in HH:mm:ss.mmm
CLASS: A
TP: 1 2 3 4 5 6 7 8 9 10 11 12 13 14
TN: 15 16 17 18 19 20 21 22 23 24 25
FP: 74
FN: 270 277
For each class, this output lists each True Positive, True Negative, False Positive and False Negative - the numbers correspond to the sample numbers shown above in the testing results. For example, `TP: 1` corresponds to `Test Sample 1: A0_1.jpg`. This allows you to create something like the following:
14 0 0 0 0 0 0 0
0 13 0 0 0 0 0 0
0 0 13 0 0 0 0 0
0 0 0 14 0 0 0 0
0 0 0 0 13 0 0 0
0 0 0 0 0 13 0 0
0 0 0 0 0 0 14 0
...
This is a simple confusion matrix of each class against each other class when testing. This matrix can be thought of as the following:
Actual Class
A B C D E F G H
A 14 0 0 0 0 0 0 0
Predicted B 0 13 0 0 0 0 0 0
Class C 0 0 13 0 0 0 0 0
D 0 0 0 14 0 0 0 0
E 0 0 0 0 13 0 0 0
F 0 0 0 0 0 13 0 0
G 0 0 0 0 0 0 14 0
...
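If you want to build this confusion matrix yourself from the raw per-sample results, the bookkeeping is straightforward. Here is a minimal, self-contained sketch; the class list and the (actual, predicted) pairs are placeholders, since the Offline Trainer's result files are plain text you would parse yourself:

```java
import java.util.*;

// Minimal sketch: tally (actual, predicted) label pairs into a confusion matrix.
// The class names and results array are placeholders - in practice each pair comes
// from one test sample in your own run.
public class ConfusionMatrixSketch {
    public static void main(String[] args) {
        List<String> classes = Arrays.asList("A", "B", "C");
        int[][] matrix = new int[classes.size()][classes.size()];

        // Placeholder (actual, predicted) pairs, one per test sample
        String[][] results = { {"A", "A"}, {"A", "B"}, {"B", "B"}, {"C", "C"} };

        for (String[] r : results) {
            int actual = classes.indexOf(r[0]);
            int predicted = classes.indexOf(r[1]);
            matrix[predicted][actual]++; // rows = predicted class, columns = actual class
        }

        // Print with the same orientation as the example matrix above
        System.out.println("            actual: " + String.join("  ", classes));
        for (int p = 0; p < classes.size(); p++) {
            StringBuilder row = new StringBuilder("predicted " + classes.get(p) + ":  ");
            for (int a = 0; a < classes.size(); a++) {
                row.append(matrix[p][a]).append("  ");
            }
            System.out.println(row);
        }
    }
}
```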
3. Using the trained model to power the Sign Language App
This phase involves importing your trained model (`trained_1.xml`) created using the Offline Trainer and then using it in an Android app.
- Use Git to clone the Sign Language App repository
- Follow the `README` steps to ensure you have configured Android Studio correctly and that you have resolved the project dependencies
- Attach an Android smartphone to your computer, enabling ADB to allow debugging
- Import your `trained_1.xml` into the `sign-language\app\src\main\res\raw` folder
- Important: make sure the Sign Language App is configured to use the same parameters as you used in your testing run:
  - For example, if you used detection method `CANNY_EDGES`, kernel `RBF` and dimensionality reduction `none` in your training run, then these need to match when using the Sign Language App, otherwise you will pass invalid data to the SVM (see the sketch after this list).
  - The detection method can be set in the `MainActivity` class in the `onCameraViewStarted` method
  - The SVM kernel type can be set in the `FrameClassifier` class in the constructor
  - Dimensionality reduction can currently be set in the `FrameClassifier`, although there is an open issue to create an export process for the PCA eigenvectors to support this; for now, stick to `none`
- Run the application, which will load the trained SVM file and allow for the live translation of sign language gestures using your trained model!
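This guide doesn't spell out the app's internals, but the `onCameraViewStarted` callback and the `.xml` model file suggest OpenCV's Android/Java bindings. Assuming that, the per-frame work might look roughly like the sketch below; only the OpenCV calls are known API, while the class name, thresholds and image size are illustrative and the app's actual `FrameClassifier` will differ:

```java
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgproc.Imgproc;
import org.opencv.ml.SVM;

// Hypothetical sketch of per-frame classification, assuming OpenCV's Java bindings.
// The preprocessing must mirror the detection method used by the Offline Trainer.
class FrameClassifierSketch {
    private final SVM svm;

    FrameClassifierSketch(String modelPath) {
        // Load the model produced by the Offline Trainer, e.g. trained_1.xml
        svm = SVM.load(modelPath);
    }

    float classify(Mat rgbaFrame) {
        Mat gray = new Mat();
        Imgproc.cvtColor(rgbaFrame, gray, Imgproc.COLOR_RGBA2GRAY);

        // Example preprocessing matching a CANNY_EDGES training run (thresholds illustrative)
        Mat edges = new Mat();
        Imgproc.Canny(gray, edges, 50, 150);

        // Resize and flatten to the feature-vector shape the SVM was trained on (size illustrative)
        Mat small = new Mat();
        Imgproc.resize(edges, small, new Size(50, 50));
        Mat sample = new Mat();
        small.reshape(1, 1).convertTo(sample, CvType.CV_32F);

        return svm.predict(sample); // returns the predicted class label
    }
}
```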
Some information on the overall processing steps taken by the app is shown below.
4. (Optional) Changing the imaging kernel of the Sign Language App and importing this into the Offline Trainer in order to perform your own experiments
You are encouraged to experiment with your own image pre-processing techniques when training your SVM model. To refine the app's imaging kernel and produce better results, you can alter the Sign Language App's imaging kernel and import it into the Offline Trainer.
To do this, see the `README` of the Sign Language App regarding the `.jar` export process and the `README` of the Offline Trainer regarding the library import process.
Following this process ensures that both the app and the trainer use the same data processing methods, resulting in a cohesive workflow when perfecting recognition.
Some visual information on the current image processing steps taken by the app is shown below.