
Real-time facial expression recognition for webcam applications

Real-time demo (using one RGB camera)

[Demo GIF: real-time recognition from a single RGB webcam]

Frameworks

  • Face detection is performed by MediaPipe, developed by Google (a minimal usage sketch follows below the diagram).

  • Facial expression recognition uses a DCNN (deep convolutional neural network) trained on the FER+ dataset, which is maintained by Microsoft.

[Diagram: network architecture]

(Thanks to @zc for the image.)
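Below is a minimal sketch of face detection with stock MediaPipe's Python solution API. It illustrates the framework only and is not this repo's exact pipeline (the demo relies on a slightly modified drawing_utils.py; see Details).

```python
# Minimal face-detection sketch using stock MediaPipe (illustrative only;
# this repo's demo uses a slightly modified drawing_utils.py).
import cv2
import mediapipe as mp

mp_face_detection = mp.solutions.face_detection
mp_drawing = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)  # webcam 0, as in `python camdemo.py --camera 0`
with mp_face_detection.FaceDetection(min_detection_confidence=0.5) as detector:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB input; OpenCV captures BGR.
        results = detector.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.detections:
            for detection in results.detections:
                mp_drawing.draw_detection(frame, detection)
        cv2.imshow("face detection", frame)
        if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
            break
cap.release()
cv2.destroyAllWindows()
```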

Language & Dependencies

  • Language: Python 3.6

  • Dependencies:

    • pytorch
    • opencv-python
    • mediapipe (modified)
    • CUDA 10.1 (optional)
    • ...
  • You can install all the dependencies with python -m pip install -r requirements.txt

Details

  • Usage:

    • Run from the prebuilt exe: see release-windows-v0.1 (recommended)
    • Run from source:
      • Replace drawing_utils.py in mediapipe with src/drawing_utils.py, which I slightly modified.
      • Contact me by email to get the trained model.
      • Run python camdemo.py --camera 0
  • Performance:

    • Absolutely real time! The model averages above 60 FPS on an ordinary PC. If possible, use a GPU for even better performance.
    • My modest machine: Intel i7-7700K CPU (4.2 GHz) with an NVIDIA Quadro P2000 (5 GB memory)
  • Model structure: the model is quite simple. It uses ResNet50 as a backbone for feature extraction, followed by two fully connected layers. The output is a 10-dimensional vector corresponding to the 10 emotion classes (a minimal sketch follows after the class table below).

  • Accuracy: the model reaches 79.8% accuracy on the FER+ validation subset after 14 epochs of training with the softCE loss.

    Validation accuracy by epoch for each loss function ('*' = not recorded):

    epoch | KLdiv | softCE | weightedSoftCE
    ------|-------|--------|---------------
      0   | 0.005 | 0.005  | 0.005
      1   | 0.55  | 0.598  | 0.56
      2   | 0.58  | 0.652  | 0.668
      3   |   *   | 0.695  | 0.697
      4   |   *   | 0.726  | 0.71
      5   |   *   | 0.753  | 0.68
      6   |   *   | 0.76   | 0.665
     ...  |  ...  |  ...   |  ...
     14   |   *   | 0.798  | 0.742
  • Loss function:

    • Unlike the original FER, each image in FER+ was labeled by 10 crowd-sourced taggers, but PyTorch's default cross-entropy takes a single hard label and so discards the information in those 10 soft labels. I therefore implemented a soft cross-entropy (softCE) that fits the probability distribution over emotion classes, which gave good results (a sketch follows after the class table below).

    • Another reason to use the softCE loss is that some human emotions, such as happiness and surprise, cannot be distinguished sharply in the first place.

    • Since FER+ is a very imbalanced dataset (see the distribution below), I also tried a weightedSoftCE (in the spirit of focal loss), but it did not help, and I don't yet understand why. If you happen to know, tell me! I also found the weightedSoftCE loss oscillated a lot during training, which suggests it is not very numerically stable.

[Chart: FER+ class distribution]

Expression | neutral | happiness | surprise | sadness | anger | disgust | fear | contempt | unknown | NF
Index      | 0       | 1         | 2        | 3       | 4     | 5       | 6    | 7        | 8       | 9
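A minimal PyTorch sketch of the architecture described above. The 10-way output matches the class table; the hidden-layer width (256) and the use of ImageNet pretraining are illustrative assumptions, not details taken from the released model.

```python
# Sketch of the described model: ResNet50 backbone + two FC layers.
# Hidden width (256) and pretrained=True are assumptions for illustration.
import torch
import torch.nn as nn
import torchvision.models as models

class FERNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        backbone = models.resnet50(pretrained=True)
        # Keep everything up to (and including) global average pooling,
        # dropping ResNet50's own 1000-way classifier.
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        self.fc1 = nn.Linear(2048, 256)
        self.fc2 = nn.Linear(256, num_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)  # (batch, 2048)
        x = torch.relu(self.fc1(x))
        return self.fc2(x)               # raw logits over the 10 classes
```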
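And a minimal sketch of the softCE and weightedSoftCE losses discussed above, assuming each target row is the image's 10 tagger votes normalized into a probability distribution; the repo's exact vote preprocessing and weighting scheme may differ.

```python
import torch
import torch.nn.functional as F

def soft_cross_entropy(logits, target):
    # logits: (batch, 10) raw scores; target: (batch, 10) soft labels,
    # i.e. each image's tagger votes normalized to sum to 1.
    log_probs = F.log_softmax(logits, dim=1)
    return -(target * log_probs).sum(dim=1).mean()

def weighted_soft_cross_entropy(logits, target, class_weights):
    # class_weights: (10,) per-class weights (e.g. inverse class frequency)
    # to counter FER+ imbalance; the actual weighting scheme is an assumption.
    log_probs = F.log_softmax(logits, dim=1)
    return -(class_weights * target * log_probs).sum(dim=1).mean()
```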

Potential applications

  • Online education for children: identifying whether children are listening carefully. In on-site meetings or classrooms: judging how well a speech holds the audience.

[Image: online-education use case]

  • On-site Human–Machine Interaction.
