Recognizing actors/celebs in a clip or image from any media, using deep learning with Python. It can use either CNN or HOG for face detection, and then compares the detected face with our dataset of faces.
I have used the most comprehensive dataset available here. It only includes celebrity faces up to 2021, so we might need to update it further if required.
Tons of help from ageitgey's face_recognition library.
Inspired by this wonderful article.
---
Install `cmake`, as it is required for the dlib library.
- For Linux, run `sudo apt-get install cmake`
- For Windows, download the installer from here
- For macOS, run `brew install cmake`
Also, install `pipenv` for managing the virtual environment.
- For Linux, run `sudo pip install pipenv`
- For Windows, run `pip install pipenv`
- For macOS, run `brew install pipenv`
Then, run `pipenv shell` to activate the virtual environment. Finally, run `pipenv install` to install all the dependencies.

Note: the `face-recognition` library does not officially support Windows, but it still might work, as it says in its README -
The dataset has the following structure.
For my implementation, each actor has 25 images. More images would give better results, but this number seems to work fine.
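The directory tree itself isn't reproduced above; the sketch below builds and inspects the layout assumed throughout this README (one sub-folder per actor under `dataset/actors`, holding that actor's face images — the folder names here are illustrative):

```python
import os
import tempfile

# Build a throwaway copy of the assumed layout: dataset/actors/<actor>/<image>.
root = tempfile.mkdtemp()
for actor in ["actor_a", "actor_b"]:
    folder = os.path.join(root, "dataset", "actors", actor)
    os.makedirs(folder)
    for i in range(25):  # 25 images per actor, as used in this implementation
        open(os.path.join(folder, f"{i}.jpg"), "w").close()

# Count images per actor, the way an encoding script could walk the dataset.
actors_dir = os.path.join(root, "dataset", "actors")
counts = {a: len(os.listdir(os.path.join(actors_dir, a)))
          for a in sorted(os.listdir(actors_dir))}
print(counts)  # {'actor_a': 25, 'actor_b': 25}
```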
---
For every image in the dataset, we first get a square enclosing the face in the image, then generate a 128-d vector for that face, which is dumped to the `encodings.pickle` file.

We can use either CNN (slower, more accurate) or HOG (faster, less accurate) for the face detection process. Here I've used the face_recognition library, which provides both options.
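A minimal sketch of what ends up in `encodings.pickle`, assuming the common dict-of-lists layout (an `encodings` list paired with a `names` list); random vectors stand in here for the real 128-d face encodings:

```python
import pickle
import numpy as np

# Random stand-ins for real face encodings (normally produced per image
# by detecting the face and computing its 128-d embedding).
rng = np.random.default_rng(0)
data = {
    "encodings": [rng.standard_normal(128) for _ in range(3)],
    "names": ["actor_a", "actor_a", "actor_b"],  # one name per encoding
}

# Dump to disk, then load it back the way a recognition script would.
with open("encodings.pickle", "wb") as f:
    pickle.dump(data, f)

with open("encodings.pickle", "rb") as f:
    loaded = pickle.load(f)

print(len(loaded["encodings"][0]))  # each face is a 128-d vector
```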
For a big dataset, techniques like MapReduce or Spark can be used to parallelize the process over a cluster of machines.
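On a single machine, the per-image work can already be spread over local cores (which is what the `-c` flag controls). A hedged sketch with the standard library's `multiprocessing.Pool`, using a dummy function in place of the real detect-and-encode step:

```python
from multiprocessing import Pool

def encode_image(path):
    # Stand-in for the real per-image work (detect the face, compute its
    # 128-d encoding); returning the path length keeps the sketch runnable.
    return len(path)

# Hypothetical image paths following the dataset layout.
paths = [f"dataset/actors/actor_{i}/img.jpg" for i in range(8)]

# Map the images over a pool of worker processes, one chunk per core.
with Pool(processes=4) as pool:
    results = pool.map(encode_image, paths)

print(results)
```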
Moreover, use the `-fnn` flag in case you want to use the KDTree method for searching, which is much faster than the linear search.
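To illustrate the difference, here is a small sketch using `scipy.spatial.cKDTree` as a stand-in (an assumption — the repo's actual KDTree implementation may differ): both searches find the same nearest stored encoding, but the tree answers queries without computing a distance to every entry.

```python
import numpy as np
from scipy.spatial import cKDTree

# Random stand-ins for 1000 stored 128-d face encodings.
rng = np.random.default_rng(1)
known = rng.standard_normal((1000, 128))
query = known[42] + 0.001  # a face very close to stored encoding 42

# Linear search: compute the distance to every known encoding.
linear_idx = int(np.argmin(np.linalg.norm(known - query, axis=1)))

# KDTree search: build once, then answer nearest-neighbour queries quickly.
tree = cKDTree(known)
_, tree_idx = tree.query(query)

print(linear_idx, int(tree_idx))  # both report encoding 42
```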
Consider an image, be it a still from a movie or a frame of a video clip. First, we identify the faces in the image using the same method as above (CNN or HOG), generate an encoding for each face (a 128-d vector), and then compare it with our collected encodings. The actor with the most matched encodings is the actor in the image.

This search can be either linear or via a KDTree. I've used the KDTree method, which is much faster. This can be done by passing the `-fnn` flag to the Python file.
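The matching-and-voting step can be sketched with plain NumPy. The 0.6 Euclidean-distance tolerance below is the default used by face_recognition's `compare_faces`; random vectors stand in for real encodings:

```python
from collections import Counter
import numpy as np

# Three noisy copies of one face ("actor_a") plus one unrelated face.
rng = np.random.default_rng(2)
base = rng.standard_normal(128)
known_encodings = [base + rng.normal(0, 0.01, 128) for _ in range(3)]
known_encodings.append(rng.standard_normal(128))
known_names = ["actor_a", "actor_a", "actor_a", "actor_b"]

# A query encoding of the same face, perturbed slightly.
query = base + rng.normal(0, 0.01, 128)

# An encoding "matches" when its distance to the query is under the tolerance;
# the actor collecting the most matches wins the vote.
distances = np.linalg.norm(np.array(known_encodings) - query, axis=1)
votes = Counter(name for name, d in zip(known_names, distances) if d < 0.6)
best = votes.most_common(1)[0][0]
print(best)  # actor_a
```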
Read the first few lines of the Python file involved to understand the parameters used in each case.
---
```sh
python faceEncode.py --dataset dataset/actors --encodings encodings/encodings.pickle -d hog -c 8
```
The `-c` flag is the number of cores to use for parallel processing. You can also use the `-fnn` flag to later use the KDTree method for searching.

---
```sh
python faceRecImage.py -e encodings.pickle -i examples/ex6.png -d hog -o out/
```
Use the `-fnn` flag to use the KDTree method for searching.

---
```sh
python faceRecVideoFile.py -e encodings/encodings.pickle -i input_vids/ex2.mp4 -o output_vids/ex2.avi -y 0 -d hog
```
Outputs a video with the faces marked.