This repo contains code to train age / gender prediction models and run inference on a Flask server. The PyTorch model training / testing code is based on this template.
Check out this colab for a demo.
If you are only interested in model inference, go to this section.
- A unix or unix-like x86 machine
- Docker. Don't be scared by docker. It's really easy and convenient
- Python 3.7 or higher. Running in a virtual environment (e.g., conda, virtualenv, etc.) is highly recommended so that you don't mess up the system Python.
I used the Adience age and gender dataset. Download the data and place it at ./data/Adience/.
You can find the state-of-the-art models here for age and here for gender, respectively.
I also used the IMDB-WIKI dataset. This dataset is huge: it has more than 500,000 faces labeled with gender and age. One weird thing is that it doesn't have train / val / test splits. It is pretty much only used to pre-train models, and people don't compare their scores on it, so I'll do the same. I'll use this dataset to improve my training and report the final metrics on the Adience age and gender dataset. Download imdb_crop.tar and wiki_crop.tar, and place them at ./data/imdb_crop and ./data/wiki_crop, respectively.
I advise you to run everything below in a virtual Python environment.
For the CPU model
docker pull tae898/face-detection-recognition
For the GPU model
docker pull tae898/face-detection-recognition-cuda
The detailed instructions can be found here.
It might take some time.
The port number 10002 is the port that the face-detection-recognition docker container listens on. Set cuda=True in the code snippets below if you want to run on an NVIDIA GPU. The face embedding vectors are pickled. They are saved as <image-path>.pkl (e.g. landmark_aligned_face.2174.9523333835_c7887c3fde_o.jpg.RESIZED.pkl).
Resizing images to the same shape (e.g. resize=640 resizes every image to a square RGB image with a black background, 640 pixels in both width and height) dramatically increases the speed due to some mxnet stuff that I'm not a big fan of.
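The geometry of such a resize can be sketched as follows. This is only an illustration, not the repo's implementation: it assumes that the aspect ratio is preserved and that the scaled image is centered on the black canvas, neither of which is stated above. The function name letterbox_geometry is hypothetical.

```python
def letterbox_geometry(width, height, size=640):
    """Return (new_w, new_h, left, top) for fitting an image into a
    size x size canvas.

    Assumptions (not confirmed by the repo): the aspect ratio is kept
    and the scaled image is centered; the rest of the canvas stays black.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    # Scale so that the longer side becomes exactly `size` pixels.
    scale = size / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))

    # Offsets of the top-left corner of the scaled image on the canvas.
    left = (size - new_w) // 2
    top = (size - new_h) // 2
    return new_w, new_h, left, top


if __name__ == "__main__":
    print(letterbox_geometry(1280, 720))
    print(letterbox_geometry(720, 1280))
```

With every input padded to the same square shape, the detector always sees tensors of one size, which is the property the resize argument is after.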
det_score is the confidence score of the face detection. Faces whose confidence score is lower than this threshold value will not be considered.
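As a minimal sketch of what this threshold does, the snippet below drops low-confidence detections. The dictionary layout (keys "det_score" and "embedding") and the function name keep_confident_faces are assumptions made for illustration; the actual pickle contents may be structured differently.

```python
def keep_confident_faces(faces, det_score=0.9):
    """Keep only the faces whose detection confidence is at least det_score.

    `faces` is assumed to be a list of dicts with a "det_score" key
    (a hypothetical layout, not necessarily the repo's).
    """
    return [face for face in faces if face["det_score"] >= det_score]


if __name__ == "__main__":
    # Toy detections; the embedding values are placeholders, not real data.
    faces = [
        {"det_score": 0.99, "embedding": [0.1, 0.2]},
        {"det_score": 0.42, "embedding": [0.3, 0.4]},
    ]
    print(len(keep_confident_faces(faces)))
```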
-
Adience age and gender dataset
At the root of this repo, run the below command.
python3 -c "from utils.scripts import extract_Adience_arcface; extract_Adience_arcface('aligned', docker_port=10002, cuda=False, resize=640)"
The argument aligned means that we'll be using the aligned face images, not the raw ones. The aligned images have the face of interest in the center, which makes it easier to find the face.
python3 -c "from utils.scripts import get_Adience_clean; get_Adience_clean('aligned', resize=640, det_score=0.9)"
This will write ./data/Adience/meta-data-aligned.json and ./data/Adience/data-aligned.npy
-
IMDB
At the root of this repo, run the below command.
python3 -c "from utils.scripts import extract_imdb_wiki_arcface; extract_imdb_wiki_arcface('imdb', docker_port=10002, cuda=False, resize=640)"
python3 -c "from utils.scripts import get_imdb_wiki_clean; get_imdb_wiki_clean('imdb', resize=640, det_score=0.9)"
This will write ./data/imdb_crop/meta-data.json and ./data/imdb_crop/data.npy
-
WIKI
At the root of this repo, run the below command.
python3 -c "from utils.scripts import extract_imdb_wiki_arcface; extract_imdb_wiki_arcface('wiki', docker_port=10002, cuda=False, resize=640)"
python3 -c "from utils.scripts import get_imdb_wiki_clean; get_imdb_wiki_clean('wiki', resize=640, det_score=0.9)"
This will write ./data/wiki_crop/meta-data.json and ./data/wiki_crop/data.npy
-
Adience age and gender dataset
This dataset has five folds. The performance metric is accuracy on five-fold cross validation.
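| images before removal | fold 0 | fold 1 | fold 2 | fold 3 | fold 4 |
| --- | --- | --- | --- | --- | --- |
| 19,370 | 4,484 | 3,730 | 3,894 | 3,446 | 3,816 |

Removed data

| failed to process image | no age found | no gender found | no face detected | bad quality (det_score<0.9) | SUM |
| --- | --- | --- | --- | --- | --- |
| 0 | 748 | 1,170 | 322 | 75 | 2,315 (11.95 %) |

Genders

| female | male |
| --- | --- |
| 9,103 | 7,952 |

Ages

| 0 to 2 | 4 to 6 | 8 to 12 | 15 to 20 | 25 to 32 | 38 to 43 | 48 to 53 | 60 to 100 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1,363 | 2,087 | 2,226 | 1,761 | 5,162 | 2,719 | 907 | 830 |

-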
IMDB age and gender dataset
This dataset does not have train / val / test splits. Researchers normally use this dataset for pretraining.
| images before removal |
| --- |
| 460,723 |

Removed data

| failed to process image | no age found | no gender found | no face detected | more than one face | bad quality (det_score<0.9) | no embeddings | SUM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 22,200 | 690 | 8,453 | 21,441 | 47,278 | 3,855 | 27 | 103,944 (22.56 %) |

Genders

| female | male |
| --- | --- |
| 153,316 | 203,463 |

Ages

Ages are fine-grained integers from 0 to 100. Check ./data/imdb_crop/meta-data.json for the details.

-
WIKI age and gender dataset
This dataset does not have train / val / test splits. Researchers normally use this dataset for pretraining.
| images before removal |
| --- |
| 62,328 |

Removed data

| failed to process image | no age found | no gender found | no face detected | more than one face | bad quality (det_score<0.9) | no embeddings | SUM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 10,909 | 1,781 | 2,485 | 3,074 | 2,179 | 428 | 0 | 20,856 (33.46 %) |

Genders

| female | male |
| --- | --- |
| 9,912 | 31,560 |

Ages

Ages are fine-grained integers from 0 to 100. Check ./data/wiki_crop/meta-data.json for the details.
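The removal percentages reported for the three datasets can be re-derived from the counts given in this section. The short check below uses only those numbers; the helper names are made up for illustration.

```python
# (removed images, images before removal), as reported for each dataset.
COUNTS = {
    "Adience": (2315, 19370),
    "IMDB": (103944, 460723),
    "WIKI": (20856, 62328),
}


def removal_rate(removed, total):
    """Percentage of images removed, rounded to two decimals."""
    return round(100 * removed / total, 2)


def remaining(removed, total):
    """Number of images left after cleaning."""
    return total - removed


if __name__ == "__main__":
    for name, (removed, total) in COUNTS.items():
        print(name, removal_rate(removed, total), remaining(removed, total))
```

The remaining counts also match the sums of the female and male counts for each dataset (e.g. 9,103 + 7,952 = 17,055 for Adience).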
The model is basically an MLP. Two variants are considered: a plain MLP and an MLP with IC layers. It's empirically shown that the latter is better than the plain MLP.
There are three training steps involved.
-
Hyperparameter search using Ray Tune
This searches the dropout rate, number of residuals per block, number of blocks in the network, batch size, peak learning rate, weight decay rate, and gamma of the exponential learning rate decay. Configure the values in hp-tuning.json and run python hp-tuning.py.
-
Pre-training on the IMDB and WIKI datasets.
We'll use the optimal hyperparameters found in step 1 to pre-train the model. Configure the values in train.json and run python train.py.
-
Five random seeds on 5-fold cross-validation on the Adience dataset.
Since the reported metric (i.e. accuracy) is from 5-fold cross-validation, we will do the same here. In order to get the least biased numbers, we run this five times, each with a different seed. This means that we train 25 times in total and report the average of the 25 numbers. Configure the values in cross-val.json and run python cross-val.py.
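The bookkeeping of the third step (five seeds times five folds, averaged over 25 runs) can be sketched as below. This is not the code in cross-val.py: train_and_eval is a hypothetical callback standing in for one full training run, and the dummy used in the demo returns placeholder values, not real accuracies.

```python
from statistics import mean


def cross_validate(train_and_eval, seeds=range(5), num_folds=5):
    """Run train_and_eval(seed, test_fold) for every seed/fold pair.

    Returns the list of per-run accuracies (25 values by default) and
    their average, which is the number to report.
    """
    accuracies = [
        train_and_eval(seed, test_fold)
        for seed in seeds
        for test_fold in range(num_folds)
    ]
    return accuracies, mean(accuracies)


if __name__ == "__main__":
    # Placeholder callback: NOT a real model, just deterministic dummy values.
    def dummy_run(seed, test_fold):
        return (seed + test_fold) / 100

    accuracies, average = cross_validate(dummy_run)
    print(len(accuracies), average)
```

Each fold serves as the held-out test fold once per seed, so every seed contributes five numbers to the final average.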
Click on the above link to see the detailed results.
Check ./test-images
to see the model inference results on some stock images.
We provide the gender and the age models, which are trained on the IMDB, WIKI, and Adience datasets. The gender model is a binary classifier and the age model is a 101-class (from 0 to 100 years old) classifier. They are MLPs with dropout, batch norm, and residual connections. They can be found at ./models/gender.pth and ./models/age.pth, respectively. Both are lightweight. Running on a CPU is enough.
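To make the 101-class formulation concrete, here is a minimal sketch of turning 101 raw scores (logits) into an age estimate. It shows two common decodings, the most likely class and the probability-weighted mean; which one this repo uses is not stated here, so treat both as assumptions. The function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_age(logits):
    """Return (most_likely_age, expected_age) from 101 class logits.

    Class index i is taken to mean "i years old" (0 to 100).
    """
    if len(logits) != 101:
        raise ValueError("expected 101 logits, one per age from 0 to 100")
    probs = softmax(logits)
    most_likely = max(range(101), key=probs.__getitem__)
    expected = sum(age * p for age, p in enumerate(probs))
    return most_likely, expected


if __name__ == "__main__":
    # Toy logits peaked at age 30; placeholder values, not model output.
    logits = [0.0] * 101
    logits[30] = 10.0
    print(decode_age(logits))
```

The expected-value decoding yields a fractional age and is less sensitive to a single spurious peak, while the argmax decoding always returns one of the 101 integer classes.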
app.py is a Flask server app that accepts 512-dimensional ArcFace embeddings and returns estimated genders and ages. You can also run it in a docker container.
Check out this demo video.
-
Pull the image from docker hub and run the container.
docker run -it --rm -p 10003:10003 tae898/age-gender
-
If for whatever reason you want to build it from scratch,
docker build -t age-gender .
docker run -it --rm -p 10003:10003 age-gender
-
Install the required python packages.
pip install -r requirements.txt
-
Run app.py
python3 app.py
After running the container (i.e. docker run -it --rm -p 10003:10003 tae898/age-gender), you can run client.py (e.g. python client.py --image-path test-images/matrix-tae-final_exported_37233.jpg) to get the estimated genders and ages of the faces in the picture.
NB: You also have to run face-detection-recognition (docker run -it --rm -p 10002:10002 tae898/face-detection-recognition for CPU, or docker run --gpus all -it --rm -p 10002:10002 tae898/face-detection-recognition-cuda for CUDA) before running client.py. This separation might be annoying, but the modularization will help in the future.
The best way to find and solve your problems is to look in the GitHub issues tab. If you can't find what you want, feel free to raise an issue. We are pretty responsive.
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Check out the paper.
@misc{kim2021generalizing,
title={Generalizing MLPs With Dropouts, Batch Normalization, and Skip Connections},
author={Taewoon Kim},
year={2021},
eprint={2108.08186},
archivePrefix={arXiv},
primaryClass={cs.LG}
}