by Toan Vu, An Dang, Jia-Ching Wang
This is the implementation of our submission to the AI Meets Beauty Challenge 2020.
We develop a deep learning model for the beauty product image retrieval problem. The proposed model has two main components: an encoder and a memory. The encoder extracts and aggregates features from a deep convolutional neural network at multiple scales to produce feature embeddings. With an attention mechanism and a data augmentation method, it learns to focus on foreground objects and ignore image backgrounds, so it can extract more relevant features. The memory consists of representative states of all database images, and it can be updated during training. Based on the memory, we introduce a distance loss that regularizes the embedding vectors from the encoder to be more discriminative. Our model is fully end-to-end and requires no manual feature aggregation or post-processing.
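The memory-based distance loss could be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the hinge/margin formulation, the function name, and all argument names are assumptions made for the example. The idea shown is only the general one stated above: pull each embedding toward its own representative state in the memory and push it away from the other states, so embeddings become more discriminative.

```python
import numpy as np

def memory_distance_loss(embeddings, memory, labels, margin=1.0):
    """Illustrative margin-based distance loss over a memory of
    representative states (a sketch, not the paper's exact loss).

    embeddings: (batch, dim)  encoder outputs
    memory:     (n_items, dim) one representative state per database image
    labels:     (batch,)       index of each embedding's own memory state
    """
    # Pairwise Euclidean distances between embeddings and all memory states.
    diff = embeddings[:, None, :] - memory[None, :, :]   # (batch, n_items, dim)
    dists = np.sqrt((diff ** 2).sum(axis=-1))            # (batch, n_items)

    idx = np.arange(len(labels))
    pos = dists[idx, labels]          # distance to the embedding's own state

    masked = dists.copy()
    masked[idx, labels] = np.inf      # exclude the own state
    neg = masked.min(axis=1)          # distance to the nearest other state

    # Hinge: the own-state distance should beat the nearest other
    # state's distance by at least `margin`.
    return np.maximum(0.0, pos - neg + margin).mean()
```

During training, the memory states themselves would also be updated (as described above), so the loss and the memory evolve together; this sketch only shows the regularization term.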
You can train a model for a given architecture as follows:
python main.py --model=MultiScaleDense121 --batch-size=16
You can get retrieval results for a given image folder as follows:
python predict.py model_name ./test_data ./result/predictions.csv
If you have any suggestions or questions, please send an email to toanvuhong@gmail.com.
If you find this code useful for your work, please cite our paper:
@inproceedings{10.1145/3394171.3416281,
author = {Vu, Toan H. and Dang, An and Wang, Jia-Ching},
title = {Learning to Remember Beauty Products},
year = {2020},
isbn = {9781450379885},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3394171.3416281},
doi = {10.1145/3394171.3416281},
booktitle = {Proceedings of the 28th ACM International Conference on Multimedia},
pages = {4728–4732},
numpages = {5},
location = {Seattle, WA, USA},
series = {MM '20}
}