jittinabraham/Leaflet-Product-Classification

 
 


Leaflet Product Classification

This repository contains the code for the Kaggle competition Retail Products Classification 2023.

Abstract

This repository presents a model for fine-grained product recognition on leaflet images. The dataset comprises 41.6k manually annotated product images in 832 classes, extracted from advertisement leaflets collected over several years from various European retailers. The project explores three approaches to fine-grained product classification: by image, by text, and by image and text combined. The text-based approach uses text extracted directly from the leaflet product images. Combining image and text inputs improves classification accuracy, particularly for visually challenging products. The final model achieves an accuracy of 97.44% and a Top-3 score of 99.2%.

Repository Contents

Usage

See the README file in each directory for details on the image and text models.

Image Model

Image Model Guide

Run train_model.ipynb to train the model.

Use test_model.ipynb to test the trained model and generate predictions.

Text Model

OCR Guide

Text Model Guide

Run text_model.ipynb to train the model.

Key Findings:

Image Model:

Top-5 accuracy: 97.08%
Number of misclassified images: 653

Text Model:

Accuracy on images misclassified by the image model: 67.38%
Top-5 accuracy on images misclassified by the image model: 93.85%
Overall accuracy: 85.06%
Overall top-5 accuracy: 93.85%
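
For readers unfamiliar with the metric, the following is a minimal sketch of how a top-k accuracy such as the figures above can be computed from per-class scores. It is not taken from the notebooks in this repository; the function name `top_k_accuracy` and the toy scores are illustrative assumptions.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy data (made up for illustration): 3 samples, 4 classes.
toy_scores = [
    [0.10, 0.60, 0.20, 0.10],   # true class 1 is ranked first
    [0.40, 0.30, 0.20, 0.10],   # true class 2 is ranked third
    [0.50, 0.30, 0.15, 0.05],   # true class 3 is ranked fourth
]
toy_labels = [1, 2, 3]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top3 = top_k_accuracy(toy_scores, toy_labels, k=3)
```

With the toy data, one of three samples is correct at k=1 and two of three at k=3, which shows why a top-k score is always at least as high as plain accuracy.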

Combined result:

Overall estimated accuracy: 97.44%
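
One plausible reading of this estimate is that the text model re-classifies only the images the image model got wrong, and the images it recovers are subtracted from the error count. The sketch below works through that arithmetic. The test-set size is not stated in this README; the value 8,320 (10 images for each of the 832 classes) is an assumption, chosen only because it is consistent with the reported figures. The function name is hypothetical as well.

```python
def estimate_combined_accuracy(
    n_test: int,
    n_image_errors: int,
    text_acc_on_errors: float,
) -> float:
    """Estimate accuracy when a text model re-classifies the image model's errors."""
    if n_test <= 0 or not 0 <= n_image_errors <= n_test:
        raise ValueError("need 0 <= n_image_errors <= n_test and n_test > 0")
    if not 0.0 <= text_acc_on_errors <= 1.0:
        raise ValueError("text_acc_on_errors must be in [0, 1]")
    # Images the text model classifies correctly among the image model's errors.
    recovered = round(n_image_errors * text_acc_on_errors)
    remaining_errors = n_image_errors - recovered
    return 1 - remaining_errors / n_test


N_TEST = 8320                 # ASSUMPTION: test-set size is not given in the README
IMAGE_ERRORS = 653            # reported: images misclassified by the image model
TEXT_ACC_ON_ERRORS = 0.6738   # reported: text-model accuracy on those images

estimate = estimate_combined_accuracy(N_TEST, IMAGE_ERRORS, TEXT_ACC_ON_ERRORS)
print(f"{estimate:.2%}")  # prints 97.44% under the assumed test-set size
```

This is an estimate rather than a measured result, since it presumes knowing which image predictions are wrong before handing them to the text model.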

Credits

This project was created for the Kaggle competition "Retail Products Classification 2023".

If you find this code helpful, consider giving it a star!

References

Link to Kaggle competition

Original Dataset

Fine-Grained Product Classification on Leaflet Advertisements
