Releases: openfoodfacts/robotoff-models
pytorch-ingredient-detection-1.0
This ingredient detection model was trained on the ingredient detection dataset v1 using code in this version of the repository.
Training was tracked on Wandb.
More information on the experiments performed can be found in this document.
This release provides the following assets:
Training-related assets:
- predictions.tar.gz: predictions of the model on the train and test datasets, in:
  - HTML format: easier to view
  - JSONL format: either the raw or the aggregated (post-processed) version
- model-huggingface.tar.gz: the HuggingFace serialized model
Serving assets:
- onnx.tar.gz: the model exported to ONNX format
keras-category-classifier-image-embeddings-3.0
This category classification model was trained on the v4 Data For Good 2022 category dataset using code in this version of the off-category-classification repository.
Training was tracked on Wandb.
This release provides the following assets:
Dataset assets:
- predict_categories_dataset_products.jsonl.gz: selected product fields.
- predict_categories_dataset_images_ids.jsonl.gz: IDs of images associated with each product.
- predict_categories_dataset_ocrs.jsonl.gz: extracted OCR texts for each product.
- (train|test|val).txt: train, test and validation splits (list of barcodes).
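As an illustration of how the dataset assets fit together, the following minimal sketch filters the gzipped JSONL product file down to the barcodes of one split. It assumes that each split file holds one barcode per line and that each product record stores its barcode under a `code` key; neither detail is stated in this release, so both (along with the helper names) are hypothetical.

```python
import gzip
import json


def read_barcodes(split_path):
    """Return the set of barcodes in a split file (assumed: one barcode per line)."""
    with open(split_path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def iter_jsonl_gz(path):
    """Yield one decoded JSON object per non-empty line of a gzipped JSONL file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


def select_split(products_path, split_path, barcode_field="code"):
    """Keep only the products whose barcode appears in the split file.

    `barcode_field` is an assumption about the record layout, not a documented key.
    """
    barcodes = read_barcodes(split_path)
    return [
        product
        for product in iter_jsonl_gz(products_path)
        if str(product.get(barcode_field)) in barcodes
    ]
```

For example, `select_split("predict_categories_dataset_products.jsonl.gz", "val.txt")` would return the validation products under these assumptions.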
Training-related assets:
- config.json: the parameter configuration used during training.
- categories.full.json.gz: the category taxonomy version used in this model's training.
- ingredients.full.json.gz: the ingredient taxonomy version used in this model's training.
- training.log: training logs.
Validation assets:
- classification_report_(test|val).json: the classification report for the test/val datasets.
- threshold_report_0.99.json: category-specific thresholds required to reach a precision >= 0.99 on a merged validation + test set.
- (test|val)_top_predictions.tsv: top-10 predictions on validation/test sets.
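To show how category-specific thresholds might be used at prediction time, here is a minimal sketch that keeps only the categories whose score reaches their own threshold. It assumes the thresholds have already been loaded into a flat mapping from category to threshold; the actual layout of threshold_report_0.99.json is not described here, and the function name and category IDs are hypothetical.

```python
def filter_predictions(scores, thresholds, default=None):
    """Keep the categories whose score reaches their category-specific threshold.

    scores:     mapping category -> model score
    thresholds: mapping category -> minimum score (assumed flat layout)
    default:    threshold for categories missing from `thresholds`;
                when None, such categories are dropped.
    """
    kept = {}
    for category, score in scores.items():
        threshold = thresholds.get(category, default)
        if threshold is not None and score >= threshold:
            kept[category] = score
    return kept


# Illustrative values only, not taken from the release.
thresholds = {"en:sodas": 0.8, "en:cheeses": 0.95}
scores = {"en:sodas": 0.9, "en:cheeses": 0.9, "en:teas": 0.99}
confident = filter_predictions(scores, thresholds)
```

Dropping categories that have no threshold is the conservative choice: a category may be missing from the report because no threshold reached the target precision, though that interpretation is also an assumption.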
Serving assets:
- saved_model.tar.gz: the model saved in SavedModel format.
clip-vit-base-patch32
ONNX export of CLIP-ViT base patch-32.
Exported with HuggingFace Transformers (v4.25.1) using the PyTorch backend, ONNX opset 17.
tf-universal-logo-detector-1.0
Universal logo detection model: detects generic logos. Trained on 2019-12-13. The model detects the following objects:
- brand (all brand logos)
- label (remaining logos)
Training and validation data (TFRecords files) can be found in data.zip.
Tensorflow SavedModel files can be found in saved_model.tar.gz, all checkpoints (intermediate and final checkpoint) in checkpoints.tar.gz. Tensorboard event files are also attached.
The model was trained using Tensorflow Object Detection API: https://github.com/tensorflow/models/tree/60bb50675ed7fab3afd05edab02a45acee57532a
Base model: Faster-RCNN ResNet-101 pretrained on COCO dataset: http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_2018_01_28.tar.gz
An ONNX export (model.onnx) using opset 13 is also attached.
tf-nutrition-table-1.0
Nutrition table detection model, trained on 2019-12-03. The model detects the following objects:
- nutrition-table
- nutrition-table-small
- nutrition-table-small-energy
- nutrition-table-text
Training and validation data (TFRecords files) can be found in data.zip. Before the train-val split, the dataset was obtained by merging 3 annotated datasets on the annotation interface:
- nutrition-table-1
- Nutrition table (Sagar)
- nutrition-table-2
Tensorflow SavedModel files can be found in saved_model.tar.gz.
The model was trained using Tensorflow Object Detection API: https://github.com/tensorflow/models/tree/e3f8ea2227ef5ce67df04bd175e6c20711079d8f
An ONNX export (model.onnx) using opset 13 is also attached.
tf-nutriscore-1.0
Provides configuration file, serialized model and training/validation data for the nutriscore object detection model. More specifically, includes:
- the label pbtxt file (labels.pbtxt)
- the training configuration file (pipeline.config)
- training (train.record) and validation (val.record) data
- model checkpoint (model.ckpt-*)
- the frozen inference graph (frozen_inference_graph.pb)
- the saved model (in saved_model.tar.gz), for use in Tensorflow Serving
An ONNX export (model.onnx) using opset 13 is also attached.
keras-category-classifier-xx-2.0
This category classification model was trained on the 2021-09-15 multi-lingual dataset, using code in this repository.
This release provides the following assets:
Training-related assets:
- config.json: the parameter configuration used during training.
- category_voc.json: the mapping between the model's outputs and the taxonomised categories.
- category_taxonomy.json: the category taxonomy version used in this model's training.
- training_model.tar.gz: the training model, which can be used for further training of the model.
Validation assets:
- classification_report_(test|val).json: the classification report for the test/val datasets.
- metrics_(test|val).json: the model's performance metrics for the test/val datasets.
Serving assets:
- serving_model.tar.gz: the TF Serving-compatible model, with an additional output layer that converts the raw vocabulary indices to category strings.
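The index-to-string conversion performed by the extra output layer can be pictured with a small sketch in plain Python. It assumes a vocabulary shaped as a mapping from category string to output index, which is a guess at the layout of category_voc.json rather than a documented format; the function names and category IDs are hypothetical, and the serving model itself does this inside the TensorFlow graph.

```python
def invert_vocabulary(category_voc):
    """Turn a category -> index mapping into an index -> category mapping."""
    return {index: category for category, index in category_voc.items()}


def decode_top_k(scores, index_to_category, k=3):
    """Return the k best (category, score) pairs from a list of raw output scores."""
    # Rank output indices by descending score, then look up their category strings.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(index_to_category[i], scores[i]) for i in ranked[:k]]


# Illustrative vocabulary and scores, not taken from the release.
category_voc = {"en:sodas": 0, "en:cheeses": 1, "en:teas": 2}
index_to_category = invert_vocabulary(category_voc)
top = decode_top_k([0.1, 0.7, 0.2], index_to_category, k=2)
```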
keras-category-classifier-xx-1.0
This category classification model was trained on the 2019-09-16 multilingual (xx) dataset, using the code contained in this repository.
Provides:
- the configuration file (config.json)
- assets (category_taxonomy.json, category_voc.json, ingredient_voc.json, product_name_voc.json)
- the keras hdf5 checkpoint (checkpoint.hdf5)
- classification reports and metrics on the test and val sets, for the whole set or split by major language
category-predictor-xgfood-emlyon-1.0
category-predictor-ocr-lewagon-1.0
Model trained by students from the Le Wagon bootcamp in March 2021.
Based on a RidgeClassifier (sklearn) trained on text from OCRed images.
The output is a confidence score for each possible category, out of a short list of 38 categories.
For more details, see https://github.com/Laurel16/OpenFoodFactsCategorizer