Welcome to STM32 model zoo services!
The STM32 AI model zoo is a set of services and scripts that eases end-to-end integration of AI models on ST devices. It can be used in conjunction with the STM32 model zoo, which contains a collection of reference machine learning models optimized to run on STM32 microcontrollers. Available on GitHub, this is a valuable resource for anyone looking to add AI capabilities to their STM32-based projects.
- Scripts to easily retrain or fine-tune any model from user datasets (BYOD and BYOM)
- A set of services and chained services to quantize, benchmark, predict and evaluate any model (BYOM)
- Application code examples automatically generated from user AI models
These models can be useful for quick deployment if you are interested in the categories they were trained on. We also provide training scripts to do transfer learning or to train your own model from scratch on your custom dataset.
Performance figures on reference STM32 MCUs and MPUs are provided for both float and quantized models.
This project is organized by application; for each application, a step-by-step guide explains how to train and deploy the models.
3.0:
- Full support of the new STM32N6570-DK board.
- Included additional models compatible with the STM32N6.
- Included support for STEdgeAI Core v2.0.0 (STM32Cube.AI v10.0.0).
- Split of the model zoo and the services into two GitHub repositories.
- Integrated support for ONNX model quantization and evaluation from h5 models.
- Expanded use case support to include Instance Segmentation and Speech Enhancement.
- Added PyTorch support through the Speech Enhancement use case.
- Support of on-device evaluation and prediction on the STM32N6570-DK boards.
- Model Zoo hosted on Hugging Face.
2.1:
- Included additional models compatible with the STM32MP257F-EV1 board.
- Added support for per-tensor quantization.
- Integrated support for ONNX model quantization and evaluation.
- Included support for STEdgeAI (STM32Cube.AI v10.0.0 and subsequent versions).
- Expanded use case support to include Pose Estimation and Semantic Segmentation.
- Standardized logging information for a unified experience.
2.0:
- An aligned and uniform architecture for all the use cases.
- A modular design to run different operation modes (training, benchmarking, evaluation, deployment, quantization) independently, or with an option of chaining multiple modes in a single launch.
- A simple and single entry point to the code: a .yaml configuration file to configure all the needed services.
- Support of the Bring Your Own Model (BYOM) feature to allow users to (re-)train their own models. An example is provided here, chapter 5.1.
- Support of the Bring Your Own Data (BYOD) feature to allow users to fine-tune some pretrained models with their own datasets. An example is provided here, chapter 2.3.
The ST model zoo provides a collection of independent services and pre-built chained services that can be used to perform various machine learning functions. Individual services cover tasks such as training or quantizing a model, while chained services combine multiple services to perform more complex functions, such as training a model, quantizing it, and evaluating the quantized model, before benchmarking it on the hardware of your choice.
All trained models in the STM32 model zoo are provided with their configuration .yaml file used to generate them. This is a very good baseline to start with!
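As an illustration, a single service or a pre-built chain of services is selected through one key of the configuration file. The sketch below is illustrative only: the key and mode names are assumptions, so refer to each use case's README and the .yaml files shipped with the trained models for the exact schema.

```yaml
# user_config.yaml -- illustrative sketch, not the exact schema
general:
  model_path: ./pretrained_models/mobilenet_v2_0.35_128.h5   # hypothetical path

# Single entry point: one service, or a pre-built chain such as
# training -> quantization -> evaluation -> benchmarking
operation_mode: chain_tqeb

dataset:
  training_path: ./datasets/my_dataset/train     # BYOD: your own data
  validation_path: ./datasets/my_dataset/val

benchmarking:
  board: STM32N6570-DK    # target board for the ST Edge AI Developer Cloud
```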
[!TIP] All services are available for the following use cases, with quick and easy examples that can be executed for a fast ramp-up (click on the use case links below).
- Image Classification
- Object Detection
- Pose Estimation
- Semantic Segmentation
- Instance Segmentation
- Audio Event Detection
- Speech Enhancement
- Human Activity Recognition
- Hand Posture Recognition
Image classification is used to classify the content of an image within a predefined set of classes. Only one class is predicted from an input image.
Image classification (IC) models
| Models | Input Resolutions | Supported Services | Suitable Targets for deployment |
|---|---|---|---|
| MobileNet v1 0.25 | 96x96x1, 96x96x3, 224x224x3 | Full IC Services | STM32H747I-DISCO with B-CAMS-OMV camera daughter board<br>NUCLEO-H743ZI2 with B-CAMS-OMV camera daughter board |
| MobileNet v1 0.5 | 224x224x3 | Full IC Services | STM32H747I-DISCO with B-CAMS-OMV camera daughter board<br>NUCLEO-H743ZI2 with B-CAMS-OMV camera daughter board |
| MobileNet v1 1.0 | 224x224x3 | Full IC Services | STM32MP257F-EV1<br>STM32N6570-DK |
| MobileNet v2 0.35 | 128x128x3, 224x224x3 | Full IC Services | STM32H747I-DISCO with B-CAMS-OMV camera daughter board<br>NUCLEO-H743ZI2 with B-CAMS-OMV camera daughter board |
| MobileNet v2 1.0 | 224x224x3 | Full IC Services | STM32MP257F-EV1<br>STM32N6570-DK |
| MobileNet v2 1.4 | 224x224x3 | Full IC Services | STM32MP257F-EV1<br>STM32N6570-DK |
| ResNet8 v1 | 32x32x3 | Full IC Services | STM32H747I-DISCO with B-CAMS-OMV camera daughter board<br>NUCLEO-H743ZI2 with B-CAMS-OMV camera daughter board |
| ST ResNet8 | 32x32x3 | Full IC Services | STM32H747I-DISCO with B-CAMS-OMV camera daughter board<br>NUCLEO-H743ZI2 with B-CAMS-OMV camera daughter board |
| ResNet32 v1 | 32x32x3 | Full IC Services | STM32H747I-DISCO with B-CAMS-OMV camera daughter board<br>NUCLEO-H743ZI2 with B-CAMS-OMV camera daughter board |
| ResNet50 v2 | 224x224x3 | Full IC Services | STM32MP257F-EV1<br>STM32N6570-DK |
| SqueezeNet v1.1 | 128x128x3, 224x224x3 | Full IC Services | STM32H747I-DISCO with B-CAMS-OMV camera daughter board<br>NUCLEO-H743ZI2 with B-CAMS-OMV camera daughter board |
| FD MobileNet 0.25 | 128x128x3, 224x224x3 | Full IC Services | STM32H747I-DISCO with B-CAMS-OMV camera daughter board<br>NUCLEO-H743ZI2 with B-CAMS-OMV camera daughter board |
| ST FD MobileNet | 128x128x3, 224x224x3 | Full IC Services | STM32H747I-DISCO with B-CAMS-OMV camera daughter board<br>NUCLEO-H743ZI2 with B-CAMS-OMV camera daughter board |
| ST EfficientNet | 128x128x3, 224x224x3 | Full IC Services | STM32H747I-DISCO with B-CAMS-OMV camera daughter board<br>NUCLEO-H743ZI2 with B-CAMS-OMV camera daughter board |
| EfficientNet v2 | 224x224x3, 240x240x3, 260x260x3, 384x384x3 | Full IC Services | STM32MP257F-EV1<br>STM32N6570-DK |
| Mnist | 28x28x1 | Full IC Services | STM32H747I-DISCO with B-CAMS-OMV camera daughter board<br>NUCLEO-H743ZI2 with B-CAMS-OMV camera daughter board |

Full IC Services: training, evaluation, quantization, benchmarking, prediction, deployment
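As an illustration, fine-tuning one of the models above on a custom dataset (BYOD) is configured entirely in the .yaml file. The sketch below is hedged: the section and key names are assumptions, so check the image classification README for the exact schema.

```yaml
# Illustrative fine-tuning (BYOD) configuration -- key names are assumptions
operation_mode: training

dataset:
  name: my_flowers                      # hypothetical custom dataset
  class_names: [daisy, rose, tulip]
  training_path: ./datasets/my_flowers/train
  validation_path: ./datasets/my_flowers/val

training:
  model:
    name: mobilenet                     # a model family from the table above
    version: v2
    alpha: 0.35
    input_shape: (128, 128, 3)
  batch_size: 64
  epochs: 100
```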
Selecting a model for a specific task or a specific device is not always easy; relying on metrics such as inference time and accuracy, as in the example figure on Food-101 classification below, can help you make the right choice before fine-tuning your model.
Please find below some tutorials for a quick ramp up!
- How can I define and train my own model?
- How can I fine tune a pretrained model on my own dataset?
- How can I check the accuracy after quantization of my model?
- How can I quickly check the performance of my model using the dev cloud?
Image Classification top readme here
Object detection is used to detect and locate predefined objects in input images, and to estimate their probability of occurrence.
Object Detection (OD) Models
| Models | Input Resolutions | Supported Services | Targets for deployment |
|---|---|---|---|
| ST SSD MobileNet v1 0.25 | 192x192x3, 224x224x3, 256x256x3 | Full OD Services | STM32H747I-DISCO with B-CAMS-OMV camera daughter board |
| SSD MobileNet v2 fpn lite 0.35 | 192x192x3, 224x224x3, 256x256x3, 416x416x3 | Full OD Services | STM32H747I-DISCO with B-CAMS-OMV camera daughter board<br>STM32MP257F-EV1<br>STM32N6570-DK |
| SSD MobileNet v2 fpn lite 1.0 | 256x256x3, 416x416x3 | Full OD Services | STM32MP257F-EV1<br>STM32N6570-DK |
| ST Yolo LC v1 | 192x192x3, 224x224x3, 256x256x3 | Full OD Services | STM32H747I-DISCO with B-CAMS-OMV camera daughter board |
| Tiny Yolo v2 | 224x224x3, 416x416x3 | Full OD Services | STM32N6570-DK |
| ST Yolo X | 256x256x3, 416x416x3 | Full OD Services | STM32N6570-DK |
| Yolo v8 | 192x192x3, 256x256x3, 320x320x3, 416x416x3 | Evaluation / Benchmarking / Prediction / Deployment | STM32N6570-DK |

Full OD Services: training, evaluation, quantization, benchmarking, prediction, deployment
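As an illustration, quantizing a pretrained detector and benchmarking the result can be driven by the same configuration file. The sketch below is illustrative only; the quantizer and key names are assumptions, so see the object detection README for the exact options.

```yaml
# Illustrative post-training quantization setup -- key names are assumptions
operation_mode: chain_qb            # quantize, then benchmark

general:
  model_path: ./models/ssd_mobilenet_v2_fpnlite_035_256.h5   # hypothetical path

quantization:
  quantizer: TFlite_converter       # assumed converter name for .h5 models
  quantization_type: PTQ            # post-training quantization
  quantization_input_type: uint8
  quantization_output_type: float
  export_dir: quantized_models
```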
Please find below some tutorials for a quick ramp up!
- How can I use my own dataset?
- How can I fine tune a pretrained model on my own dataset?
- How can I check the accuracy after quantization of my model?
- How can I quickly check the performance of my model using the dev cloud?
- How can I quantize, evaluate and deploy an Ultralytics Yolov8 model?
Object Detection top readme here
Pose estimation detects key points on specific objects (people, hands, faces, ...). It can be single-pose, where key points are extracted from a single object, or multi-pose, where the locations of key points are estimated for all objects detected in the input images.
Pose Estimation (PE) Models
| Models | Input Resolutions | Supported Services | Targets for deployment |
|---|---|---|---|
| Yolo v8n pose | 192x192x3, 256x256x3, 320x320x3 | Evaluation / Benchmarking / Prediction / Deployment | STM32MP257F-EV1<br>STM32N6570-DK |
| MoveNet 17 kps | 192x192x3, 224x224x3, 256x256x3 | Evaluation / Quantization / Benchmarking / Prediction | STM32N6570-DK |
| ST MoveNet 13 kps | 192x192x3 | Full PE Services | STM32N6570-DK |

Full PE Services: training, evaluation, quantization, benchmarking, prediction, deployment
Please find below some tutorials for a quick ramp up!
- How can I use my own dataset?
- How to define and train my own model?
- How can I fine tune a pretrained model on my own dataset?
- How can I check the accuracy after quantization of my model?
- How can I quickly check the performance of my model using the dev cloud?
- How can I deploy an Ultralytics Yolov8 pose estimation model?
Pose Estimation top readme here
Semantic segmentation is an algorithm that associates a label with every pixel in an image. It is used to recognize collections of pixels that form distinct categories. It does not differentiate between instances of the same category, which is the main difference between instance and semantic segmentation.
Semantic Segmentation (SemSeg) Models
| Models | Input Resolutions | Supported Services | Targets for deployment |
|---|---|---|---|
| DeepLab v3 | 256x256x3, 320x320x3, 416x416x3, 512x512x3 | Full Seg Services | STM32MP257F-EV1<br>STM32N6570-DK |

Full Seg Services: training, evaluation, quantization, benchmarking, prediction, deployment
Various metrics can be used to estimate the quality of a segmentation use case. Metrics like inference time and IoU, as in the example figure on person segmentation below, can help you make the right choice before fine-tuning your model, as well as check hardware capabilities for the segmentation task.
Please find below some tutorials for a quick ramp up!
- How to define and train my own model?
- How can I fine tune a pretrained model on my own dataset?
- How can I check the accuracy after quantization of my model?
- How can I quickly check the performance of my model using the dev cloud?
Semantic Segmentation top readme here
Instance segmentation is an algorithm that associates a label with every pixel in an image and also outputs bounding boxes for the detected objects. It is used to recognize collections of pixels that form distinct categories, together with the instances of each category. It differentiates between instances of the same category, which is the main difference between instance and semantic segmentation.
Instance Segmentation (InstSeg) Models
| Models | Input Resolutions | Supported Services | Targets for deployment |
|---|---|---|---|
| yolov8n_seg | 256x256x3, 320x320x3 | Prediction / Benchmarking / Deployment | STM32N6570-DK |
Instance Segmentation top readme here
Audio event detection is used to detect a set of predefined audio events.
Audio Event Detection (AED) Models
| Models | Input Resolutions | Supported Services | Targets for deployment |
|---|---|---|---|
| miniresnet | 64x50x1 | Full AED Services | B-U585I-IOT02A using RTOS, ThreadX or FreeRTOS |
| miniresnet v2 | 64x50x1 | Full AED Services | B-U585I-IOT02A using RTOS, ThreadX or FreeRTOS |
| yamnet 256/1024 | 64x96x1 | Full AED Services | B-U585I-IOT02A using RTOS, ThreadX or FreeRTOS<br>STM32N6570-DK |

Full AED Services: training, evaluation, quantization, benchmarking, prediction, deployment
Various metrics can be used to estimate the quality of an audio event detection use case. The main ones are inference time and accuracy (percentage of correct detections) on the ESC-10 dataset, as in the example figure below. They may help you make the right choice before fine-tuning your model, as well as check hardware capabilities for the AED task.
Please find below some tutorials for a quick ramp up!
- How to define and train my own model?
- How to fine tune a model on my own dataset?
- How can I evaluate my model before and after quantization?
- How can I quickly check the performance of my model using the dev cloud?
- How can I deploy my model?
Audio Event Detection top readme here
Speech Enhancement is an algorithm that enhances audio perception in a noisy environment.
Speech Enhancement (SE) Models
| Models | Input Resolutions | Supported Services | Targets for deployment |
|---|---|---|---|
| stft_tcnn | 257x40 | Full SE Services | STM32N6570-DK |

Full SE Services: training, evaluation, quantization, benchmarking, deployment
Speech Enhancement top readme here
Human activity recognition allows recognizing various activities such as walking, running, and more.
Human Activity Recognition (HAR) Models
| Models | Input Resolutions | Supported Services | Targets for deployment |
|---|---|---|---|
| gmp | 24x3x1, 48x3x1 | Training / Evaluation / Benchmarking / Deployment | B-U585I-IOT02A using ThreadX RTOS |
| ign | 24x3x1, 48x3x1 | Training / Evaluation / Benchmarking / Deployment | B-U585I-IOT02A using ThreadX RTOS |
Please find below some tutorials for a quick ramp up!
- How to define and train my own model?
- How to fine tune a model on my own dataset?
- How can I quickly check the performance of my model using the dev cloud?
- How can I deploy my model?
Human Activity Recognition top readme here
Hand posture recognition allows recognizing a set of hand postures using a Time-of-Flight (ToF) sensor.
Hand Posture Recognition (HPR) Models
| Models | Input Resolutions | Supported Services | Targets for deployment |
|---|---|---|---|
| ST CNN 2D Hand Posture | 64x50x1 | Training / Evaluation / Benchmarking / Deployment | NUCLEO-F401RE with X-NUCLEO-53LxA1 Time-of-Flight Nucleo expansion board |
Hand Posture Recognition top readme here
The Model Zoo Dashboard is hosted in a Docker environment under the STMicroelectronics Organization. This dashboard is developed using Dash Plotly and Flask, and it operates within a Docker container. It can also run locally if Docker is installed on your system. The dashboard provides the following features:
- Training: train machine learning models.
- Evaluation: evaluate the performance of models.
- Benchmarking: benchmark your model using the ST Edge AI Developer Cloud.
- Visualization: visualize model performance and metrics.
- User Configuration Update: update and modify user configurations directly from the dashboard.
- Output Download: download model results and outputs.
You can also find our models on Hugging Face under the STMicroelectronics Organization. Each model from the STM32AI Model Zoo is represented by a model card on Hugging Face, providing all the necessary information about the model and linking to dedicated scripts.
- stm32ai_model_zoo_colab.ipynb: a Jupyter notebook that can be easily deployed on Colab to exercise STM32 model zoo training scripts.
- stm32ai_devcloud.ipynb: a Jupyter notebook that shows how to access the STM32Cube.AI Developer Cloud through ST Python APIs (based on REST APIs) instead of using the web application https://stedgeai-dc.st.com.
- stm32ai_quantize_onnx_benchmark.ipynb: a Jupyter notebook that shows how to quantize ONNX format models with fake or real data by using ONNX runtime and benchmark it by using the STM32Cube.AI Developer Cloud.
- STM32 Developer Cloud examples: a collection of Python scripts that you can use in order to get started with STM32Cube.AI Developer Cloud ST Python APIs.
- stm32ai-tao: this GitHub repository provides Python scripts and Jupyter notebooks to manage a complete life cycle of a model from training, to compression, optimization and benchmarking using NVIDIA TAO Toolkit and STM32Cube.AI Developer Cloud.
- stm32ai-nota: this GitHub repository contains Jupyter notebooks that demonstrate how to use NetsPresso to prune pre-trained deep learning models from the model zoo and fine-tune, quantize and benchmark them by using STM32Cube.AI Developer Cloud for your specific use case.
For a more in-depth guide on installing and setting up the model zoo and its requirements on your PC, especially when running behind a proxy in a corporate setup, follow the detailed wiki article on How to install STM32 model zoo.
Also note that the application code for the STM32N6 must be downloaded from https://www.st.com/en/development-tools/stm32n6-ai.html and unzipped into the application code folder.
- Create an account on myST and then sign in to STM32Cube.AI Developer Cloud to be able to access the service.
- Or, install STM32Cube.AI locally by following the instructions provided in the user manual in section 2, and get the path to the `stm32ai` executable.
  - Alternatively, download the latest version of STM32Cube.AI for your OS, extract the package, and get the path to the `stm32ai` executable.
- If you don't have Python already installed, you can download and install it from here; Python version 3.10.x is required to run the code. (For Windows systems, make sure to check the "Add python.exe to PATH" option during the installation process.)
- If using a GPU, make sure to install the GPU driver. For NVIDIA GPUs, please refer to https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html to install CUDA and cuDNN. On Windows, using WSL is not recommended to get the best GPU training acceleration. If using conda, see below for installation.
- Clone this repository using the following command:

```
git clone https://github.com/STMicroelectronics/stm32ai-modelzoo-services.git
```
- Create a Python virtual environment for the project:

```
cd stm32ai-modelzoo-services
python -m venv st_zoo
```

Activate your virtual environment. On Windows, run:

```
st_zoo\Scripts\activate.bat
```

On Unix or MacOS, run:

```
source st_zoo/bin/activate
```
- Or create a conda virtual environment for the project:

```
cd stm32ai-modelzoo-services
conda create -n st_zoo
```

Activate your virtual environment:

```
conda activate st_zoo
```

Install Python 3.10:

```
conda install -c conda-forge python=3.10
```

If using an NVIDIA GPU, install cudatoolkit and cudnn and add them to the conda path:

```
conda install -c conda-forge cudatoolkit=11.8 cudnn
```

Add cudatoolkit and cudnn to the path permanently:

```
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/' > $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
```
- Then install all the necessary Python packages; the requirements file contains them all:

```
pip install -r requirements.txt
```
In tutorials/notebooks you will find a Jupyter notebook that can be easily deployed on Colab to exercise the STM32 model zoo training scripts.
[!IMPORTANT] In this project, we are using TensorFlow version 2.8.3 due to unresolved issues with newer versions of TensorFlow, see more.
[!CAUTION] If there are white spaces in the paths (for the Python, STM32CubeIDE, or STM32Cube.AI local installation), this can result in errors, so avoid paths that contain white spaces.
[!TIP] In this project we use the `mlflow` library to log the results of the different runs. Depending on which version of Windows you are using, or where you place the project, the output log files might have a very long path, which might result in an error when logging the results. By default, Windows uses a path length limitation (MAX_PATH) of 256 characters: Naming Files, Paths, and Namespaces. To avoid this potential error, create (or edit) a variable named `LongPathsEnabled` in the Registry Editor under `Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem` and assign it a value of `1`. This changes the maximum allowed path length on Windows machines and avoids errors due to this limit. For more details, have a look at this link. Note that with Git, the command below may help solve the long path issue:

```
git config --system core.longpaths true
```