- Python Examples
TIDL provides multiple deployment options with industry-defined inference engines, as listed below. These inference engines are referred to as Open Source Runtimes (OSRT) in this document.
- TFLite Runtime: TensorFlow Lite based inference with heterogeneous execution on cortex-A** + C7x-MMA, using the TFLite Delegate API
- ONNX RunTime: ONNX Runtime based inference with heterogeneous execution on cortex-A** + C7x-MMA.
- TVM/Neo-AI RunTime: TVM/Neo-AI-DLR based inference with heterogeneous execution on cortex-A** + C7x-MMA
** Please refer to the device TRM to know which cortex-A MPU the device of interest contains
This heterogeneous execution enables:
- OSRT as the top-level inference interface for user applications
- Offloading subgraphs to C7x/MMA for accelerated execution with TIDL
- Running optimized code on the ARM core for layers that are not supported by TIDL
The OSRT offering also supports general-purpose ARM-only inference for low-end TI devices without C7x/MMA.
The diagram below illustrates the TFLite-based workflow as an example. ONNX Runtime and TVM/Neo-AI-DLR Runtime follow a similar workflow. The user needs to run the model compilation (sub-graph(s) creation and quantization) on PC, and the generated artifacts can then be used for inference on the device.
Model compilation is supported only on PC. Follow the steps outlined below to perform model compilation:
- Prepare the environment for model compilation by following the setup section here
- Run model compilation in the corresponding runtime's examples folder (examples/osrt_python/{runtime}). This step generates the artifacts needed for inference in the <repo base>/model-artifacts folder. Each subgraph is identified in the artifacts using the tensor index/name of its output in the model
cd examples/osrt_python/tfl
python3 tflrt_delegate.py -c
- Run Inference on PC using TIDL artifacts generated during compilation - The user can test the inference in host emulation mode and check the output; the output images are saved in the <repo base>/output_images folder
python3 tflrt_delegate.py
- Run Inference on PC without TIDL offload - The user can test the inference in host emulation mode without any delegation to the TIDL delegate
python3 tflrt_delegate.py -d
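For reference, a minimal sketch of what such a compilation script typically does is shown below, assuming the tflite_runtime package and the TIDL model-import library shipped in tidl_tools. The shared-object name, model path, and artifacts folder are illustrative assumptions and may differ per release.

```python
# Simplified compilation flow (sketch only): load the TIDL model-import delegate
# with compile-time options, then run calibration frames through the interpreter.
# The shared-object name, model path and artifacts folder below are illustrative.
import os
import numpy as np
import tflite_runtime.interpreter as tflite

tidl_tools_path = os.environ["TIDL_TOOLS_PATH"]

compile_options = {
    "tidl_tools_path": tidl_tools_path,
    "artifacts_folder": "../../../model-artifacts/my_model",  # hypothetical output folder
    "tensor_bits": 8,
    "accuracy_level": 1,
}

tidl_compile_delegate = tflite.load_delegate(
    os.path.join(tidl_tools_path, "tidl_model_import_tflite.so"),  # assumed library name
    compile_options,
)
interpreter = tflite.Interpreter(
    model_path="../../../models/public/mobilenet_v1_1.0_224.tflite",  # hypothetical model
    experimental_delegates=[tidl_compile_delegate],
)
interpreter.allocate_tensors()

# Each invoke() on a calibration image contributes to quantization calibration.
input_details = interpreter.get_input_details()
calibration_frame = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], calibration_frame)
interpreter.invoke()
```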
The artifacts generated by the Python scripts in the above section can be used for inference using either the Python or C/C++ APIs. The following steps perform inference using the Python API:
- Clone the repo on Device
- Copy the below folders from the PC to the device where this repo is cloned
./model-artifacts
./models
- Run the inference script in the corresponding examples folder on the device and check the results, performance, etc.
cd examples/osrt_python/tfl
python3 tflrt_delegate.py
Note : These scripts are only for basic functionality testing and performance checks. Accuracy of the models can be benchmarked using the Python module released here: edgeai-benchmark
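For reference, a minimal sketch of on-device inference through the TIDL delegate is shown below, assuming the TIDL TFLite delegate shared object is available on the device; its name and the paths used are illustrative assumptions.

```python
# Simplified on-device inference flow (sketch only): load the TIDL inference
# delegate, point it at the copied artifacts folder, and run one frame.
# The shared-object name and paths below are illustrative.
import numpy as np
import tflite_runtime.interpreter as tflite

inference_options = {
    "artifacts_folder": "../../../model-artifacts/my_model",  # artifacts copied from the PC
}

tidl_delegate = tflite.load_delegate("libtidl_tfl_delegate.so", inference_options)  # assumed name
interpreter = tflite.Interpreter(
    model_path="../../../models/public/mobilenet_v1_1.0_224.tflite",  # hypothetical model
    experimental_delegates=[tidl_delegate],
)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
dummy_input = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy_input)
interpreter.invoke()
output = interpreter.get_tensor(interpreter.get_output_details()[0]["index"])
```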
- 'options' in the TFLite interpreter / ONNX Runtime inference session / TVM compiler call, as outlined in OSRT APIs for TIDL Acceleration, are described below.
- Options are common across runtimes. Any runtime-specific requirements/constraints in terms of options are called out in the documentation below.
- Please see the compilation examples (examples/osrt_python) for how to use these options.
The following options must be specified by the user while creating inference sessions for the respective runtimes.
Name | Description | Default Values | Option Type | Additional details |
---|---|---|---|---|
tidl_tools_path | This option indicates the path to the TIDL tools to be used for model compilation. On successful completion of the setup script, the required TIDL tools are available in the tidl_tools folder. In the demo examples, this option is populated using the ${TIDL_TOOLS_PATH} environment variable set by the user | No default value, must be specified by user | Model compilation | |
artifacts_folder | TIDL model compilation generates artifacts which are used for model inference. This option specifies the path to the folder where model artifacts are saved/to be saved | No default value, must be specified by user | Model compilation / Model inference | |
TVM-specific required options, in addition to the above:
Name | Description | Supported values/range | Default values | Option Type | Additional details |
---|---|---|---|---|---|
platform | This option specifies the platform used for inference | "J7", "AM62A" | "J7" | Model compilation | |
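As an illustration, the sketch below shows how the required options might be passed when creating an ONNX Runtime compilation session. The provider names follow TI's OSRT examples but are assumptions here, as are the model path and artifacts folder; the platform option applies to the TVM flow only.

```python
# Sketch of the required options used while creating an ONNX Runtime
# compilation session; provider names follow TI's OSRT examples (assumption),
# and the model path / artifacts folder are hypothetical.
import os
import onnxruntime as rt

required_options = {
    "tidl_tools_path": os.environ["TIDL_TOOLS_PATH"],
    "artifacts_folder": "../../../model-artifacts/my_model",
    # "platform": "J7",  # required for the TVM flow only, shown here for reference
}

sess = rt.InferenceSession(
    "../../../models/public/resnet18.onnx",                        # hypothetical model
    providers=["TIDLCompilationProvider", "CPUExecutionProvider"],  # assumed provider names
    provider_options=[required_options, {}],
    sess_options=rt.SessionOptions(),
)
```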
The following options are set to default values and need to be specified only if the user requires different behavior.
Name | Description | Supported values/range | Default values | Option Type | Additional details |
---|---|---|---|---|---|
tensor_bits | This option specifies number of bits for TIDL tensor and weights | 8,16 (32 - only for PC inference, not device) | 8 | Model Compilation | |
debug_level | This option enables increasing levels of debug prints and TIDL layer traces | 0 - no debug, 1 - Level 1 debug prints, 2 - Level 2 debug prints, 3 - Level 1 debug prints + fixed point layer traces, 4 (experimental) - Level 1 debug prints + fixed and floating point traces, 5 (experimental) - Level 2 debug prints + fixed and floating point traces, 6 - Level 3 debug prints | 0 | Model compilation / Model inference | |
max_num_subgraphs | This option specifies the maximum number of subgraphs to be offloaded to TIDL for acceleration; the rest are delegated to ARM | <= 16 | 16 | Model compilation | |
accuracy_level | This option specifies the level of accuracy desired; a higher accuracy_level gives improved accuracy but may take more time for model compilation | 0 - basic calibration, 1 - higher accuracy (advanced bias calibration), 9 - user defined | 1 | Model compilation | Refer to the advanced options below for more granular control on accuracy knobs using accuracy_level = 9. Refer Quantization for more details on model quantization and accuracy |
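As an illustration, the snippet below shows how these defaults might be overridden; the keys mirror the table and the values are examples only. Such a dict would be merged into the compile-time options shown in the earlier compilation sketch.

```python
# Example overrides for the optional settings above; the keys mirror the table
# and the values are illustrative only. This dict would be merged into the
# compile-time options passed to the runtime (see the earlier compilation sketch).
optional_options = {
    "tensor_bits": 16,       # 16-bit tensors/weights: higher accuracy, lower performance
    "debug_level": 1,        # Level 1 debug prints
    "max_num_subgraphs": 4,  # offload at most 4 subgraphs to TIDL
    "accuracy_level": 1,     # advanced bias calibration
}
```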
The following options force offload of a particular layer to TIDL DSP or to ARM. These can be exercised either for debug purposes, or for performance improvement by creating a more optimal cluster where desired.
Name | Description | Supported values/range | Option Type | Additional details |
---|---|---|---|---|
deny_list:layer_type | This option forcefully disables offload of a particular operator to TIDL DSP using the layer type | Comma separated string | Model Compilation | This option is not currently available for TVM; please refer to the deny_list option |
deny_list:layer_name | This option forcefully disables offload of a particular operator to TIDL DSP using the layer name | Comma separated string | Model Compilation | This option is not currently available for TVM; please refer to the deny_list option |
deny_list | This option offers the same functionality as deny_list:layer_type | Comma separated string | Model Compilation | Maintained for backward compatibility, not recommended for TFLite/ONNX runtime |
allow_list:layer_name | This option forcefully enables offload of a particular operator to TIDL DSP using the layer name | Comma separated string | Model Compilation | Only the specified layer(s) are accelerated; all others are delegated to ARM. Experimental for TFLite/ONNX runtime and currently not applicable for TVM |
Note : The allow_list and deny_list options cannot be enabled simultaneously
Examples of usage:
Specifying layer_type as part of options:
- TFLite runtime : Specify the registration code as specified in TFLite builtin ops (please refer to TFLite builtin ops), e.g. 'deny_list:layer_type':'1, 2' to deny offloading the 'AveragePool2d' and 'Concatenation' operators to TIDL.
- ONNX runtime : Specify the ONNX operator name, e.g. "MaxPool" to deny offloading the max pooling operator to TIDL
- TVM runtime : Specify the TVM relay operator name, e.g. "nn.conv2d" to deny offloading the convolution operator to TIDL
Specifying layer_name as part of options:
- Specify the layer name as observed in Netron for the layer
- For ONNX models, the layer name may not be present as part of the layer in some cases; in such cases, the output name corresponding to output(0) of the particular layer can be specified as part of the 'deny_list:layer_name'/'allow_list:layer_name' options
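As an illustration, the snippet below shows how such deny/allow entries might look per runtime; the layer name is hypothetical and these dicts would be merged into the runtime's compile-time options.

```python
# Illustrative deny/allow entries per runtime; the layer name is hypothetical and
# these dicts would be merged into the runtime's compile-time options.
tflite_deny = {"deny_list:layer_type": "1, 2"}     # TFLite registration codes: AveragePool2d, Concatenation
onnx_deny   = {"deny_list:layer_type": "MaxPool"}  # ONNX operator name
tvm_deny    = {"deny_list": "nn.conv2d"}           # TVM relay operator name

# Layer-name based control (Netron layer name, or the output(0) name for ONNX):
name_based  = {"deny_list:layer_name": "my_layer_name"}  # hypothetical layer name
```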
The following options need to be specified to enable post-processing optimization for object detection models. Please refer to Object detection meta architectures for more details about these options.
Name | Description | Default values | Option Type | Additional details |
---|---|---|---|---|
object_detection:meta_layers_names_list | This option specifies path to the meta architecture file used to convey OD post processing information to TIDL | "" | Model Compilation | Refer Object detection meta architectures for more details |
object_detection:meta_arch_type | This option indicates the post processing architecture used by OD model | -1 (no post processing optimization) | Model compilation | Refer Object detection meta architectures for more details |
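As an illustration, the snippet below shows how these options might be added for an SSD-style detection model; the prototxt path and the meta_arch_type value are assumptions for this sketch.

```python
# Illustrative post-processing options for an SSD-style detection model; the
# prototxt path and the meta_arch_type value are assumptions for this sketch.
od_options = {
    "object_detection:meta_layers_names_list": "../../../models/public/ssd_mobilenet.prototxt",
    "object_detection:meta_arch_type": 3,
}
```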
The following options are applicable only for SoCs with multiple DSP cores/MMA support and enable additional features supported on these devices.
Name | Description | Supported values/range | Default values | Option Type | Additional details |
---|---|---|---|---|---|
advanced_options:inference_mode | This option specifies the feature/mode to be used for inference. This option must be specified during compilation and impacts the artifacts generated | 0 (TIDL_inferenceModeDefault), 1 (TIDL_inferenceModeHighThroughput), 2 (TIDL_inferenceModeLowLatency) | 0 | Model compilation | Refer Multi-DSP inference for more details |
advanced_options:num_cores | This option specifies the number of DSP cores to be used for inference | Min : 1, Max : maximum number of DSP cores available on device | 1 | Model compilation | Refer Multi-DSP inference for more details |
core_number | This option specifies the index of the core, out of all available C7x cores, to be used to execute single-core inference (this option is 1-indexed). As an example, core_number = 1 for C7x_1 | Min : 1, Max : maximum number of DSP cores available on device | 1 | Model inference | Refer Multi-DSP inference for more details |
core_start_idx | This option specifies the index of the core from which to start processing. In case of inference_mode = 1 or 2, execution happens on C7x_{core_start_idx} to C7x_{core_start_idx + advanced_options:num_cores} | Min : 1, Max : maximum number of DSP cores available on device | 1 | Model inference | Refer Multi-DSP inference for more details |
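As an illustration, the snippet below sketches how these options might be combined for a two-core low-latency setup; the values are examples only and the dicts would be merged into the compile-time and inference-time options respectively.

```python
# Illustrative multi-C7x settings (meaningful only on SoCs with multiple DSP cores).
multi_core_compile_options = {
    "advanced_options:inference_mode": 2,  # TIDL_inferenceModeLowLatency
    "advanced_options:num_cores": 2,       # split the network across two C7x cores
}

# At inference time, a default (single-core) session could instead pin execution
# to a specific core:
single_core_inference_options = {"core_number": 1}  # run on C7x_1
```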
The following options must be specified as "advanced_options:Name", where Name is as listed in the table below. For TVM, these are passed as an additional dictionary; refer to the out-of-box examples for usage.
Name | Description | Supported values/range | Default values | Option Type | Additional details |
---|---|---|---|---|---|
calibration_frames | This option specifies the number of frames to be used for calibration | Any - min 10 frames recommended | 20 | Model compilation | Applicable only for accuracy_level=1, Refer Quantization for more details |
calibration_iterations | This option specifies the number of bias calibration iterations | Any - min 10 recommended | 50 | Model compilation | Applicable only for accuracy_level=1, Refer Quantization for more details |
output_feature_16bit_names_list | This option specifies a list of names of layers, as in the original model, whose feature/activation outputs the user wants in 16 bit | Comma separated string | "" | Model compilation | Refer Quantization for more details |
params_16bit_names_list | This option specifies a list of names of output layers, as in the original model, whose parameters the user wants in 16 bit | Comma separated string | "" | Model compilation | Refer Quantization for more details |
mixed_precision_factor | This option enables the automated mixed-precision feature: it automatically decides which layers to set to 16 bit to improve accuracy, given an acceptable performance degradation. The parameter is defined as mixed_precision_factor = (acceptable latency with mixed precision / latency with 8-bit inference); e.g. if the acceptable latency for the accuracy improvement is 1.2 times the 8-bit inference latency, the automated mixed-precision algorithm finds the most optimal layers to set to 16 bit while ensuring the performance constraint set by mixed_precision_factor is satisfied | Any float value > 1 | -1 (No automated mixed precision) | Model compilation | Refer Quantization for more details |
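As an illustration, the snippet below sketches both forms: flat "advanced_options:" keys as used with TFLite/ONNX Runtime compile options, and the nested dictionary form assumed for TVM. Layer names and values are examples only.

```python
# Flat "advanced_options:" keys as used with TFLite/ONNX Runtime compile options;
# layer names and values are illustrative only.
flat_advanced_options = {
    "advanced_options:calibration_frames": 20,
    "advanced_options:calibration_iterations": 50,
    "advanced_options:output_feature_16bit_names_list": "conv1,fc1",  # hypothetical layer names
}

# Assumed TVM form: the same settings grouped under a nested dictionary.
tvm_advanced_options = {
    "advanced_options": {
        "calibration_frames": 20,
        "calibration_iterations": 50,
    }
}
```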
The options below are used only if accuracy_level = 9 and are discarded otherwise. With accuracy_level = 9, the default value of each specified option is overwritten while the remaining options keep their default values. For accuracy_level = 0/1, these are preset internally to default values. Please refer to Quantization for more details on the individual options.
Name | Description | Default values |
---|---|---|
advanced_options:activation_clipping | 0 for disable, 1 for enable | 1 |
advanced_options:weight_clipping | 0 for disable, 1 for enable | 1 |
advanced_options:bias_calibration | 0 for disable, 1 for enable | 1 |
advanced_options:channel_wise_quantization | 0 for disable, 1 for enable | 0 |
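As an illustration, an accuracy_level = 9 configuration overriding only some of these knobs might look as follows; the chosen values are examples only.

```python
# Illustrative accuracy_level = 9 configuration: only the listed knobs are
# overridden, the remaining ones keep their defaults (values are examples only).
accuracy_9_options = {
    "accuracy_level": 9,
    "advanced_options:bias_calibration": 1,
    "advanced_options:channel_wise_quantization": 1,
}
```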
Name | Description | Supported values/range | Default values | Option Type | Additional details |
---|---|---|---|---|---|
advanced_options:quantization_scale_type | This option specifies the type of quantization style to be used for model quantization | 0 - non-power-of-2, 1 - power-of-2, 3 - TF-Lite pre-quantized model, 4 - asymmetric per-channel quantization | 0 | Model compilation | Refer Quantization for more details |
advanced_options:quant_params_proto_path | This option allows you to configure quantization scales manually by specifying the min/max values of outputs | String | "" | Model compilation | Refer to Quantization Parameters for further details |
advanced_options:prequantized_model | This option enables reading of scales and zero points from an ONNX QDQ model and bypasses the need for calibration | 0 - disable, 1 - enable | 0 | Model compilation | This impacts only ONNX models; for TF-Lite models, quantization_scale_type=3 has the same effect |
advanced_options:high_resolution_optimization | This option enables performance optimization for high resolution models | 0 - disable, 1 - enable | 0 | Model compilation | |
advanced_options:add_data_convert_ops | This option embeds input and output format conversions (layout, data type, etc.) as part of the model and performs them on the DSP instead of the ARM | 0 - disable, 1 - input format conversion, 2 - output format conversion, 3 - input and output format conversion | 0 | Model compilation | This is currently an experimental feature |
advanced_options:network_name | This option allows the user to set the network name (used for the name of the subgraph being delegated to C7x/MMA). If the model contains a network name, it is used by default | String | "Subgraph" | Model compilation | |
advanced_options:c7x_firmware_version | Refer to the Release version convention. If you are using the firmware released as part of the Processor SDK RTOS, this field can be ignored. If you are using a TIDL firmware released with a new patch release of the same "release line", it is essential to set c7x_firmware_version explicitly | String | "10_00_02_00" | Model compilation | Possible values are "10_00_04_00" & "10_00_02_00" |
advanced_options:partial_init_during_compile | This option allows the user to enable partial initialization of handles during model compilation to reduce the runtime initialization time | 0 - disable, 1 - enable | 0 | Model compilation | - |
advanced_options:batch_mode | This option allows the user to enable batch stitching | 0 - disable, 1 - enable | 0 | Model compilation | - |
advanced_options:log_file_name | This option allows the user to redirect the output logs to a file | String | "" | Model compilation | |
advanced_options:single_core_layers_names_list | This option allows the user to specify layers to run on single core in multi core inference | Comma separated string | "" | Model inference | - |
model_type | This option communicates to the TIDL import library that the specified model is an object detection model | "OD" | "" | Model compilation | This option needs to be set to "OD" only if the model is an object detection model and compilation throws a warning asking to explicitly specify this option as "OD"; otherwise it can be ignored |
c7x_codegen | This option is used to enable running TIDL-unsupported layers on the DSP using the TVM auto code generation feature | 0 - Run TIDL-unsupported layers on ARM, 1 - Run TIDL-unsupported layers on DSP | 0 | Model compilation | This is a TVM-specific feature and has undergone limited validation [3] |
ti_internal_nc_flag | Internal use only | - | - | - | - |
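As an illustration, compiling a pre-quantized ONNX QDQ model might combine the related options as follows; this is a sketch only, with values taken from the table above.

```python
# Illustrative compile-time settings for a pre-quantized ONNX QDQ model:
# read scales/zero points from the model and bypass calibration.
prequantized_options = {
    "advanced_options:quantization_scale_type": 4,  # asymmetric, per-channel quantization
    "advanced_options:prequantized_model": 1,       # ONNX QDQ models only
}
```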
- [1]: Specifying layer_type as part of the deny_list option:
  - TFLite runtime : Specify the registration code as specified in TFLite builtin ops (please refer to TFLite builtin ops), e.g. "1, 2" to deny offloading the 'AveragePool2d' and 'Concatenation' operators to TIDL.
  - ONNX runtime : Specify the ONNX operator name, e.g. "MaxPool" to deny offloading the max pooling operator to TIDL.
  - TVM runtime : Specify the TVM relay operator name, e.g. "nn.conv2d" to deny offloading the convolution operator to TIDL.
- [2]: ONNX runtime - In case the layer name is not present as part of the layer in the model, the output name corresponding to output(0) for the particular layer can be specified.
- [3]: Running TIDL-unsupported layers on the DSP with the parameter "c7x_codegen=1" requires Processor SDK 8.2 or newer. This feature has only been validated with selected models in TI's Edgeai-benchmark that use the TVM flow. We will continue to work on this feature to improve operator coverage and generate more performant DSP code. If your model encounters problems with this feature, please set "c7x_codegen=0" and run the TIDL-unsupported layers on ARM.
Refer to the Troubleshooting section if any issues are observed during compilation of custom models.