
Creating integer only models #66
Open
StuartIanNaylor opened this issue Nov 4, 2022 · 11 comments

@StuartIanNaylor

Nils, is it possible to create an integer-only model so this could run on accelerators or frameworks such as ArmNN?
https://www.tensorflow.org/lite/performance/post_training_quantization#full_integer_quantization

I always get confused about how to implement representative_dataset().

import tensorflow as tf
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
tflite_quant_model = converter.convert()

Has anyone done this and got an example, or even better, the tflite models?
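For reference, representative_dataset is just a generator that yields a small number of sample inputs matching the model's input signature, which the converter uses to calibrate quantization ranges. A minimal sketch (the (1, 512) shape is a placeholder, not the actual DTLN input shape; in practice you would yield real audio frames rather than random noise):

```python
import numpy as np

def representative_dataset():
    # Yield ~100 calibration samples. Each yield is a list with one array
    # per model input, matching the input's shape and dtype exactly.
    for _ in range(100):
        yield [np.random.uniform(-1, 1, (1, 512)).astype(np.float32)]
```

This function is then assigned to converter.representative_dataset as in the snippet above.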

@jeungmin717 commented Nov 14, 2022

@StuartIanNaylor

In my case, full-integer quantization for this double-stacked LSTM model is not available. The calculation inside the model still remains float32 when I (dynamically) quantized it.

Per the official TensorFlow documentation, full-integer (static) quantization is not available for LSTM.

Check the issue below as well:
tensorflow/tensorflow#25563

I think research on fully quantizing LSTM models is still under construction.
Hope this can give you some help :)

@StuartIanNaylor (Author)

That is a massive help, and many thanks for the bad news, as it will save much wasted time.

Damn! :( That means a lot of frameworks, from ArmNN to NPUs, cannot run it except on CPU.
I will leave this open so people can see your great info.
Many thanks

@jeungmin717 commented Nov 14, 2022

@StuartIanNaylor
Glad I could be of some help.
It's too bad that it cannot be fully quantized for ArmNN or microcontrollers.
But in my opinion it already achieves real-time performance on CPU (even in the worst case, maybe?), which removes the need for a fully quantized model if your hardware has a CPU.
An amazing achievement breizhn has made.

@StuartIanNaylor (Author) commented Nov 14, 2022

It's no criticism of what breizhn produced, just the realisation of what even better and further optimisation could achieve, whilst also dropping Python for a more performant DSP C/Rust environment.
There are so many devices now with Mali GPUs that could maybe have run a quantized model via ArmNN, and the same for embedded NPUs.
This lies with the ML frameworks, especially TensorFlow, and maybe ONNX; why recurrent networks such as LSTM or GRU are so problematic is beyond my knowledge level, but I can appreciate the limitations.

@JorgeRuizDev

A few months ago I managed to quantize this LSTM model and run it on a Coral Edge TPU
https://colab.research.google.com/github/google-coral/tutorials/blob/master/train_lstm_timeseries_ptq_tf2.ipynb

The example has been broken since TF 2.7...

@StuartIanNaylor (Author)

A few months ago I managed to quantize this LSTM model and run it on a Coral Edge TPU https://colab.research.google.com/github/google-coral/tutorials/blob/master/train_lstm_timeseries_ptq_tf2.ipynb

The example has been broken since TF 2.7...

Yeah, it's confusing, as post-training quantization of recurrent layers does seem to be broken. Dunno.

@WaterBoiledPizza commented Dec 30, 2022

If I may ask, how do you plan to convert this model to integer only?

  1. The mask produced by the model ranges from 0 to 1. Is it possible to train the integer-only model to produce a mask ranging from 0 to 255?
  2. If the states are changed to integer only, it will affect the LSTM's/RNN's performance. So how do you keep the difference minimal?

@JorgeRuizDev

You map 0 to -128 and 1 to 127, and all the intermediate values are then quantized into that interval.
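That mapping is just standard affine int8 quantization. A minimal sketch of the arithmetic (the scale and zero point here are the ones implied by mapping [0, 1] onto [-128, 127]; TFLite chooses these per-tensor from the calibration data):

```python
import numpy as np

# q = round(x / scale) + zero_point, chosen so that
# 0.0 -> -128 and 1.0 -> 127 for a mask in [0, 1].
scale = 1.0 / 255.0
zero_point = -128

def quantize(x):
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize(q):
    # Recover an approximate float mask; the round-trip error is
    # at most half a quantization step (~0.002 here).
    return (q.astype(np.float32) - zero_point) * scale
```

So every intermediate mask value carries at most ~1/255 of error from the round trip.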

If the network is tightly fitted, quantization can destroy its performance, and you need to use alternative methods that only a few experimental/research frameworks support...

In other cases, the network will just produce a similar output with some extra error.

I think Quantization-Aware Training with RNNs is still in an experimental phase, but you can check out QKeras, a QAT library that partially supports this type of training for RNNs.

@WaterBoiledPizza

I noticed that the value of the states keeps going up as the model processes the audio, so how should I quantize it within the int8 limit?
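One common approach to an unbounded tensor like this (an assumption on my part, not anything the DTLN authors prescribe) is to calibrate a per-tensor scale from the state values observed over a representative run, then clip anything that later drifts outside that range:

```python
import numpy as np

def calibrate_scale(observed_states, percentile=99.9):
    # Using a high percentile instead of the absolute max limits the
    # influence of rare outliers, at the cost of clipping them later.
    max_abs = np.percentile(np.abs(observed_states), percentile)
    return max_abs / 127.0

def quantize_state(state, scale):
    # Symmetric int8 quantization: values past the calibrated range
    # saturate at +/-127 rather than overflowing.
    q = np.round(state / scale)
    return np.clip(q, -127, 127).astype(np.int8)
```

Clipping trades a little accuracy on outlier states for a fixed int8 range; if the states genuinely grow without bound, that growth would still have to be tamed in the model itself (e.g. by activation choice), which is part of why quantizing LSTMs is hard.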

@nyadla-sys

@heisenberg-kim

https://github.com/heisenberg-kim/lstm_in_the_unet

Quantized model from DTLN.
