Creating integer only models #66
In my case, full-integer quantization for this double-stacked LSTM model is not available. According to the official TensorFlow documentation, full-integer (static) quantization for LSTMs is not supported. Check the issue below as well; I think research on fully quantized LSTM models is still in progress.
That is a massive help, and many thanks for the bad news, as it will save much wasted time. Damn! :( There are a lot of frameworks, such as ArmNN for NPUs, that cannot run it then, other than on CPU.
@StuartIanNaylor |
It's no criticism of what breizhn produced, just the realisation of what even better and further optimisation, whilst also dropping Python for a more performant DSP C/Rust environment, could achieve.
A few months ago I managed to quantize this LSTM model and run it on a Coral Edge TPU. The example has been broken since TF 2.7...
Yeah, it's confusing, as post-training quantization of recurrent layers does seem to be broken, dunno.
If I may ask, how do you plan to convert this model to integer only?
You map 0 to -128 and 1 to 127, and all the intermediate values are then quantized into that interval. If the network is tightly fitted, quantization can destroy its performance, and you need to use alternative methods that only a few experimental/research frameworks support... In other cases, the network will just produce a similar output with some extra error. I think that quantization-aware training (QAT) with RNNs is still in an experimental phase, but you can check out QKeras, a QAT library that partially supports this type of training for RNNs.
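The mapping described above is standard affine int8 quantization. A minimal sketch (plain NumPy, not from the repo) showing how a [0, 1] range maps onto [-128, 127]:

```python
import numpy as np

def quantize(x, scale, zero_point):
    """Affine int8 quantization: q = round(x / scale) + zero_point."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize(q, scale, zero_point):
    """Inverse mapping back to float; the round-trip error is at most scale/2."""
    return (q.astype(np.float32) - zero_point) * scale

# For values in [0, 1] mapped onto the full int8 range [-128, 127]:
scale = 1.0 / 255.0
zero_point = -128

x = np.array([0.0, 0.5, 1.0], dtype=np.float32)
q = quantize(x, scale, zero_point)
# q[0] == -128 (maps 0.0), q[2] == 127 (maps 1.0)
```

The quantization error this introduces is bounded by the scale, which is why a "tightly fitted" network, whose outputs depend on differences smaller than that scale, can degrade badly.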
I noticed that the values of the states keep going up as the model processes the audio, so how should I quantize them within the int8 limit?
@ST4 use the attached quantized DTLN tflite model:
https://github.com/heisenberg-kim/lstm_in_the_unet (a quantized model derived from DTLN)
Nils, is it possible to create an integer-only model so this could run on accelerators or frameworks such as ArmNN?
https://www.tensorflow.org/lite/performance/post_training_quantization#full_integer_quantization
I always get confused about how to implement representative_dataset().
Has anyone done this and got an example, or even better, the tflite models?
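For the representative_dataset question: it is just a generator that yields a few hundred real inputs, which the converter runs through the float model to calibrate the int8 ranges. A minimal sketch, using a toy Dense model and random frames as stand-ins (in practice you would use the trained DTLN model and real STFT frames from the training data; all names here are illustrative, not from the repo):

```python
import numpy as np
import tensorflow as tf

# Toy stand-in model; in practice this would be the trained DTLN model
# (and conversion would fail on its stateful LSTM layers, as noted above).
inputs = tf.keras.Input(shape=(257,))
outputs = tf.keras.layers.Dense(128, activation="sigmoid")(inputs)
model = tf.keras.Model(inputs, outputs)

# Calibration data: should be real input frames covering the actual
# data distribution, not random noise as used here for illustration.
calibration_frames = np.random.rand(100, 257).astype(np.float32)

def representative_dataset():
    # Yield one input at a time, wrapped in a list, with a batch dimension
    # matching the model's input signature.
    for frame in calibration_frames:
        yield [frame[np.newaxis, :]]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full-integer ops so the model can run on integer-only accelerators;
# the converter raises an error if any layer has no int8 kernel.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
```

The key point is that the generator's yielded shapes and dtype must match the model's input exactly; most "representative_dataset" confusion comes from forgetting the batch dimension or the wrapping list.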