ONNX Runtime offers a number of precision options to choose from.
By default, fp32 (float32) is used. Switching to float16 or mixed precision can roughly halve the model's size and significantly improve performance, especially as the model scales in the future. A conversion sketch follows below.
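For reference, converting a saved fp32 model to float16 can be done with the onnxconverter-common package, per the float16 doc linked below. A minimal sketch (file names are placeholders):

```python
# Convert a saved fp32 ONNX model to float16.
# Requires: pip install onnx onnxconverter-common
import onnx
from onnxconverter_common import float16

model = onnx.load("model_fp32.onnx")  # placeholder path
# keep_io_types=True keeps the graph inputs/outputs as float32,
# so callers don't need to change how they feed the model.
model_fp16 = float16.convert_float_to_float16(model, keep_io_types=True)
onnx.save(model_fp16, "model_fp16.onnx")
```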
Will look into https://github.com/microsoft/onnxruntime further to customize the model's precision to best suit our use case.
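If straight float16 costs too much accuracy, the same package also documents an auto mixed-precision pass that only converts nodes where the outputs stay within tolerance. A sketch, assuming `test_data` is a sample feed dict ({input_name: numpy array}) for the model:

```python
# Mixed precision: keep accuracy-sensitive ops in fp32, convert the rest to fp16.
import onnx
from onnxconverter_common import auto_mixed_precision

model = onnx.load("model_fp32.onnx")  # placeholder path
# test_data is a hypothetical sample input; tolerances control how much
# output drift from the fp32 baseline is acceptable.
model_amp = auto_mixed_precision.auto_convert_mixed_precision(
    model, test_data, rtol=0.01, atol=0.001, keep_io_types=True
)
onnx.save(model_amp, "model_mixed.onnx")
```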
Refs:
https://medium.com/data-science-at-microsoft/model-compression-and-optimization-why-think-bigger-when-you-can-think-smaller-216ec096f68b
https://onnxruntime.ai/docs/performance/model-optimizations/float16.html