
Question about INT8 quantization of BEVFormer? #108

Open
IEIAuto opened this issue Aug 27, 2024 · 0 comments


IEIAuto commented Aug 27, 2024

Hello, and thanks for your work. While performing INT8 quantization, I ran into two issues I would like to ask about:

  1. When running PTQ quantization with the example custom plugins (onnx2trt_int8_qdp.sh), the two key CUDA kernels in the quantized model, Multi-scale Deformable Attention and Modulated Deformable Conv2d, still run in FP16 precision.
  2. When running INT8 quantization with the example custom plugins (onnx2trt_int8.sh), the same two CUDA kernels also still run in FP16 precision.

Could you explain why this happens?
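
For reference, a minimal sketch of how one might confirm which precision each layer was actually assigned, using TensorRT's EngineInspector (assuming TensorRT >= 8.2 and a hypothetical engine file name; this is not part of the repo's scripts):

```python
# Minimal sketch: dump per-layer precision from a built TensorRT engine.
# Assumes TensorRT >= 8.2 (EngineInspector API); the engine file name is hypothetical.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)

# Deserialize the engine produced by e.g. onnx2trt_int8.sh (path is an assumption).
with open("bevformer_int8.engine", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

# The inspector's JSON output lists, for each layer (including plugin layers),
# the tactic and precision TensorRT actually selected at build time.
# Full per-layer detail requires the engine to have been built with
# ProfilingVerbosity.DETAILED.
inspector = engine.create_engine_inspector()
print(inspector.get_engine_information(trt.LayerInformationFormat.JSON))
```

In general, TensorRT can only place a plugin layer in INT8 if the plugin itself advertises INT8 input/output formats in its supportsFormatCombination implementation; if the two custom kernels only declare FP32/FP16 support, the builder falls back to FP16 for them even when the INT8 flag is set, which would match the behavior described above.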
