Pure TRT has worse performance than TRT with MMDeploy SDK #2689
Unanswered
roman-duris asked this question in Q&A
Hello,
I am currently testing my models in Triton Server and observed some interesting behaviour, specifically with the RTMDet family of models.
I tested two variants for deployment: pure TRT, and the python backend with mmdeploy_runtime_gpu.
I noticed that with pure TRT the speed is 5 times lower, sometimes even more, than with the python backend and mmdeploy_runtime_gpu.
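One way to make a comparison like this reproducible is to time both variants with the same harness. Below is a minimal sketch, assuming each variant is wrapped in a zero-argument callable that blocks until the result is ready. The names `benchmark` and `fake_infer` are hypothetical placeholders for illustration, not part of Triton, TensorRT, or MMDeploy.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(infer: Callable[[], object], warmup: int = 10, runs: int = 100) -> Dict[str, float]:
    """Time a zero-argument inference callable; return latency stats in milliseconds."""
    # Warm-up calls are not timed: the first requests often include one-off
    # costs such as lazy initialisation or memory allocation.
    for _ in range(warmup):
        infer()

    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        # The callable must block until the output is available; otherwise
        # only the time to submit the request is measured.
        infer()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.mean(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "max_ms": max(latencies_ms),
    }


def fake_infer() -> None:
    # Hypothetical stand-in for a real request to either deployment variant.
    time.sleep(0.001)


if __name__ == "__main__":
    print(benchmark(fake_infer, warmup=2, runs=5))
```

Running the same `benchmark` call against each variant, with identical inputs and batch size, would show whether the gap comes from the inference step itself or from something around it such as pre- and post-processing.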
I am still wondering why this is. I would expect pure TRT to be faster, as there should be fewer calls when executing only the TRT model without the runtime.
Any ideas?