Pure TRT has worse performance than TRT with MMDeploy SDK #2689

roman-duris · 2024-03-05T11:53:46Z

Hello,

I am currently testing my models in Triton Server and observed some interesting behaviour.
More specifically, I observed it with the RTMDet family of models.

I tested two deployment variants:

RTMDet in TRT - TensorRT backend + custom ops in .so files
RTMDet in TRT - Python backend + mmdeploy_runtime_gpu + model.py inference script

I noticed that pure TRT is 5 times slower, sometimes even more, than the Python backend with mmdeploy_runtime_gpu.
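For anyone who wants to reproduce the comparison, here is a minimal sketch of how the two variants could be timed with the same harness. It assumes each variant is wrapped in a zero-argument callable that sends one synchronous request and blocks until the result is back. The names `benchmark` and `slowdown` are hypothetical helpers, not part of Triton or MMDeploy, and this is not the exact script behind the numbers above.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(infer: Callable[[], object], warmup: int = 10, runs: int = 100) -> Dict[str, float]:
    """Time a zero-argument inference callable and return latency stats in ms.

    `infer` is a hypothetical wrapper around one synchronous request to a
    deployed model; it must block until the result has been received.
    """
    # Warm-up calls are excluded so that one-time costs (engine loading,
    # CUDA context creation, plugin loading) do not skew the comparison.
    for _ in range(warmup):
        infer()

    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    latencies_ms.sort()
    p95_index = min(len(latencies_ms) - 1, int(0.95 * len(latencies_ms)))
    return {
        "mean_ms": statistics.fmean(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "p95_ms": latencies_ms[p95_index],
        "throughput_per_s": 1000.0 * len(latencies_ms) / sum(latencies_ms),
    }


def slowdown(baseline: Dict[str, float], candidate: Dict[str, float]) -> float:
    """Return how many times slower `candidate` is than `baseline` (by median)."""
    return candidate["median_ms"] / baseline["median_ms"]
```

Both variants should be measured with the same input, batch size, and number of concurrent requests, otherwise the ratio returned by `slowdown` is not meaningful.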

I am still wondering why this is. I expected pure TRT to be faster, since executing only the TRT model without the runtime should involve fewer calls.

Any ideas?

Replies: 0 comments
