Achieved real-time on RTX 3090 using TensorRT, reaching speeds of 30+ FPS. #150

warmshao · 2024-07-16T17:28:03Z

My implementation: https://github.com/warmshao/FasterLivePortrait
New Features:

Achieved real-time performance of LivePortrait on RTX 3090 GPU using TensorRT, reaching speeds of 30+ FPS.
Implemented conversion of LivePortrait model to ONNX format, achieving inference speed of approximately 70ms/frame (~12 FPS) using onnxruntime-gpu on RTX 3090, facilitating cross-platform deployment.
Seamlessly integrated support for native Gradio app, delivering several times faster speed and enabling simultaneous inference on multiple faces. Sample results available at: PR add multi faces supported #105
Refactored code structure to eliminate PyTorch dependency. All models now use ONNX or TensorRT for inference.
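For reference, the FPS and ms/frame figures above are reciprocals of each other. The following is a minimal sketch of that arithmetic; the helper names are hypothetical and not part of FasterLivePortrait. Note that 70 ms of inference alone works out to roughly 14 FPS, so the quoted ~12 FPS presumably includes per-frame work beyond the model call. That reading is an assumption, not something stated in the post.

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Frames per second achievable if each frame takes latency_ms milliseconds."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def latency_budget_ms(target_fps: float) -> float:
    """Maximum per-frame time in milliseconds that still sustains target_fps."""
    if target_fps <= 0:
        raise ValueError("target FPS must be positive")
    return 1000.0 / target_fps


if __name__ == "__main__":
    # Figures taken from the post: 70 ms/frame (ONNX), 30 FPS (TensorRT), ~12 FPS (ONNX).
    print(f"{fps_from_latency_ms(70):.1f} FPS at 70 ms/frame")
    print(f"{latency_budget_ms(30):.1f} ms/frame budget for 30 FPS")
    print(f"{latency_budget_ms(12):.1f} ms/frame implied by 12 FPS")
```

In other words, reaching 30+ FPS requires the whole per-frame pipeline to finish in about 33 ms or less.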

FurkanGozukara · 2024-07-16T21:05:55Z

amazing work congrats

warmshao · 2024-07-16T23:04:54Z

amazing work congrats

thanks! The speed is truly unbelievably fast. Perhaps it can be used for some interesting applications.

galigaligo · 2024-07-17T00:08:38Z

I still need to compile onnxruntime-gpu myself, which is a bit discouraging.

warmshao · 2024-07-17T00:12:30Z

I still need to compile onnxruntime-gpu myself, which is a bit discouraging.

The latest onnxruntime-gpu still doesn't support grid_sample on CUDA, so we need to build it from source. But I will upload a Docker image soon, stay tuned!
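Before building from source, it can help to see which execution providers the installed onnxruntime build exposes. The sketch below is an illustration only: `pick_providers` and the preference order are assumptions, and a listed CUDA provider does not by itself prove that grid_sample runs on the GPU, which is the limitation described above.

```python
# Provider names as used by onnxruntime; the ordering is an assumed preference.
PREFERRED = (
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
)


def pick_providers(available):
    """Return the preferred providers that are actually available, in order."""
    chosen = [name for name in PREFERRED if name in available]
    # Fall back to CPU if nothing from the preference list is reported.
    return chosen or ["CPUExecutionProvider"]


if __name__ == "__main__":
    try:
        import onnxruntime as ort  # optional: only needed for the live check

        available = ort.get_available_providers()
    except ImportError:
        available = []
        print("onnxruntime is not installed; assuming CPU only")
    print("available:", available)
    print("would use:", pick_providers(available))
```

The resulting list is in the form accepted by the `providers` argument of `onnxruntime.InferenceSession`.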

juntaosun · 2024-07-17T05:41:15Z

Very good, it runs at a steady 20 FPS on an RTX 3080. 👍️

juntaosun · 2024-07-17T05:56:58Z

FasterLivePortrait.mp4

shaoguowen · 2024-07-17T08:45:27Z

FasterLivePortrait.mp4

wow, cool! Are you using tensorrt or onnx?

warmshao · 2024-07-17T11:11:00Z

Very good, it runs at a steady 20 FPS on an RTX 3080. 👍️

cool

warmshao · 2024-07-17T14:19:16Z

Hi guys, I have uploaded a Docker image for running https://github.com/warmshao/FasterLivePortrait. Please try it out. I will also provide integration packages for Windows and macOS that support one-click run. Stay tuned.

vpckso · 2024-07-18T03:30:40Z

Thanks, but not working

Could you share your Dockerfile so that I can build myself?

warmshao · 2024-07-18T04:11:56Z

Thanks, but not working

Could you share your Dockerfile so that I can build myself?

You can try installing pycuda yourself: pip install pycuda. Actually, I followed the README tutorial step by step to install everything in the container and then committed it, so there is no Dockerfile.

vpckso · 2024-07-18T07:05:55Z

nvidia-smi also failed, so the image cannot be used and I must build from scratch.
But I got a lot of compile errors when following the README.
It seems some library versions are not compatible.

shaoguowen · 2024-07-18T09:58:10Z

nvidia-smi also failed, so the image cannot be used and I must build from scratch. But I got a lot of compile errors when following the README. It seems some library versions are not compatible.

Please refer to this: warmshao/FasterLivePortrait#8

vpckso · 2024-07-19T03:15:52Z

Thanks, it works after fixing libcuda.so.1 and libnvidia-ml.so.1.
I also needed to edit scripts/all_onnx2trt.sh so that it points to retinaface_det_static.onnx and face_2dpose_106_static.onnx.
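A quick way to check whether the two driver libraries mentioned above can be found inside a container is to try loading them with `ctypes`. This is only a sketch under the assumption that a load failure is the symptom; the helper names are hypothetical, and it does not say how the libraries were actually fixed in this case.

```python
import ctypes

# Library names taken from the comment above.
REQUIRED_LIBS = ("libcuda.so.1", "libnvidia-ml.so.1")


def can_load(lib_name: str) -> bool:
    """Return True if the dynamic loader can find and open lib_name."""
    try:
        ctypes.CDLL(lib_name)
    except OSError:
        return False
    return True


def missing_libs(names=REQUIRED_LIBS):
    """Return the subset of names that the dynamic loader cannot open."""
    return [name for name in names if not can_load(name)]


if __name__ == "__main__":
    missing = missing_libs()
    if missing:
        print("cannot load:", ", ".join(missing))
    else:
        print("all NVIDIA driver libraries loaded")
```

If either library is reported as missing, tools such as nvidia-smi are unlikely to work in that environment either.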

RTX 3060 with official PyTorch, source/s6.jpg + driving/d0.mp4:
real 0m16.065s
user 0m19.367s
sys 0m1.738s

With the compiled model, it is around 3 s faster.

TensorRT:
real 0m7.773s
user 0m11.793s
sys 0m11.129s
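To put the two `time` results side by side, the wall-clock ("real") values can be parsed and divided. The sketch below uses hypothetical helper names and only the numbers reported above; these are end-to-end timings of a whole run, so they are not directly comparable to per-frame FPS figures.

```python
import re

# `time` output copied from the two runs reported above.
PYTORCH_RUN = "real 0m16.065s\nuser 0m19.367s\nsys 0m1.738s"
TENSORRT_RUN = "real 0m7.773s\nuser 0m11.793s\nsys 0m11.129s"


def parse_real_seconds(time_output: str) -> float:
    """Extract the 'real' (wall-clock) time in seconds from `time` output."""
    match = re.search(r"real\s+(\d+)m([\d.]+)s", time_output)
    if match is None:
        raise ValueError("no 'real' line found")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def speedup(baseline_output: str, optimized_output: str) -> float:
    """Ratio of baseline wall-clock time to optimized wall-clock time."""
    return parse_real_seconds(baseline_output) / parse_real_seconds(optimized_output)


if __name__ == "__main__":
    ratio = speedup(PYTORCH_RUN, TENSORRT_RUN)
    print(f"TensorRT run is {ratio:.2f}x faster end to end")
```

On these numbers the TensorRT run finishes in a little under half the wall-clock time of the PyTorch run.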

warmshao · 2024-07-23T14:04:36Z

An install-free, extract-and-play Windows package with TensorRT support is now available! Please refer to the FasterLivePortrait releases. It is really fast and very convenient!

falconwingz88 · 2024-08-08T11:21:29Z

Will this work with a video target?

warmshao · 2024-08-08T11:23:36Z

Will this work with a video target?

yes

juntaosun mentioned this issue Jul 17, 2024

💡About the inference speed of onnx #144

Closed

cleardusk added ONNX TensorRT labels Jul 17, 2024
