Achieved real-time on RTX 3090 using TensorRT, reaching speeds of 30+ FPS. #150

Open
warmshao opened this issue Jul 16, 2024 · 17 comments

@warmshao

My implementation: https://github.com/warmshao/FasterLivePortrait
New Features:

  1. Achieved real-time performance of LivePortrait on RTX 3090 GPU using TensorRT, reaching speeds of 30+ FPS.
  2. Implemented conversion of LivePortrait model to ONNX format, achieving inference speed of approximately 70ms/frame (~12 FPS) using onnxruntime-gpu on RTX 3090, facilitating cross-platform deployment.
  3. Seamlessly integrated support for native Gradio app, delivering several times faster speed and enabling simultaneous inference on multiple faces. Sample results available at: PR add multi faces supported #105
  4. Refactored code structure to eliminate PyTorch dependency. All models now use ONNX or TensorRT for inference.
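
For anyone who wants to reproduce ms/frame and FPS figures like the ones above on their own GPU, here is a minimal timing sketch. It is not taken from FasterLivePortrait: `infer` is a hypothetical stand-in for whatever per-frame inference call you use (ONNX or TensorRT), and the warm-up count is an arbitrary assumption.

```python
import time


def measure_fps(infer, frames, warmup=3):
    """Return (ms_per_frame, fps) for calling `infer` once per frame.

    `infer` is a hypothetical callable standing in for the real
    per-frame inference function; it is not part of any library.
    """
    frames = list(frames)
    if not frames:
        raise ValueError("need at least one frame to time")

    # Warm-up calls are not timed, so one-off initialisation cost
    # does not distort the steady-state number.
    for frame in frames[:warmup]:
        infer(frame)

    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start

    ms_per_frame = 1000.0 * elapsed / len(frames)
    return ms_per_frame, 1000.0 / ms_per_frame
```

Usage would look like `ms, fps = measure_fps(my_infer, my_frames)`, where both names are placeholders for your own function and frame list.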
@FurkanGozukara

amazing work congrats

@warmshao
Author

> amazing work congrats

thanks! The speed is truly unbelievably fast. Perhaps it can be used for some interesting applications.

@galigaligo

I still need to compile onnxruntime-gpu myself, which is a bit discouraging

@warmshao
Author

> I still need to compile onnxruntime-gpu myself, which is a bit discouraging

The latest onnxruntime-gpu still doesn't support grid_sample on CUDA, so we need to build it from source. But I will upload a Docker image soon, stay tuned!
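
After building from source, a quick sanity check is to confirm that the installed onnxruntime exposes the CUDA execution provider at all. The sketch below uses the standard `onnxruntime.get_available_providers()` call; the helper name is made up for illustration. Note that it only shows the provider is present in the build. It does not prove that a specific operator such as grid_sample has a CUDA kernel.

```python
def cuda_provider_available():
    """Return True if onnxruntime is importable and lists the CUDA provider.

    Hypothetical helper for illustration; it says nothing about
    per-operator CUDA support such as grid_sample.
    """
    try:
        import onnxruntime as ort
    except ImportError:
        # onnxruntime is not installed in this environment.
        return False
    return "CUDAExecutionProvider" in ort.get_available_providers()


if __name__ == "__main__":
    print("CUDA provider available:", cuda_provider_available())
```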

@juntaosun

juntaosun commented Jul 17, 2024

Very good, it runs at a steady 20 FPS on an RTX 3080. 👍️

@juntaosun

FasterLivePortrait.mp4

@shaoguowen

> FasterLivePortrait.mp4

wow, cool! Are you using tensorrt or onnx?

@warmshao
Author

> Very good, it runs at a steady 20FPS on RTX 3080 . 👍️

cool

@warmshao
Author

Hi guys, I have uploaded a Docker image for running https://github.com/warmshao/FasterLivePortrait. Please try it out. I will also provide one-click integration packages for Windows and macOS. Stay tuned.

@vpckso

vpckso commented Jul 18, 2024

Thanks, but it's not working:
[image: fastliveportrait-docker]
[image: docker]

Could you share your Dockerfile so that I can build myself?

@warmshao
Author

> Thanks, but not working fastliveportrait-docker docker
>
> Could you share your Dockerfile so that I can build myself?

You can try installing pycuda yourself: pip install pycuda. Actually, I followed the README tutorial step by step to install everything in the container and then committed it, so there is no Dockerfile.

@vpckso

vpckso commented Jul 18, 2024

nvidia-smi also failed, so the image cannot be used and I must build from scratch.
But I got a lot of compile errors when following the README;
it seems some library versions are not compatible.
[image: errors]

@shaoguowen

> nvidia-smi also failed so the image cannot be used, I must build from scratch but I got a lot of compile error when follow the readme seems some libraries version not compatible errors

Please refer to this: warmshao/FasterLivePortrait#8

@vpckso

vpckso commented Jul 19, 2024

Thanks, it works after fixing libcuda.so.1 and libnvidia-ml.so.1.
I also needed to change scripts/all_onnx2trt.sh to use retinaface_det_static.onnx and face_2dpose_106_static.onnx.
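
In case it helps others hitting the same problem, here is a small sketch for checking from inside the container whether those two driver libraries can be loaded by the dynamic linker. It uses only the standard `ctypes` module; the function name is made up for illustration, and a `False` result only tells you the library was not found or could not be loaded, not why.

```python
import ctypes


def check_shared_libs(names=("libcuda.so.1", "libnvidia-ml.so.1")):
    """Try to load each shared library and report whether it succeeded.

    Hypothetical helper for illustration; the default names are the
    two libraries mentioned in this comment.
    """
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = True
        except OSError:
            # Raised when the linker cannot find or load the library.
            results[name] = False
    return results


if __name__ == "__main__":
    for lib, ok in check_shared_libs().items():
        print(f"{lib}: {'found' if ok else 'missing'}")
```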

RTX 3060 with official PyTorch, source/s6.jpg + driving/d0.mp3:
real 0m16.065s
user 0m19.367s
sys 0m1.738s

A compiled model saves around 3 s.

TensorRT:
real 0m7.773s
user 0m11.793s
sys 0m11.129s
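
To put the two `real` timings above side by side, the sketch below parses the `time` output format and computes the wall-clock ratio. The helper name is made up for illustration. The numbers are the ones reported in this comment; they cover the whole run, so presumably they include model loading and not only per-frame inference.

```python
import re


def parse_time(value):
    """Convert a `time` duration such as '0m16.065s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Wall-clock ("real") figures quoted in the comment above.
pytorch_real = parse_time("0m16.065s")
tensorrt_real = parse_time("0m7.773s")

speedup = pytorch_real / tensorrt_real
print(f"TensorRT wall-clock speedup: {speedup:.2f}x")
```

On these figures the TensorRT run takes roughly half the wall-clock time of the PyTorch run.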

@warmshao
Author

An install-free, extract-and-play Windows package with TensorRT support is now available! Please refer to the FasterLivePortrait releases. Really fast and very convenient!!!

@falconwingz88

Will this work with a video target?

@warmshao
Author

warmshao commented Aug 8, 2024

> will this work to a video target ?

yes
