Replies: 2 comments 15 replies
-
Building everything from source takes super long, so I think I am not doing this right.
-
Hi! To efficiently build and deploy your customized Triton server with the TensorRT-LLM backend and a Llama 3 model, here's a streamlined approach:
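For illustration, a minimal sketch of the compose step being discussed, assuming the stock `compose.py` from the triton-inference-server/server repo, that it can pull in the tensorrtllm backend this way, and with the `24.04` release tag as a placeholder. Note that `compose.py` layers published container images; it does not compile local source changes, so a customized server binary would need to be built into an image first.

```bash
# Placeholder release tag -- use the branch matching your target container.
git clone -b r24.04 https://github.com/triton-inference-server/server.git
cd server

# Assemble a custom image: take the trtllm image as the "min" base and
# pull in only the backend you need. Adding --dry-run emits the generated
# Dockerfile.compose for inspection before anything is built.
python3 compose.py \
    --backend tensorrtllm \
    --container-version 24.04 \
    --image min,nvcr.io/nvidia/tritonserver:24.04-trtllm-python-py3 \
    --output-name tritonserver-custom
```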
Let me know if you need further details on any specific step!
-
Hi folks,
Happy Friday. I am currently customising Triton server to add some meta information to response headers. My changes are only in the server, so I only want to rebuild the server itself, then deploy my custom tritonserver with the TensorRT-LLM backend and a Llama 3 model. What's the best, most efficient way to do this? Should I make my changes and then run compose with the min image of trtllm-python-py3? And how do I then use the image built by compose to run my inference?
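For context, a minimal sketch of what deploying the composed image and sending a request could look like, assuming a TensorRT-LLM engine and model repository have already been built for Llama 3; the repository path, the image name `tritonserver-custom`, and the `ensemble` model name are placeholders:

```bash
# Serve the model repository from the image produced by compose.py.
docker run --rm --gpus all \
    -p 8000:8000 -p 8001:8001 -p 8002:8002 \
    -v /path/to/llama3_model_repo:/models \
    tritonserver-custom \
    tritonserver --model-repository=/models

# Query it through Triton's HTTP generate endpoint.
curl -X POST localhost:8000/v2/models/ensemble/generate \
    -d '{"text_input": "What is Triton Inference Server?", "max_tokens": 64}'
```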