From e4a711ce234d156cb595d389a285fe49852bb095 Mon Sep 17 00:00:00 2001
From: krishung5
Date: Wed, 7 Aug 2024 10:59:49 -0700
Subject: [PATCH 1/3] Add TRT-LLM backend to the doc

---
 README.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/README.md b/README.md
index b57d44e..2b3c4e9 100644
--- a/README.md
+++ b/README.md
@@ -123,6 +123,14 @@ to load and serve models. The
 [vllm_backend](https://github.com/triton-inference-server/vllm_backend)
 repo contains the documentation and source for the backend.
 
+**TensorRT-LLM**: The TensorRT-LLM backend allows you to serve
+[TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) models with Triton Server.
+Check out the
+[Triton TRT-LLM user guide](https://github.com/triton-inference-server/server/blob/main/docs/getting_started/trtllm_user_guide.md)
+for more information. The
+[tensorrtllm_backend](https://github.com/triton-inference-server/tensorrtllm_backend)
+repo contains the documentation and source for the backend.
+
 **Important Note!** Not all the above backends are supported on every
 platform supported by Triton. Look at the
 [Backend-Platform Support Matrix](docs/backend_platform_support_matrix.md)

From 1d36963cfe8204a4186a56fc99f462d078fa30a6 Mon Sep 17 00:00:00 2001
From: krishung5
Date: Wed, 7 Aug 2024 17:13:20 -0700
Subject: [PATCH 2/3] Add TRT-LLM backend to platform support matrix

---
 docs/backend_platform_support_matrix.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/backend_platform_support_matrix.md b/docs/backend_platform_support_matrix.md
index d00a73c..64522e6 100644
--- a/docs/backend_platform_support_matrix.md
+++ b/docs/backend_platform_support_matrix.md
@@ -1,5 +1,5 @@