Support for multimodal model #344

Open
spoonbobo opened this issue Feb 21, 2024 · 3 comments
Labels: triaged (Issue has been triaged by maintainers)

Comments

@spoonbobo

Does tensorrtllm_backend support multimodal LLMs such as LLaVA, like those listed in https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/multimodal?

@symphonylyh
Collaborator

Hi @spoonbobo, not yet. We're currently working on a general backend for structures like encoder-decoder and multimodal models. Encoder-decoder work is in progress, and multimodal follows it. The progress is tracked in NVIDIA/TensorRT-LLM#800.

Meanwhile, if you're referring to a Triton Python backend, do you think it's OK for users to implement a multimodal workflow based on the gpt example?
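
For illustration, such a workflow could look roughly like the sketch below: a Python-backend model that uses BLS (business logic scripting) to call a vision encoder first and then the LLM. Everything here is a hypothetical sketch, not a supported interface: the vision_encoder and tensorrt_llm model names and the tensor names (IMAGE, PROMPT, VISUAL_FEATURES, PROMPT_EMBEDDING_TABLE, OUTPUT_TEXT) are placeholders that would have to match the config.pbtxt files of the models actually deployed.

```python
# model.py -- hypothetical Triton Python backend that chains a vision encoder
# and a TensorRT-LLM model via BLS. Model and tensor names are illustrative.
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # Inputs assumed to be declared in this model's config.pbtxt.
            image = pb_utils.get_input_tensor_by_name(request, "IMAGE")
            prompt = pb_utils.get_input_tensor_by_name(request, "PROMPT")

            # 1) Encode the image with a hypothetical "vision_encoder" model.
            enc_request = pb_utils.InferenceRequest(
                model_name="vision_encoder",
                requested_output_names=["VISUAL_FEATURES"],
                inputs=[image],
            )
            enc_response = enc_request.exec()
            if enc_response.has_error():
                raise pb_utils.TritonModelException(
                    enc_response.error().message())
            features = pb_utils.get_output_tensor_by_name(
                enc_response, "VISUAL_FEATURES")

            # 2) Pass the visual features plus the text prompt to the LLM.
            #    "PROMPT_EMBEDDING_TABLE" is a placeholder; a real workflow
            #    must match the tensorrt_llm model's config.pbtxt.
            llm_request = pb_utils.InferenceRequest(
                model_name="tensorrt_llm",
                requested_output_names=["OUTPUT_TEXT"],
                inputs=[
                    prompt,
                    pb_utils.Tensor("PROMPT_EMBEDDING_TABLE",
                                    features.as_numpy()),
                ],
            )
            llm_response = llm_request.exec()
            if llm_response.has_error():
                raise pb_utils.TritonModelException(
                    llm_response.error().message())

            output = pb_utils.get_output_tensor_by_name(
                llm_response, "OUTPUT_TEXT")
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[output]))
        return responses
```

The open question with this approach is whether the LLM model from the gpt example can consume precomputed visual embeddings, which is part of what the work tracked in NVIDIA/TensorRT-LLM#800 aims to address.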

@spoonbobo
Author

Hi @symphonylyh, I appreciate the effort you've put into providing general encoder-decoder support. I haven't tried implementing a workflow based on this example yet, but I think it's definitely worth a try.

@byshiue added the triaged label on Feb 27, 2024
@FernandoDorado

I'm also very interested in these capabilities and looking forward to trying the provided examples. Is there a reference tutorial or guide?
