-
Notifications
You must be signed in to change notification settings - Fork 477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Request for Docker images of tensorrt_llm #605
Comments
HI there! |
@nmq45698 still coming soon but getting closer! for now I would just run SmoothQuant inference through PyTorch like their repo shows, or use AWQ TinyChat, or MLC/TVM. |
I've updated to the last version of tensort_llm https://github.com/NVIDIA/TensorRT-LLM/releases/tag/v0.12.0 |
Thanks for your remind! :P While for Jetson AGX Orin, I've tried under some docker images pulled from @dusty-nv dustynv/... r35.x.x, which seems not supportive for TensorRT 10.3 due to the GLIBC under Ubuntu 20.04. The kINT64, kBF16 in TensorRT-LLM could not be compiled. Then, I tried on docker images pulled from dustynv/... r36.x.x (say nano_llm) based on Ubuntu 22.04, the TensorRT 10.3 was successfully installed. Have you ever encountered that upon building TensorRT-LLM on Orin? ^v^ |
Do you use jetpack 5? |
@johnnynunez Does this release support Jetson Orin or AGX? The official documentation page still states that there's only support for x86_64 architecture at the moment, and nothing about aarch64. |
https://www.jetson-ai-lab.com/tensorrt_llm.html @dusty-nv in the first example, change the name to dustynv/tensorrt_llm:0.12-r36.4.0 |
Ahh thanks @johnnynunez, just fixed that in NVIDIA-AI-IOT/jetson-generative-ai-playground#229 @Hrishikeshh-nd the Orin support is in this TRT-LLM branch: https://github.com/NVIDIA/TensorRT-LLM/tree/v0.12.0-jetson |
hi @dusty-nv, @johnnynunez, If I am not mistaken, we also need to add --tokenizer path to the openai_server.py command. See this link: NVIDIA/TensorRT-LLM#2357 |
Hi @dusty-nv, I successfully run the openai_server.py with endpoint exposed, but can not send the request, always shows " |
Thank you @dusty-nv @johnnynunez ! |
No description provided.
The text was updated successfully, but these errors were encountered: