Information
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Run git clone https://github.com/NVIDIA/TensorRT-LLM.git
Create Dockerfile and docker-compose.yaml in TensorRT-LLM/
Dockerfile
# Obtain and start the basic docker image environment.
FROM nvidia/cuda:12.1.0-devel-ubuntu22.04
# Install dependencies; TensorRT-LLM requires Python 3.10.
RUN apt-get update && apt-get -y install \
python3.10 \
python3-pip \
openmpi-bin \
libopenmpi-dev
# Install the latest preview version (corresponding to the main branch) of TensorRT-LLM.
# If you want to install the stable version (corresponding to the release branch), please
# remove the `--pre` option.
RUN --mount=type=cache,target=/root/.cache/pip pip3 install tensorrt_llm -U --pre --extra-index-url https://pypi.nvidia.com
COPY ./examples/qwen/requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip pip3 install -r requirements.txt
WORKDIR /workdir
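The docker-compose.yaml contents were not captured in this report. A minimal sketch of what a matching compose file might look like, assuming the image is built from the Dockerfile above and the model directory is mounted into the container (the service name, mount target, and idle command are assumptions, not taken from the report):

```yaml
# Hypothetical docker-compose.yaml; all names and paths are illustrative.
services:
  tensorrt-llm:
    build: .
    command: sleep infinity   # keep the container alive for interactive use
    volumes:
      - /mnt/models/Large_Language_Model:/workdir/models
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```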
Run git clone https://huggingface.co/Qwen/Qwen-7B-Chat in /mnt/models/Large_Language_Model
Run docker compose up
Expected behavior
No error
actual behavior
[04/16/2024-22:50:23] [TRT-LLM] [I] NVLink is active: True
[04/16/2024-22:50:23] [TRT-LLM] [I] NVLink version: 6
Traceback (most recent call last):
File "/usr/local/bin/trtllm-build", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/commands/build.py", line 411, in main
cluster_config = infer_cluster_config()
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/auto_parallel/cluster_info.py", line 523, in infer_cluster_config
cluster_info=infer_cluster_info(),
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/auto_parallel/cluster_info.py", line 487, in infer_cluster_info
nvl_bw = nvlink_bandwidth(nvl_version)
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/auto_parallel/cluster_info.py", line 433, in nvlink_bandwidth
return nvl_bw_table[nvlink_version]
KeyError: 6
System Info
GPU: NVIDIA RTX A6000
Who can help?
@Tracin
additional notes
Relevant code: https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/auto_parallel/cluster_info.py#L427-L433
Can't seem to find info about NVLink version 6's bandwidth online.
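One way the crash could be avoided is a fallback in the bandwidth lookup instead of a bare dictionary access. A minimal sketch of that idea; the table values below are illustrative placeholders, not the actual numbers in cluster_info.py:

```python
# Illustrative stand-in for nvl_bw_table in cluster_info.py; the real
# per-generation bandwidth values live in that file and are not shown here.
nvl_bw_table = {
    1: 80,
    2: 150,
    3: 300,
    4: 450,  # highest generation this table knows about
}

def nvlink_bandwidth(nvlink_version: int) -> int:
    """Return the bandwidth for an NVLink generation, falling back to the
    highest known generation instead of raising KeyError for unlisted
    versions (e.g. the version 6 reported by NVML in this issue)."""
    if nvlink_version in nvl_bw_table:
        return nvl_bw_table[nvlink_version]
    return nvl_bw_table[max(nvl_bw_table)]
```

With this fallback, `nvlink_bandwidth(6)` would return the version-4 entry rather than crashing `trtllm-build`.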