How to deploy multiple models on a node with multiple GPUs #165

jjjjohnson · 2023-09-14T06:18:23Z

Description

Suppose I have 5 GPT models, each with TP=2 (tensor parallelism across 2 GPUs), and I want to deploy them on a machine with 8 GPUs. Is this possible? If so, how can I control the GPU allocation? I tried setting CUDA_VISIBLE_DEVICES when launching the Triton server, but it does not work.
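
To make the arithmetic concrete: 5 models at TP=2 need 10 GPU slots, but the node has only 8 GPUs, so at least one GPU pair would have to host two models. The sketch below shows one possible layout, a round-robin over GPU pairs. The function name `assign_gpu_groups` and the `model_N` labels are hypothetical, and the snippet only computes a mapping; it assumes nothing about whether the Triton backend honours it.

```python
from itertools import cycle


def assign_gpu_groups(num_models, tp_size, num_gpus):
    """Assign each model a group of `tp_size` GPUs, reusing groups round-robin.

    Hypothetical helper for illustration only; not part of Triton.
    """
    if num_gpus % tp_size != 0:
        raise ValueError("num_gpus must be a multiple of tp_size")
    # Split the GPUs into contiguous groups: (0, 1), (2, 3), ...
    groups = [
        tuple(range(start, start + tp_size))
        for start in range(0, num_gpus, tp_size)
    ]
    # Cycle through the groups so extra models share already-used GPUs.
    group_iter = cycle(groups)
    return {f"model_{i}": next(group_iter) for i in range(num_models)}


plan = assign_gpu_groups(num_models=5, tp_size=2, num_gpus=8)
for name, gpus in plan.items():
    visible = ",".join(str(g) for g in gpus)
    print(f"{name}: CUDA_VISIBLE_DEVICES={visible}")
```

With these numbers the fifth model lands on GPUs 0 and 1 alongside the first one, so those two GPUs would need enough memory for both models.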

Reproduced Steps

Tried setting CUDA_VISIBLE_DEVICES when launching the Triton server; it did not work.

jjjjohnson added the bug label Sep 14, 2023
