This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

Commit
[bugfix][distributed] fix 16 gpus local rank arrangement (vllm-projec…
youkaichao authored and robertgshaw2-neuralmagic committed Jun 23, 2024
1 parent a212392 commit bc2be04
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions vllm/executor/ray_gpu_executor.py
@@ -137,6 +137,12 @@ def _init_workers_ray(self, placement_group: "PlacementGroup",

         for i, (node_id, gpu_ids) in enumerate(worker_node_and_gpu_ids):
             node_workers[node_id].append(i)
+            # `gpu_ids` can be a list of strings or integers.
+            # convert them to integers for consistency.
+            # NOTE: gpu_ids can be larger than 9 (e.g. 16 GPUs),
+            # string sorting is not sufficient.
+            # see https://github.com/vllm-project/vllm/issues/5590
+            gpu_ids = [int(x) for x in gpu_ids]
             node_gpus[node_id].extend(gpu_ids)
         for node_id, gpu_ids in node_gpus.items():
             node_gpus[node_id] = sorted(gpu_ids)
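
The effect of the added `int(x)` conversion can be reproduced outside vLLM. The following is a minimal standalone sketch, not code from the repository: the `arrange` helper, the `convert` flag, and the `"node-a"` input are hypothetical, and it assumes the GPU IDs arrive as strings on a single 16-GPU node. It mirrors the grouping-then-sorting loop from the hunk above to show why sorting strings misorders IDs above 9.

```python
from collections import defaultdict

# Hypothetical input: one node reporting 16 GPUs, with IDs given as strings.
worker_node_and_gpu_ids = [("node-a", [str(i)]) for i in range(16)]


def arrange(pairs, convert):
    """Group GPU IDs per node and sort them, optionally casting to int first."""
    node_gpus = defaultdict(list)
    for node_id, gpu_ids in pairs:
        if convert:
            # Same conversion as the patched line in the diff.
            gpu_ids = [int(x) for x in gpu_ids]
        node_gpus[node_id].extend(gpu_ids)
    return {node_id: sorted(gpu_ids) for node_id, gpu_ids in node_gpus.items()}


before = arrange(worker_node_and_gpu_ids, convert=False)
after = arrange(worker_node_and_gpu_ids, convert=True)

# Lexicographic order places '10' through '15' ahead of '2'.
print(before["node-a"][:4])  # ['0', '1', '10', '11']
# Numeric order is the expected 0..15.
print(after["node-a"][:4])   # [0, 1, 2, 3]
```

With fewer than 11 GPUs per node, both variants give the same order, which is presumably why the problem only appeared on 16-GPU machines.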