Your current environment
🐛 Describe the bug
I dug into the implementation of ray_gpu_executor.py and found this code: https://github.com/vllm-project/vllm/blob/ee3eea0a1b2c690557455d97074d8829d5a98320/vllm/executor/ray_gpu_executor.py#L112-123
It seems it creates parallel_config.world_size workers for model parallelism, but if a worker's IP equals the driver's IP, that worker is assigned to driver_dummy_worker and is never appended to the normal worker list. The driver_dummy_worker only acts as the driver, not as a worker, which means one worker is never invoked in model parallelism. I guess that is not expected.

Can anyone clarify? I may be missing something. Thanks!
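For illustration, here is a minimal, self-contained sketch of the assignment behavior described above. It is a paraphrase, not the code from ray_gpu_executor.py: the helper name assign_workers is hypothetical, and plain IP strings stand in for the Ray actor handles the real executor creates.

```python
# Sketch of the worker-assignment behavior described above.
# Paraphrase for illustration only, not the actual vLLM code:
# real workers are Ray actors; plain IP strings stand in for them here,
# and assign_workers is a hypothetical helper.

def assign_workers(worker_ips: list[str], driver_ip: str):
    """Split the world_size workers into a driver-side placeholder and remote workers."""
    driver_dummy_worker = None  # the worker co-located with the driver
    workers = []                # workers appended to the normal worker list

    for ip in worker_ips:
        if ip == driver_ip and driver_dummy_worker is None:
            # The first worker whose IP matches the driver's IP becomes
            # driver_dummy_worker and is NOT appended to `workers`.
            driver_dummy_worker = ip
        else:
            workers.append(ip)

    return driver_dummy_worker, workers


if __name__ == "__main__":
    # With world_size == 4 and one worker on the driver's node, only three
    # workers end up in the normal worker list; this is the asymmetry the
    # question above asks about.
    dummy, workers = assign_workers(
        ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"],
        driver_ip="10.0.0.1",
    )
    print(dummy)    # 10.0.0.1
    print(workers)  # ['10.0.0.2', '10.0.0.3', '10.0.0.4']
```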
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!