Commit 8598890

[XPU] Use spawn with XPU multiprocessing
It's required to use the `spawn` start method when running the XPU backend with multiprocessing. There are two places in vllm where this needs to be fixed:

* One in `vllm/utils`
* Another in `tests/utils`

The fix in the tests adjusts the `create_new_process_for_each_test` decorator, which further needs to be applied to the actual tests. Some tests are already marked with it due to work done for ROCm; in other cases it might still be missing, or `fork_new_process_for_each_test` is used instead.

This commit unlocks running a number of tests on XPU and allows looking into actual runtime issues. The commit's behavior can be tried on these tests:

* `tests/v1/engine/test_llm_engine.py::test_engine_metrics`
* `tests/v1/e2e/test_cascade_attention.py`

Error happening before the fix:

```
RuntimeError: Cannot re-initialize XPU in forked subprocess. To use XPU with multiprocessing, you must use the 'spawn' start method
```

Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
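The error above comes from inheriting an already-initialized accelerator runtime in a `fork()`ed child. The fix relies on creating each test's process with an explicit start method. As a rough, hypothetical sketch of that pattern in plain Python (not the actual vllm decorator, which lives in `tests/utils.py`):

```python
import multiprocessing


def run_in_new_process(method: str = "spawn"):
    """Decorator factory: run the wrapped function in a fresh process.

    A simplified, hypothetical sketch of the pattern this commit relies on.
    """
    assert method in ("spawn", "fork"), "Method must be either 'spawn' or 'fork'"

    def decorator(fn):
        def wrapper(*args, **kwargs):
            # get_context() picks the start method per call site without
            # touching the global multiprocessing default.
            ctx = multiprocessing.get_context(method)
            proc = ctx.Process(target=fn, args=args, kwargs=kwargs)
            proc.start()
            proc.join()
            # Surface child failures in the parent, as a test runner would.
            assert proc.exitcode == 0, f"child exited with {proc.exitcode}"
        return wrapper

    return decorator
```

With `method="spawn"` the child starts from a fresh interpreter, so an accelerator runtime initialized in the parent is never inherited; with `"fork"` the child shares the parent's (possibly already-initialized) XPU/CUDA state, which is what triggers the `RuntimeError` above.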
1 parent d8ee5a2 commit 8598890

File tree

3 files changed: +15 additions, −5 deletions


tests/utils.py

Lines changed: 4 additions & 3 deletions

```diff
@@ -818,14 +818,15 @@ def create_new_process_for_each_test(
 
     Args:
         method: The process creation method. Can be either "spawn" or "fork".
-            If not specified,
-            it defaults to "spawn" on ROCm platforms and "fork" otherwise.
+            If not specified, it defaults to "spawn" on ROCm and XPU
+            platforms and "fork" otherwise.
 
     Returns:
         A decorator to run test functions in separate processes.
     """
     if method is None:
-        method = "spawn" if current_platform.is_rocm() else "fork"
+        use_spawn = current_platform.is_rocm() or current_platform.is_xpu()
+        method = "spawn" if use_spawn else "fork"
 
     assert method in ["spawn",
                       "fork"], "Method must be either 'spawn' or 'fork'"
```

tests/v1/e2e/test_cascade_attention.py

Lines changed: 2 additions & 2 deletions

```diff
@@ -5,10 +5,10 @@
 
 from vllm import LLM, SamplingParams
 
-from ...utils import fork_new_process_for_each_test
+from ...utils import create_new_process_for_each_test
 
 
-@fork_new_process_for_each_test
+@create_new_process_for_each_test()
 @pytest.mark.parametrize("attn_backend",
                          ["FLASH_ATTN_VLLM_V1", "FLASHINFER_VLLM_V1"])
 def test_cascade_attention(example_system_message, monkeypatch, attn_backend):
```

vllm/utils/__init__.py

Lines changed: 9 additions & 0 deletions

```diff
@@ -1535,6 +1535,13 @@ def cuda_is_initialized() -> bool:
     return torch.cuda.is_initialized()
 
 
+def xpu_is_initialized() -> bool:
+    """Check if XPU is initialized."""
+    if not torch.xpu._is_compiled():
+        return False
+    return torch.xpu.is_initialized()
+
+
 def cuda_get_device_properties(device,
                                names: Sequence[str],
                                init_cuda=False) -> tuple[Any, ...]:
@@ -2848,6 +2855,8 @@ def _maybe_force_spawn():
     reason = None
     if cuda_is_initialized():
         reason = "CUDA is initialized"
+    elif xpu_is_initialized():
+        reason = "XPU is initialized"
     elif is_in_ray_actor():
         # even if we choose to spawn, we need to pass the ray address
         # to the subprocess so that it knows how to connect to the ray cluster.
```
