[XPU][CI] enhance xpu test support #20652
Conversation
Co-authored-by: chaojun-zhang <chzhang@habana.ai> Co-authored-by: zufangzhu <zufang.zhu@intel.com> Co-authored-by: zhenwei-intel <zhenweiliu@habana.ai> Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀
Summary of Changes
Hello @Liangliang-Ma, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances vLLM's support for Intel XPU devices, primarily by enabling and stabilizing the execution of vLLM v1 tests on the XPU backend. It addresses several compatibility challenges, ensuring that the testing infrastructure and core components correctly interact with XPU hardware and its specific requirements for multiprocessing, memory management, and distributed operations.
Highlights
- XPU Test Compatibility & Stability: The core objective of this PR is to enable and improve the reliability of vLLM v1 tests on Intel XPU devices, moving towards full test coverage.
- XPU Multiprocessing Method: Modified test utilities to explicitly disable the `fork` multiprocessing method for XPU, requiring the `spawn` method for compatibility with XPU's multiprocessing requirements.
- XPU-Specific Block Size: Adjusted the default `block_size` in test configurations to `64` for XPU devices, optimizing memory usage for this platform.
- Distributed Communication for XPU: Added a `broadcast` operation to the `XPUCommunicator` to facilitate necessary collective communication in distributed XPU environments (a rough sketch of such a collective follows after this list).
- XPU Device and Configuration Handling: Ensured that distributed environments correctly identify and set XPU devices (`xpu:{local_rank}`) and introduced temporary measures to force eager execution mode for XPU stability, while also refining checks for distributed executor backends.
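As a rough illustration of the broadcast collective mentioned above: this is a hedged sketch, not the code added by this PR; `XPUCommunicatorSketch` and `device_group` are illustrative names, and it assumes the communicator wraps a `torch.distributed` process group.

```python
import torch
import torch.distributed as dist


class XPUCommunicatorSketch:
    """Illustrative communicator wrapper; not vLLM's actual XPUCommunicator."""

    def __init__(self, group: dist.ProcessGroup):
        # Process group over which collectives run (assumed to be backed by
        # an XPU-capable distributed backend).
        self.device_group = group

    def broadcast(self, tensor: torch.Tensor, src: int = 0) -> torch.Tensor:
        # Broadcast the tensor in place from rank `src` to every rank in the
        # group, then return it for convenience.
        dist.broadcast(tensor, src=src, group=self.device_group)
        return tensor
```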
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
This pull request has merge conflicts that must be resolved before it can be merged.
Code Review
This pull request enhances test support for the XPU (Intel GPU) backend by introducing several compatibility and stability improvements. The changes include disabling `fork` for XPU, using a default block size of 64 on XPU, and adding XPU device settings and `broadcast` functionality. The review focuses on improving code robustness and readability.
tests/utils.py
Outdated
```python
# To use XPU with multiprocessing, must use the 'spawn' start method via 'VLLM_WORKER_MULTIPROC_METHOD=spawn'
if current_platform.is_xpu():
    f(*args, **kwargs)
    return
```
To use XPU with multiprocessing, the start method must be 'spawn'. If current_platform.is_xpu() is true, then the function f should be executed using the 'spawn' start method. Otherwise, the XPU device may not be initialized correctly in the new process, leading to runtime errors.
Suggested change:
```python
# To use XPU with multiprocessing, must use the 'spawn' start method via 'VLLM_WORKER_MULTIPROC_METHOD=spawn'
if current_platform.is_xpu():
    import torch.multiprocessing as mp
    if mp.get_start_method() != 'spawn':
        mp.set_start_method('spawn')
    f(*args, **kwargs)
    return
```
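One note on this design choice: `set_start_method` mutates global state and raises `RuntimeError` if a start method was already chosen elsewhere, so tests often prefer an explicit spawn context instead. A minimal sketch of that alternative (my illustration, not code from this PR or the suggestion):

```python
import multiprocessing as mp


def run_in_spawned_process(f, *args, **kwargs) -> None:
    # Run `f` in a child process created with the 'spawn' start method,
    # leaving the parent's global start method untouched. `f` must be a
    # picklable, module-level callable for spawn to work.
    ctx = mp.get_context("spawn")
    proc = ctx.Process(target=f, args=args, kwargs=kwargs)
    proc.start()
    proc.join()
    if proc.exitcode != 0:
        raise RuntimeError(f"spawned process exited with code {proc.exitcode}")
```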
tests/conftest.py
Outdated
```diff
     disable_log_stats: bool = True,
     tensor_parallel_size: int = 1,
-    block_size: int = 16,
+    block_size: int = 16 if not hasattr(torch, 'xpu') else 64,
```
For improved robustness, it's better to check for XPU availability using `torch.xpu.is_available()` in addition to `hasattr(torch, 'xpu')`. This ensures that the XPU device is not only supported but also ready for use.
Suggested change:
```diff
-    block_size: int = 16 if not hasattr(torch, 'xpu') else 64,
+    block_size: int = 16 if not hasattr(torch, 'xpu') or not torch.xpu.is_available() else 64,
```
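If the availability check grows any further, it could also be factored into a small helper; `_default_block_size` below is a hypothetical name, not something in the repository.

```python
import torch


def _default_block_size() -> int:
    # Hypothetical helper: use 64 only when an XPU device is both present in
    # this torch build and actually usable; otherwise keep the usual 16.
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return 64
    return 16
```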
```python
elif current_platform.is_xpu():
    self.device = torch.device(f"xpu:{local_rank}")
```
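For context, once the per-rank device string is chosen, a worker typically also binds the process to that device before allocating tensors. A hedged sketch (assuming a PyTorch build that ships the `torch.xpu` module; not the PR's exact code):

```python
import torch


def bind_local_xpu(local_rank: int) -> torch.device:
    # Select the XPU matching this worker's local rank and make it the
    # current device, mirroring the torch.cuda.set_device pattern.
    device = torch.device(f"xpu:{local_rank}")
    torch.xpu.set_device(device)
    return device
```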
vllm/platforms/xpu.py
Outdated
```python
# FIXME: Temporarily forcing eager mode
# remove after t.compile support stabilizes.
if envs.VLLM_USE_V1 and vllm_config.model_config is not None and \
        not vllm_config.model_config.enforce_eager:
    from vllm.config import CompilationLevel
    vllm_config.compilation_config.level = \
        CompilationLevel.NO_COMPILATION
```
Consider wrapping the long conditional statement in parentheses for better readability and adherence to PEP 8 style guidelines.
Suggested change:
```python
# FIXME: Temporarily forcing eager mode
# remove after t.compile support stabilizes.
if (envs.VLLM_USE_V1 and vllm_config.model_config is not None and
        not vllm_config.model_config.enforce_eager):
    from vllm.config import CompilationLevel
    vllm_config.compilation_config.level = CompilationLevel.NO_COMPILATION
```
vllm/platforms/xpu.py
Outdated
```diff
     elif parallel_config.distributed_executor_backend != "ray" and \
-            parallel_config.distributed_executor_backend != "uni":
+            parallel_config.distributed_executor_backend != "uni" and \
+            parallel_config.distributed_executor_backend != "external_launcher":
```
Consider wrapping the long conditional statement in parentheses for better readability and adherence to PEP 8 style guidelines.
Suggested change:
```python
elif (parallel_config.distributed_executor_backend != "ray" and
      parallel_config.distributed_executor_backend != "uni" and
      parallel_config.distributed_executor_backend != "external_launcher"):
```
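A further readability option (my own note, not part of the PR or this review) is to collapse the chained inequality checks into a membership test:

```python
# Hedged sketch: the same condition expressed as a membership test.
_KNOWN_BACKENDS = ("ray", "uni", "external_launcher")


def needs_backend_fallback(distributed_executor_backend: str) -> bool:
    # True when the configured backend is none of the recognized ones.
    return distributed_executor_backend not in _KNOWN_BACKENDS
```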
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>
Removed the `fork` decorator modification. We would like to use this one instead: #20649
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>
DarkLight1337
left a comment
Can you merge in the latest changes and see if the test still passes?
This pull request has merge conflicts that must be resolved before it can be merged.
Merged.
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>
Please fix pre-commit.
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>
dvrogozh
left a comment
Which tests are being fixed by these changes? Can this information be added to the PR description, please? If these affect many, I guess 1 test for each change could be specified.
```diff
     def _init_device_properties(self) -> None:
-        pass
+        self.num_sms = None
```
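For contrast, a hedged sketch (not the repository's exact code): on CUDA backends the SM count is typically read from device properties, whereas the XPU runner above simply records `None`.

```python
import torch


def query_num_sms(device_index: int = 0):
    # On CUDA, the number of streaming multiprocessors is available from the
    # device properties; return None when CUDA is not present.
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(device_index)
        return props.multi_processor_count
    return None
```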
On top of this change, may I suggest an improvement: move these customizations to gpu_model_runner.py, as they are just backend-specific dispatch logic that can be handled more easily without class inheritance? See the proposal here:
Actually, these functions were originally in gpu_model_runner.py and were moved to the per-device model runners for cleaner readability, so I think we could follow this design.
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com> Co-authored-by: zhenwei-intel <zhenweiliu@habana.ai>
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com> Co-authored-by: zhenwei-intel <zhenweiliu@habana.ai>
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com> Co-authored-by: zhenwei-intel <zhenweiliu@habana.ai> Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com> Co-authored-by: zhenwei-intel <zhenweiliu@habana.ai>
This PR introduces a set of changes to enable running vLLM's tests on the XPU (Intel GPU) backend.
To achieve this, we've made various modifications across several files to improve compatibility and stability with XPU. These include adjustments to the multiprocessing start method used in tests, the default test block size, distributed device setup and broadcast support, and XPU-specific configuration handling.
This is part of an ongoing effort — our goal is to ensure that all vLLM v1 tests can eventually pass on XPU. We will continue refining and expanding test support in future PRs until full test coverage is achieved.
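As a rough picture of the kind of v1 check this effort targets, here is a hedged, illustrative smoke test. It is not one of the repository's actual tests; the model name and token count are arbitrary, and it assumes an XPU-enabled build.

```python
import os

import pytest
import torch

from vllm import LLM, SamplingParams

# Per the comment in tests/utils.py, XPU multiprocessing needs 'spawn'.
os.environ.setdefault("VLLM_WORKER_MULTIPROC_METHOD", "spawn")


@pytest.mark.skipif(
    not (hasattr(torch, "xpu") and torch.xpu.is_available()),
    reason="requires an XPU device",
)
def test_xpu_smoke():
    # block_size=64 matches the XPU default this PR introduces for tests; the
    # XPU platform hook in this PR forces eager mode, so nothing else is set.
    llm = LLM(model="facebook/opt-125m", block_size=64)
    outputs = llm.generate(["Hello"], SamplingParams(max_tokens=8))
    assert len(outputs) == 1
```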