[Bug]: RISC-V and non-Intel CPU architectures fail due to widespread unconditional IPEX dependencies #25737

@ihb2032

Your current environment

  • Hardware: sg2044
  • OS: LEulixOS 3.0
  • Python: 3.11
  • PyTorch: 2.8.0
  • GCC: 15.1
  • vLLM: commit 13dd93c

🐛 Describe the bug

vLLM fails to run on RISC-V because the CPU backend imports Intel Extension for PyTorch (IPEX) unconditionally in multiple places, even though IPEX is specific to Intel x86 and is not available on RISC-V.

export VLLM_ENABLE_V1_MULTIPROCESSING=0 VLLM_COMPILATION_LEVEL=0
vllm bench throughput \
--model Qwen/Qwen1.5-0.5B \
--input-len 128 \
--output-len 128 \
--enforce-eager --dtype float16 --max_model_len 4096 --max_num_batched_tokens 4096
Error message:

[rank0]: File "/AI/hebo/vllm/vllm/v1/attention/backends/cpu_attn.py", line 595, in forward
[rank0]: import intel_extension_for_pytorch.llm.modules as ipex_modules
[rank0]: ModuleNotFoundError: No module named 'intel_extension_for_pytorch'

Root cause analysis:
1. Architecture detection works correctly: RISC-V is now properly detected following the recent addition of CpuArchEnum.RISCV.
2. The IPEX availability check exists but is inconsistently applied: cpu_attn.py checks IPEX availability at module level and sets _use_ipex = False when IPEX is not available, but the import inside forward (line 595 in the traceback above) runs regardless of that flag.
3. Platform configuration assumes IPEX: the CPU platform configuration checks IPEX availability but does not prevent IPEX-dependent code paths from executing.
Expected behavior:
The CPU backend should use IPEX features only when both conditions hold:
  - IPEX is available (_use_ipex = True)
  - The CPU architecture supports it (primarily x86)
On all other architectures it should fall back gracefully to non-IPEX implementations.
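One way to express the expected gating is a single predicate that combines both conditions. This is only a sketch under assumptions: `should_use_ipex`, `ipex_installed`, and `X86_MACHINES` are hypothetical names, and it uses `platform.machine()` rather than vLLM's own architecture detection.

```python
import importlib.util
import platform
from typing import Optional

# Assumption: machine strings that should count as x86 for this check.
X86_MACHINES = {"x86_64", "amd64", "i386", "i686"}


def ipex_installed() -> bool:
    # True only if the IPEX package can be located, without importing it.
    return importlib.util.find_spec("intel_extension_for_pytorch") is not None


def should_use_ipex(machine: Optional[str] = None,
                    installed: Optional[bool] = None) -> bool:
    # Both conditions must hold: package present AND x86 architecture.
    machine = (machine or platform.machine()).lower()
    if installed is None:
        installed = ipex_installed()
    return bool(installed) and machine in X86_MACHINES
```

With this shape, `should_use_ipex("riscv64", True)` is `False` even if an IPEX package were somehow importable, so callers on non-x86 machines always take the fallback path.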
Workarounds attempted, all of which failed:

Each of these options still triggers IPEX imports:

--enable-chunked-prefill=False
--quantization=None
--enforce-eager

Impact:
This affects all non-Intel CPU architectures where IPEX is not available, including:
  - RISC-V architectures
  - Some ARM implementations without IPEX support
  - Other emerging CPU architectures
Suggested fixes:
1. Conditional IPEX imports: Wrap all IPEX imports with availability checks
2. Architecture-aware fallbacks: Implement non-IPEX code paths for non-x86 architectures
3. Platform-specific configuration: Disable IPEX-dependent features automatically on unsupported architectures
4. Consistent availability checking: Ensure the _use_ipex flag is respected throughout the codebase
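Fixes 1 and 2 could be combined into a small selection helper that tries the optional package and otherwise returns a portable implementation. The sketch below is illustrative only: `optional_import`, `select_kernel`, and `reference_scale` are hypothetical names and do not correspond to existing vLLM or IPEX APIs.

```python
import importlib


def optional_import(name: str):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return None


def reference_scale(values, factor):
    # Hypothetical portable fallback used when no accelerated kernel exists.
    return [v * factor for v in values]


def select_kernel(module_name: str, attr: str, fallback):
    # Prefer the attribute from the optional module; otherwise use the fallback.
    module = optional_import(module_name)
    kernel = getattr(module, attr, None) if module is not None else None
    return kernel if callable(kernel) else fallback
```

Resolving the kernel once, at initialization time, keeps the import attempt out of the hot path and means a missing package leads to the fallback instead of an exception during `forward`.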
Additional context:
The issue is more widespread than initially thought - even with chunked prefill disabled, IPEX dependencies are triggered through quantization modules, MoE layers, and other CPU backend components. This suggests a systemic issue where the CPU backend assumes Intel architecture and IPEX availability.
