[Bug]: RISC-V and non-Intel CPU architectures fail due to widespread unconditional IPEX dependencies

### Your current environment

- Hardware: sg2044
- OS: LEulixOS 3.0
- Python: 3.11
- PyTorch: 2.8.0
- GCC: 15.1
- vLLM: commit 13dd93c6

### 🐛 Describe the bug

vLLM fails to run on RISC-V because the CPU backend imports Intel Extension for PyTorch (IPEX) unconditionally in multiple places, even though IPEX is specific to Intel x86 and is not available on RISC-V. To reproduce:

```bash
export VLLM_ENABLE_V1_MULTIPROCESSING=0 VLLM_COMPILATION_LEVEL=0
vllm bench throughput \
--model Qwen/Qwen1.5-0.5B \
--input-len 128 \
--output-len 128 \
--enforce-eager --dtype float16 --max_model_len 4096 --max_num_batched_tokens 4096
```

Error message:

```
[rank0]:   File "/AI/hebo/vllm/vllm/v1/attention/backends/cpu_attn.py", line 595, in forward  
[rank0]:     import intel_extension_for_pytorch.llm.modules as ipex_modules  
[rank0]: ModuleNotFoundError: No module named 'intel_extension_for_pytorch'  
```

Root cause analysis:

1. Architecture detection works correctly: RISC-V is now properly detected after the recent addition of `CpuArchEnum.RISCV`.
2. The IPEX availability check exists but is applied inconsistently: `cpu_attn.py` checks IPEX availability at module level and sets `_use_ipex = False` when IPEX is missing, yet `forward` still imports `intel_extension_for_pytorch.llm.modules` directly (see the traceback above).
3. Platform configuration assumes IPEX: the CPU platform configuration checks IPEX availability but does not prevent IPEX-dependent code paths from executing.

Expected behavior:

The CPU backend should use IPEX features only when both of the following hold:

  - IPEX is available (`_use_ipex = True`)
  - The CPU architecture supports it (primarily x86)

On all other architectures it should fall back gracefully to non-IPEX implementations.

Workarounds attempted (all failed):

```
# These configurations still trigger IPEX imports:
--enable-chunked-prefill=False
--quantization=None
--enforce-eager
```

Impact:

This affects all non-Intel CPU architectures where IPEX is not available, including:

  - RISC-V architectures
  - Some ARM implementations without IPEX support
  - Other emerging CPU architectures

Suggested fixes:

1. Conditional IPEX imports: wrap all IPEX imports with availability checks.
2. Architecture-aware fallbacks: implement non-IPEX code paths for non-x86 architectures.
3. Platform-specific configuration: disable IPEX-dependent features automatically on unsupported architectures.
4. Consistent availability checking: ensure the `_use_ipex` flag is respected throughout the codebase.
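
Fixes 1 and 4 might take roughly the following shape. This is a hedged sketch rather than the vLLM implementation: `load_optional`, `pick_backend`, and the `"ipex"` / `"fallback"` labels are hypothetical names, and the real non-IPEX code path would need to be supplied by the CPU backend.

```python
import importlib


def load_optional(module_name):
    """Import a module if present; return None instead of raising."""
    try:
        return importlib.import_module(module_name)
    except ImportError:  # also covers ModuleNotFoundError
        return None


def pick_backend(use_ipex, module_name="intel_extension_for_pytorch.llm.modules"):
    """Return (label, module) for the code path that should run.

    The IPEX path is taken only when the flag is set AND the import succeeds;
    every other case falls back to a non-IPEX path.
    """
    if use_ipex:
        module = load_optional(module_name)
        if module is not None:
            return "ipex", module
    return "fallback", None
```

Routing every IPEX use through a single helper like this means the flag is consulted in one place, so a call site such as `forward` cannot bypass it with a bare `import`.
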

Additional context:

The issue is more widespread than initially thought: even with chunked prefill disabled, IPEX dependencies are triggered through quantization modules, MoE layers, and other CPU backend components. This suggests a systemic issue in which the CPU backend assumes an Intel architecture and IPEX availability.
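
To gauge how widespread the unguarded imports are, one option is a small static scan. The sketch below is an illustration only: the function name `find_unguarded_ipex_imports` is hypothetical, and the heuristic is deliberately crude. It reports any IPEX import that is not inside the body of a `try` block, so an import guarded by `if _use_ipex:` would still be flagged and need manual review.

```python
import ast

TARGET = "intel_extension_for_pytorch"


def find_unguarded_ipex_imports(source):
    """Return line numbers of IPEX imports that are not inside a try body."""
    tree = ast.parse(source)  # parses only; nothing is imported or executed
    hits = []

    def visit(node, guarded):
        if isinstance(node, ast.Try):
            # Only the `try:` body is protected by the except clauses.
            for child in node.body:
                visit(child, True)
            for part in (node.handlers, node.orelse, node.finalbody):
                for child in part:
                    visit(child, guarded)
            return
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            names = []
        is_ipex = any(n == TARGET or n.startswith(TARGET + ".") for n in names)
        if is_ipex and not guarded:
            hits.append(node.lineno)
        for child in ast.iter_child_nodes(node):
            visit(child, guarded)

    visit(tree, False)
    return hits
```

Applied to each `.py` file under the `vllm/` directory (for example via `pathlib.Path.rglob`), this would produce a list of candidate call sites like the one in the traceback above.
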

