Microsoft Phi-3 has been optimized for ONNX Runtime and supports Windows DirectML. It works well across various hardware types, including GPUs, CPUs, and even mobile devices.
Specifically, the supported hardware includes:
- GPU SKU: RTX 4090 (DirectML)
- GPU SKU: 1 A100 80GB (CUDA)
- CPU SKU: Standard F64s v2 (64 vCPUs, 128 GiB memory)
- Android - Samsung Galaxy S21
- Apple iPhone 14 or higher with A16/A17 processor
Minimum configuration required:
- Windows: DirectX 12-capable GPU and a minimum of 4GB of combined RAM
- CUDA: NVIDIA GPU with Compute Capability >= 7.0
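As a quick sanity check of the CUDA requirement above, the sketch below parses the compute capability reported by `nvidia-smi` (the `--query-gpu=compute_cap` field is available in recent NVIDIA drivers). The helper names are illustrative and not part of any Phi-3 tooling:

```python
import subprocess

MIN_CAPABILITY = (7, 0)  # CUDA requirement from the list above

def meets_min_capability(smi_output: str, minimum=MIN_CAPABILITY) -> bool:
    """Return True if any reported GPU meets the minimum compute capability.

    `smi_output` is the text produced by:
        nvidia-smi --query-gpu=compute_cap --format=csv,noheader
    e.g. "8.6\n7.5\n" on a 2-GPU machine.
    """
    caps = []
    for line in smi_output.strip().splitlines():
        major, minor = line.strip().split(".")
        caps.append((int(major), int(minor)))
    return any(cap >= minimum for cap in caps)

def check_local_gpus() -> bool:
    """Query the local driver and check the requirement (needs nvidia-smi)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return meets_min_capability(out)
```

Tuple comparison makes the version check correct for mixed fleets: `(8, 6) >= (7, 0)` holds GPU by GPU, and one qualifying device is enough.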
Currently available Phi-3 ONNX models target a single GPU. Multi-GPU support for Phi-3 is possible, but ONNX Runtime running across 2 GPUs doesn't guarantee more throughput than 2 separate ONNX Runtime instances.
At Build 2024, the GenAI ONNX team announced that they had enabled multi-instance rather than multi-GPU support for Phi models. At present, this lets you run one onnxruntime or onnxruntime-genai instance per GPU, selected with the CUDA_VISIBLE_DEVICES environment variable like this:
```bash
CUDA_VISIBLE_DEVICES=0 python infer.py
CUDA_VISIBLE_DEVICES=1 python infer.py
```
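The two commands above can also be scripted. Below is a minimal launcher sketch that pins each process to one GPU via CUDA_VISIBLE_DEVICES; the `infer.py` name is taken from the example above, and the helper functions are illustrative, not part of onnxruntime-genai:

```python
import os
import subprocess

def pinned_env(gpu_id: int) -> dict:
    """Copy the current environment with CUDA_VISIBLE_DEVICES set to one GPU."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env

def launch_per_gpu(script: str, gpu_ids) -> list:
    """Start one inference process per GPU id; each process sees one device."""
    return [subprocess.Popen(["python", script], env=pinned_env(g))
            for g in gpu_ids]

# Example: two independent instances, one per GPU
# procs = launch_per_gpu("infer.py", [0, 1])
# for p in procs:
#     p.wait()
```

Because each process sees only its own device, the instances are fully independent, which is exactly the multi-instance pattern described above.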
Feel free to explore Phi-3 further in Azure AI Studio.