Apart from its significant acceleration capabilities on Intel CPUs, IPEX-LLM also supports optimizations and acceleration for running LLMs (large language models) on Intel GPUs.
IPEX-LLM supports optimizing any HuggingFace transformers model on Intel GPUs through low-bit techniques, modern hardware acceleration, and the latest software optimizations.
In Chapter 6, you will learn how to run LLMs, as well as implement streaming chat functionality, using IPEX-LLM optimizations on Intel GPUs. Popular open-source models are used as examples.
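The pattern used throughout the chapter can be sketched as follows. This is a minimal illustration, assuming `ipex-llm[xpu]` is installed and an Intel GPU is available; the model id below is a placeholder example, not a requirement:

```python
# Minimal sketch: load a HuggingFace transformers model with IPEX-LLM
# low-bit (INT4) optimization and run it on an Intel GPU ("xpu" device).
# Assumes ipex-llm[xpu] is installed and an Intel GPU is present.
import torch
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "meta-llama/Llama-2-7b-chat-hf"  # placeholder model id

# load_in_4bit=True applies IPEX-LLM's INT4 optimization at load time
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_4bit=True,
                                             trust_remote_code=True)
model = model.to("xpu")  # move the optimized model to the Intel GPU

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
with torch.inference_mode():
    input_ids = tokenizer.encode("What is AI?", return_tensors="pt").to("xpu")
    output = model.generate(input_ids, max_new_tokens=32)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Note that, compared with CPU usage, the only GPU-specific additions are moving the model and the input tensors to the `"xpu"` device.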
Linux:

- Hardware:
  - Intel Arc™ A-Series Graphics
  - Intel Data Center GPU Flex Series
  - Intel Data Center GPU Max Series
- Operating System:
  - Ubuntu 20.04 or later (Ubuntu 22.04 is preferred)

Windows:

- Hardware:
  - Intel iGPU and dGPU
- Operating System:
  - Windows 10/11, with or without WSL
Please refer to the GPU installation guide for more details. It is strongly recommended that you follow the corresponding steps below to configure your environment properly.
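On Linux, the setup typically looks like the following sketch. The exact commands are in the GPU installation guide; the extra index URL and the oneAPI install path below are common defaults, not guaranteed for your system:

```shell
# Sketch of a typical Linux environment setup for IPEX-LLM on Intel GPU.
# The index URL and oneAPI path are assumptions; follow the installation
# guide for the authoritative steps for your platform.
pip install --pre --upgrade ipex-llm[xpu] \
  --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

# Make the oneAPI runtime libraries visible before running any model
source /opt/intel/oneapi/setvars.sh
```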