[Bugfix][Intel] Fix XPU Dockerfile Build #7824

Merged on Sep 28, 2024 (34 commits)
Changes from 25 commits
Commits
0d94a73  Update requirements-xpu.txt (tylertitsworth, Aug 8, 2024)
94f6c0f  Update Dockerfile.xpu (tylertitsworth, Aug 8, 2024)
87f2ae9  Update Dockerfile.xpu (tylertitsworth, Aug 8, 2024)
8c2ac7f  Update Dockerfile.xpu (tylertitsworth, Aug 8, 2024)
405ba25  update dockerfile (Aug 8, 2024)
d2b2cae  closer to build (Aug 8, 2024)
8501045  Merge branch 'vllm-project:main' into main (tylertitsworth, Aug 22, 2024)
1a1dbad  Merge branch 'vllm-project:main' into main (tylertitsworth, Aug 23, 2024)
3b140cc  update interface (Aug 23, 2024)
7ccc035  fix platform spec (Aug 23, 2024)
ac335d4  Update xpu.py (tylertitsworth, Aug 23, 2024)
6899720  Update __init__.py (tylertitsworth, Aug 23, 2024)
8a6eecf  Update .dockerignore (tylertitsworth, Aug 24, 2024)
60743aa  uncap tiktoken (Aug 26, 2024)
b14a869  bump oneapi (Aug 28, 2024)
916144e  Merge branch 'vllm-project:main' into main (tylertitsworth, Sep 3, 2024)
0d5c597  remove openai server support (tylertitsworth, Sep 4, 2024)
f6c9577  remove unecessary reqs (tylertitsworth, Sep 4, 2024)
1df6d83  Merge branch 'vllm-project:main' into main (tylertitsworth, Sep 4, 2024)
d8ad520  Merge branch 'main' into main (tylertitsworth, Sep 11, 2024)
cc3df2f  update ipex versions (Sep 11, 2024)
98f14b8  fix lint error (Sep 11, 2024)
61a5506  Merge branch 'vllm-project:main' into main (tylertitsworth, Sep 12, 2024)
ea3a728  Merge branch 'main' into main (tylertitsworth, Sep 14, 2024)
55ad39e  Update Dockerfile.xpu (tylertitsworth, Sep 14, 2024)
843278e  Merge branch 'main' into main (tylertitsworth, Sep 23, 2024)
cbbd0f4  fix lint and build errors (Sep 23, 2024)
2f4f8e9  Update xpu.py (tylertitsworth, Sep 23, 2024)
17adb23  isort xpu.py (Sep 23, 2024)
73060af  yapf xpu.py (this formatting sucks) (Sep 23, 2024)
e01fa0a  address pr comments (Sep 25, 2024)
be5646e  Merge branch 'vllm-project:main' into main (tylertitsworth, Sep 26, 2024)
1429aba  Update run-xpu-test.sh (tylertitsworth, Sep 27, 2024)
c901ecf  Merge branch 'main' into xpu-main (youkaichao, Sep 28, 2024)
4 changes: 3 additions & 1 deletion .dockerignore
@@ -1,4 +1,6 @@
vllm/*.so
/.github/
/.venv
/build
dist
Dockerfile*
vllm/*.so
41 changes: 34 additions & 7 deletions Dockerfile.xpu
@@ -1,4 +1,4 @@
FROM intel/oneapi-basekit:2024.2.1-0-devel-ubuntu22.04
FROM intel/oneapi-basekit:2024.2.1-0-devel-ubuntu22.04 AS vllm-base

RUN wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | tee /usr/share/keyrings/intel-oneapi-archive-keyring.gpg > /dev/null && \
echo "deb [signed-by=/usr/share/keyrings/intel-oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main " | tee /etc/apt/sources.list.d/oneAPI.list && \
@@ -7,8 +7,22 @@ RUN wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRO
echo "deb [arch=amd64,i386 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/graphics/ubuntu jammy arc" | tee /etc/apt/sources.list.d/intel.gpu.jammy.list && \
chmod 644 /usr/share/keyrings/intel-graphics.gpg

RUN apt-get update -y \
&& apt-get install -y curl libicu70 lsb-release git wget vim numactl python3 python3-pip ffmpeg libsm6 libxext6 libgl1
RUN apt-get update -y && \
apt-get install -y --no-install-recommends --fix-missing \
curl \
ffmpeg \
git \
libsndfile1 \
libsm6 \
libxext6 \
libgl1 \
lsb-release \
numactl \
python3 \
python3-dev \
python3-pip \
# vim \
wget

RUN git clone https://github.com/intel/pti-gpu && \
cd pti-gpu/sdk && \
@@ -18,12 +32,25 @@ RUN git clone https://github.com/intel/pti-gpu && \
make -j && \
cmake --install . --config Release --prefix "/usr/local"

WORKDIR /workspace/vllm
COPY requirements-xpu.txt /workspace/vllm/requirements-xpu.txt
COPY requirements-common.txt /workspace/vllm/requirements-common.txt

RUN pip install --no-cache-dir -r requirements-xpu.txt

COPY ./ /workspace/vllm

WORKDIR /workspace/vllm
ENV VLLM_TARGET_DEVICE=xpu

RUN python3 setup.py install

FROM vllm-base AS vllm-openai

RUN pip install -v -r requirements-xpu.txt
# install additional dependencies for openai api server
RUN --mount=type=cache,target=/root/.cache/pip \
pip install accelerate hf_transfer 'modelscope!=1.15.0'

RUN VLLM_TARGET_DEVICE=xpu python3 setup.py install
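# Use the key=value form so both variables are set. With the legacy
# space-separated form, the second pair is folded into the first value.
ENV VLLM_USAGE_SOURCE=production-docker-image \
TRITON_XPU_PROFILE=1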

CMD ["/bin/bash"]
ENTRYPOINT ["python3", "-m", "vllm.entrypoints.openai.api_server"]
2 changes: 1 addition & 1 deletion requirements-common.txt
@@ -1,7 +1,7 @@
psutil
sentencepiece # Required for LLaMA tokenizer.
numpy < 2.0.0
requests
requests >= 2.26.0
tqdm
py-cpuinfo
transformers >= 4.43.2 # Required for Chameleon and Llama 3.1 hotfox.
2 changes: 0 additions & 2 deletions requirements-xpu.txt
@@ -1,8 +1,6 @@
# Common dependencies
-r requirements-common.txt

setuptools < 70.0.0 # IPEX's torch have some dependency. to be removed.

torch == 2.3.1+cxx11.abi
intel-extension-for-pytorch == 2.3.110+xpu
oneccl_bind_pt == 2.3.100+xpu
2 changes: 2 additions & 0 deletions setup.py
@@ -417,6 +417,8 @@ def _read_requirements(filename: str) -> List[str]:
    for line in requirements:
        if line.startswith("-r "):
            resolved_requirements += _read_requirements(line.split()[1])
        elif line.startswith("--"):
            continue
        else:
            resolved_requirements.append(line)
    return resolved_requirements
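
The added `elif` branch makes `_read_requirements` drop lines that start with `--` (pip options such as an extra index URL), so they are never passed on as package specifiers. The following is a minimal sketch of that behavior, not the project's actual code: it reads from an in-memory mapping instead of real files, and the `files` parameter, the `FILES` contents, and the example index URL are all hypothetical.

```python
from typing import Dict, List


def read_requirements(filename: str, files: Dict[str, str]) -> List[str]:
    """Resolve a requirements file, following `-r` includes and skipping pip options."""
    resolved: List[str] = []
    for line in files[filename].strip().split("\n"):
        if line.startswith("-r "):
            # Recurse into the included requirements file.
            resolved += read_requirements(line.split()[1], files)
        elif line.startswith("--"):
            # pip options (e.g. --extra-index-url) are not package specifiers.
            continue
        else:
            resolved.append(line)
    return resolved


# Hypothetical file contents, for illustration only.
FILES = {
    "requirements-common.txt": "psutil\nrequests >= 2.26.0",
    "requirements-xpu.txt": (
        "-r requirements-common.txt\n"
        "--extra-index-url https://example.invalid/simple\n"
        "torch == 2.3.1+cxx11.abi"
    ),
}

print(read_requirements("requirements-xpu.txt", FILES))
```

Like the hunk it mirrors, this sketch does not filter blank lines or comments; it only adds the option-line skip.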
12 changes: 12 additions & 0 deletions vllm/platforms/__init__.py
@@ -42,6 +42,15 @@
except Exception:
    pass

is_xpu = False

try:
    import torch
    if hasattr(torch, 'xpu') and torch.xpu.is_available():
        is_xpu = True
except Exception:
    pass

is_cpu = False
try:
    from importlib.metadata import version
@@ -60,6 +69,9 @@
elif is_rocm:
    from .rocm import RocmPlatform
    current_platform = RocmPlatform()
elif is_xpu:
    from .xpu import XPUPlatform
    current_platform = XPUPlatform()
elif is_cpu:
    from .cpu import CpuPlatform
    current_platform = CpuPlatform()
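
The two hunks above add a guarded XPU probe and slot the XPU branch between ROCm and CPU in the selection chain. A rough sketch of that logic follows, using stand-in objects so it runs without PyTorch. The helper names `xpu_available` and `pick_platform`, the fake modules, and the `"unspecified"` fallback are assumptions made for illustration; branches that precede `is_rocm` in the real file are not visible in the diff and are omitted.

```python
from types import SimpleNamespace
from typing import Any


def xpu_available(torch_module: Any) -> bool:
    """Mirror the guarded probe: True only if `xpu` exists and reports a device."""
    try:
        return hasattr(torch_module, "xpu") and bool(torch_module.xpu.is_available())
    except Exception:
        # Any failure while probing is treated as "no XPU".
        return False


def pick_platform(is_rocm: bool, is_xpu: bool, is_cpu: bool) -> str:
    """Selection order for the branches visible in the hunk only."""
    if is_rocm:
        return "rocm"
    elif is_xpu:
        return "xpu"
    elif is_cpu:
        return "cpu"
    return "unspecified"  # assumed fallback, not shown in the diff


# Stand-ins for different torch builds (hypothetical, for illustration only).
no_xpu_build = SimpleNamespace()  # no `xpu` attribute at all
idle_xpu = SimpleNamespace(xpu=SimpleNamespace(is_available=lambda: False))
live_xpu = SimpleNamespace(xpu=SimpleNamespace(is_available=lambda: True))

for fake in (no_xpu_build, idle_xpu, live_xpu):
    print(pick_platform(is_rocm=False, is_xpu=xpu_available(fake), is_cpu=True))
```

Because the XPU branch is checked before the CPU branch, a host where both flags are true resolves to the XPU platform.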
4 changes: 4 additions & 0 deletions vllm/platforms/interface.py
@@ -8,6 +8,7 @@ class PlatformEnum(enum.Enum):
    CUDA = enum.auto()
    ROCM = enum.auto()
    TPU = enum.auto()
    XPU = enum.auto()
    CPU = enum.auto()
    UNSPECIFIED = enum.auto()

@@ -24,6 +25,9 @@ def is_rocm(self) -> bool:
    def is_tpu(self) -> bool:
        return self._enum == PlatformEnum.TPU

    def is_xpu(self) -> bool:
        return self._enum == PlatformEnum.XPU

    def is_cpu(self) -> bool:
        return self._enum == PlatformEnum.CPU
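
The new `XPU` enum member and `is_xpu()` predicate follow the same pattern as the existing platforms: a subclass sets `_enum`, and callers branch on the predicate. A condensed sketch of that pattern is shown below. The `UNSPECIFIED` default on the base class and the `FakeXPUPlatform` subclass are assumptions for this example; the real `XPUPlatform` also queries the device through `torch.xpu`.

```python
import enum


class PlatformEnum(enum.Enum):
    CUDA = enum.auto()
    ROCM = enum.auto()
    TPU = enum.auto()
    XPU = enum.auto()
    CPU = enum.auto()
    UNSPECIFIED = enum.auto()


class Platform:
    # Assumed default for this sketch; subclasses override it.
    _enum: PlatformEnum = PlatformEnum.UNSPECIFIED

    def is_xpu(self) -> bool:
        return self._enum == PlatformEnum.XPU

    def is_cpu(self) -> bool:
        return self._enum == PlatformEnum.CPU


class FakeXPUPlatform(Platform):
    # Hypothetical subclass: sets only the tag, performs no device queries.
    _enum = PlatformEnum.XPU


platform = FakeXPUPlatform()
print(platform.is_xpu(), platform.is_cpu())
```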

17 changes: 17 additions & 0 deletions vllm/platforms/xpu.py
@@ -0,0 +1,17 @@
from typing import Tuple

import torch

from .interface import Platform, PlatformEnum


class XPUPlatform(Platform):
    _enum = PlatformEnum.XPU

    @staticmethod
    def get_device_capability(device_id: int = 0) -> Tuple[int, int]:
        return torch.xpu.get_device_capability(device_id)

    @staticmethod
    def get_device_name(device_id: int = 0) -> str:
        return torch.xpu.get_device_name(device_id)