In order to run accelerated AI workloads, we've prepared bootc container images for the major AI platforms.
Target | Description |
---|---|
amd | Create bootable container for AMD platform |
deepspeed | DeepSpeed container for optimizing deep learning |
disk-amd | Create disk image from bootable container for AMD platform |
disk-intel | Create disk image from bootable container for Intel platform |
disk-nvidia | Create disk image from bootable container for Nvidia platform |
instruct-amd | Create instruct lab image for bootable container for AMD platform |
instruct-intel | Create instruct lab image for bootable container for Intel platform |
instruct-nvidia | Create instruct lab image for bootable container for Nvidia platform |
intel | Create bootable container for Intel Habana Labs platform |
nvidia | Create bootable container for Nvidia platform |
vllm | Containerized inference/serving engine for LLMs |
Variable | Description | Default |
---|---|---|
FROM | Overrides the base image for the Containerfiles | quay.io/centos-bootc/centos-bootc:stream9 |
REGISTRY | Container Registry for storing container images | quay.io |
REGISTRY_ORG | Container Registry organization | ai-lab |
IMAGE_NAME | Container image name | platform (e.g. amd) |
IMAGE_TAG | Container image tag | latest |
CONTAINER_TOOL | Container tool used for build | podman |
CONTAINER_TOOL_EXTRA_ARGS | Container tool extra arguments | |
VENDOR | Container image vendor label | |
Note: AI content is large, and building it requires more than 200GB of free disk space.
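The disk space requirement in the note above can be checked before starting a build. The following is a minimal sketch, not part of the Makefile: the helper name `has_free_gb` is hypothetical, and which directory matters depends on where your container tool keeps its storage, which is an assumption you need to verify for your host.

```shell
# Hypothetical helper (not provided by this repository):
# succeed if the filesystem holding DIR has at least MIN_GB gigabytes free.
# Usage: has_free_gb DIR MIN_GB
has_free_gb() {
    dir=$1
    min_gb=$2
    # df -P forces single-line POSIX output; -k reports 1024-byte blocks.
    # Field 4 of the second line is the available space in KB.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Example (the path is an assumption; adjust it to your storage location):
#   has_free_gb /var/lib/containers 200 || echo "less than 200GB free" >&2
```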
To do AI training, you need to build the InstructLab container images. Execute make instruct-<platform>. For example:
- make instruct-amd
- make instruct-intel
- make instruct-nvidia
Once these container images are built, it is time to build the vllm and deepspeed images:
- make vllm
- make deepspeed
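The training image steps above can be chained so that a failure stops the sequence. This is only a sketch: the function name `build_training_images` and the `RUN` variable are hypothetical conveniences, and the order of vllm before deepspeed simply follows the list above rather than a documented requirement.

```shell
# Hypothetical wrapper: build the instruct image for one platform,
# then vllm and deepspeed, stopping at the first failure.
# Set RUN=echo to print the commands instead of executing them.
build_training_images() {
    platform=$1
    run=${RUN:-}
    # $run is intentionally unquoted: when empty it expands to nothing,
    # so the make command runs directly.
    $run make "instruct-$platform" &&
    $run make vllm &&
    $run make deepspeed
}

# Example:
#   RUN=echo build_training_images nvidia   # preview the commands
#   build_training_images nvidia            # run them
```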
To build the images (based on CentOS Stream by default), a simple make <platform> should be enough. For example, to build the nvidia, amd and intel bootc containers, respectively:
make nvidia
make amd
make intel
To build the training images based on Red Hat Enterprise Linux bootc images, the appropriate base container image must be set in the FROM variable, and the build must be run on an entitled Red Hat Enterprise Linux 9.x host with a valid subscription.
For example:
make nvidia FROM=registry.redhat.io/rhel9/rhel-bootc:9.4
make amd FROM=registry.redhat.io/rhel9/rhel-bootc:9.4
make intel FROM=registry.redhat.io/rhel9/rhel-bootc:9.4
Of course, the other Makefile variables are still available, so the following is a valid build command:
make nvidia REGISTRY=myregistry.com REGISTRY_ORG=ai-training IMAGE_NAME=nvidia IMAGE_TAG=v1 FROM=registry.redhat.io/rhel9/rhel-bootc:9.4
bootc-image-builder produces disk images using a bootable container as input. Disk images can be used to directly provision a host. The process writes the disk image to -bootc/build
IMPORTANT: the osbuild-selinux package must be installed for bootc-image-builder to work on an SELinux-enabled host.
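To invoke bootc-image-builder, execute make disk- followed by the platform name (the disk-amd, disk-intel and disk-nvidia targets listed above). For example: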
make disk-nvidia
or
make disk-nvidia DISK_TYPE=ami BOOTC_IMAGE=quay.io/ai-lab/nvidia-bootc-custom:latest
In addition to the variables common to all targets, a few extra variables can be defined to customize disk image creation:
Variable | Description | Default |
---|---|---|
BOOTC_IMAGE | Image to use as input | $REGISTRY/$REGISTRY_ORG/$IMAGE_NAME:$IMAGE_TAG |
DISK_TYPE | Type of image to build | qcow2 |
IMAGE_BUILDER_CONFIG | Path to a build-config file | EMPTY |
The image builder config file is documented in the bootc-image-builder README.
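The default value of BOOTC_IMAGE listed in the table above is composed from the common variables. The sketch below mimics that expansion in shell so you can see which image reference a given set of overrides would produce. The function name `bootc_image_ref` is hypothetical, and it assumes IMAGE_NAME defaults to the bare platform name as the first table states; the actual Makefile may derive the name differently.

```shell
# Hypothetical helper: print $REGISTRY/$REGISTRY_ORG/$IMAGE_NAME:$IMAGE_TAG
# for a platform, using the documented defaults for any unset variable.
bootc_image_ref() {
    platform=$1
    registry=${REGISTRY:-quay.io}
    registry_org=${REGISTRY_ORG:-ai-lab}
    # Assumption: the default image name is the platform name itself.
    image_name=${IMAGE_NAME:-$platform}
    image_tag=${IMAGE_TAG:-latest}
    echo "$registry/$registry_org/$image_name:$image_tag"
}

# Example:
#   bootc_image_ref amd
#   REGISTRY=myregistry.com IMAGE_TAG=v1 bootc_image_ref nvidia
```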
The following image disk types are currently available:
Disk type | Target environment |
---|---|
ami | Amazon Machine Image |
qcow2 (default) | QEMU |
vmdk | VMDK usable in vSphere, among others |
anaconda-iso | An unattended Anaconda installer that installs to the first disk found. |
raw | Unformatted raw disk. |
To build images customized for each supported cloud provider, please read the cloud providers section.
After an interrupted build, you may want to restart the process completely. In that case, you can instruct podman to start from scratch and discard the cached layers by passing the --no-cache parameter to the build through the CONTAINER_TOOL_EXTRA_ARGS variable:
make <platform> CONTAINER_TOOL_EXTRA_ARGS="--no-cache"
Building accelerated images requires a lot of temporary disk space. If you need to specify a directory for temporary storage, set the TMPDIR environment variable:
make <platform> TMPDIR=/path/to/tmp
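If you prefer a dedicated scratch directory per build rather than a shared path, one option is to create it with mktemp and pass it as TMPDIR. This is a sketch only: the helper name `make_build_tmpdir` and the `bootc-build` prefix are hypothetical, and the base directory should be on a filesystem with enough free space.

```shell
# Hypothetical helper: create a private scratch directory under BASE
# and print its path.
make_build_tmpdir() {
    base=$1
    # mktemp -d creates the directory; the trailing X characters are
    # replaced with a unique suffix.
    mktemp -d "$base/bootc-build.XXXXXX"
}

# Example (base path is an assumption; pick a large filesystem):
#   scratch=$(make_build_tmpdir /var/tmp)
#   make nvidia TMPDIR="$scratch"
#   rm -rf "$scratch"
```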