The specification defines an open standard for packaging and distributing Artificial Intelligence models as OCI artifacts, adhering to the OCI image specification.
The goal of this specification is to outline a blueprint and enable the creation of interoperable solutions for packaging and retrieving AI/ML models by leveraging the existing OCI ecosystem, thereby facilitating efficient model management, deployment and serving in cloud-native environments.
- An OCI Registry can store and manage AI/ML model artifacts, with model versions, metadata, and parameters retrievable and displayable.
- A Data Scientist can package models together with their metadata (e.g., format, precision) and upload them to a registry, facilitating collaboration with MLOps Engineers while streamlining the deployment process to efficiently deliver models into production.
- A model serving/deployment platform can read model metadata (e.g., format, precision) from a registry to understand the AI/ML model details, identify the required server runtime (as well as startup parameters, necessary resources, etc.), and serve the model in Kubernetes by mounting it directly as a volume source without needing to pre-download it in an init-container or bundle it within the server runtime container.
At a high level, the Model Format Specification is based on the OCI Image Format Specification and incorporates all its components. The key distinction lies in extending the OCI Image Manifest Specification to accommodate artifact usage specifically tailored for AI/ML models.
The image manifest of model artifacts follows the OCI Image Manifest Specification and adheres to the guidelines for artifact usage. Specifically, it leverages the extensible `artifactType` and `annotations` properties to define attributes specific to model artifacts.
- `artifactType` string

  This REQUIRED property MUST be `application/vnd.cnai.model.manifest.v1+json`.

- `layers` array of objects

  - `mediaType` string

    This REQUIRED property MUST be one of the OCI Image Media Types designated for layers. Otherwise, it will not be compatible with the container runtime.

  - `artifactType` string

    This REQUIRED property MUST be at least the following media types:

    - `application/vnd.cnai.model.layer.v1.tar`: The layer is a tar archive that contains the model weight file. If the model has multiple weight files, they SHOULD be packaged into separate layers.

    - `application/vnd.cnai.model.layer.v1.tar+gzip`: The layer is a tar archive compressed with gzip that contains the model weight file. If the model has multiple weight files, they SHOULD be packaged into separate layers.

      Implementers note: It is recommended to package weight files without compression to avoid unnecessary overhead of decompression by the container runtime, as model weight files are typically already compressed.

    - `application/vnd.cnai.model.doc.v1.tar`: The layer is a tar archive that includes documentation files like `README.md`, `LICENSE`, etc.

    - `application/vnd.cnai.model.config.v1.tar`: The layer is a tar archive that includes additional configuration files such as `config.json`, `tokenizer.json`, `generation_config.json`, etc.

  - `annotations` string-string map

    This OPTIONAL property contains arbitrary attributes for the layer. For metadata specific to models, implementations SHOULD use the predefined annotation keys as outlined in the Layer Annotation Keys.
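The property rules above can be illustrated with a small checker. This is a minimal sketch, not part of the specification: the function name `validate_model_manifest`, the example manifest, and the placeholder digest are hypothetical. It assumes that only the three current OCI layer media types are accepted (the deprecated non-distributable variants are left out) and that only the four listed layer artifact types are allowed, which is stricter than the "at least" wording may intend. Fields that this section does not discuss, such as `config`, are omitted from the example.

```python
# Sketch only: checks a manifest dict against the property rules listed above.
MODEL_MANIFEST_ARTIFACT_TYPE = "application/vnd.cnai.model.manifest.v1+json"

# OCI layer media types (deprecated non-distributable variants omitted).
OCI_LAYER_MEDIA_TYPES = {
    "application/vnd.oci.image.layer.v1.tar",
    "application/vnd.oci.image.layer.v1.tar+gzip",
    "application/vnd.oci.image.layer.v1.tar+zstd",
}

# Assumption: treat the four listed types as the full allowed set.
MODEL_LAYER_ARTIFACT_TYPES = {
    "application/vnd.cnai.model.layer.v1.tar",
    "application/vnd.cnai.model.layer.v1.tar+gzip",
    "application/vnd.cnai.model.doc.v1.tar",
    "application/vnd.cnai.model.config.v1.tar",
}


def validate_model_manifest(manifest: dict) -> list:
    """Return a list of human-readable rule violations (empty if none)."""
    errors = []
    if manifest.get("artifactType") != MODEL_MANIFEST_ARTIFACT_TYPE:
        errors.append("artifactType must be " + MODEL_MANIFEST_ARTIFACT_TYPE)
    for i, layer in enumerate(manifest.get("layers", [])):
        if layer.get("mediaType") not in OCI_LAYER_MEDIA_TYPES:
            errors.append(f"layers[{i}].mediaType is not an OCI layer media type")
        if layer.get("artifactType") not in MODEL_LAYER_ARTIFACT_TYPES:
            errors.append(f"layers[{i}].artifactType is not a known model layer type")
        # annotations is OPTIONAL, but when present it must map strings to strings.
        annotations = layer.get("annotations", {})
        is_string_map = isinstance(annotations, dict) and all(
            isinstance(k, str) and isinstance(v, str) for k, v in annotations.items()
        )
        if not is_string_map:
            errors.append(f"layers[{i}].annotations must be a string-string map")
    return errors


# Hypothetical, partial manifest: one uncompressed weight layer.
example_manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "artifactType": MODEL_MANIFEST_ARTIFACT_TYPE,
    "layers": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar",
            "artifactType": "application/vnd.cnai.model.layer.v1.tar",
            "digest": "sha256:" + "0" * 64,  # placeholder, not a real digest
            "size": 0,
        }
    ],
}

print(validate_model_manifest(example_manifest))
```

Returning a list of violations rather than raising on the first one lets a build tool report every problem in a manifest at once.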
As the model format specification conforms to the OCI Image Specification, it naturally aligns with the standard OCI distribution workflow.
This section outlines the typical workflow for a model OCI artifact, which consists of two main stages: `BUILD & PUSH` and `PULL & SERVE`.
Build tools can package required resources into an OCI artifact following the model format specification.
The generated artifact can then be pushed to OCI registries (e.g., Harbor, DockerHub) for storage and management.
Once the model artifact is stored in an OCI registry, the container runtime (e.g., containerd, CRI-O) can pull it from the OCI registry and mount it as a read-only volume during the model serving process, if required.
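The build half of this workflow can be sketched with the Python standard library alone. The example below is illustrative and makes several assumptions: the helper names `build_weight_layer` and `read_layer` and the file name `model.safetensors` are hypothetical, the layer is written as an uncompressed tar in line with the implementers note, and pushing the blob to a registry (which would need a registry client) is out of scope. `read_layer` only mimics what a consumer sees after pulling the layer. It is not how a container runtime mounts a volume.

```python
import hashlib
import io
import tarfile


def build_weight_layer(filename: str, data: bytes) -> tuple:
    """Pack one weight file into an uncompressed tar; return (blob, descriptor)."""
    buf = io.BytesIO()
    # mode="w" writes a plain tar with no compression.
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name=filename)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    blob = buf.getvalue()
    descriptor = {
        "mediaType": "application/vnd.oci.image.layer.v1.tar",
        "artifactType": "application/vnd.cnai.model.layer.v1.tar",
        # OCI content digests are "<algorithm>:<hex>" over the exact blob bytes.
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }
    return blob, descriptor


def read_layer(blob: bytes) -> dict:
    """Read regular files back out of an uncompressed tar layer blob."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:") as tar:
        return {
            member.name: tar.extractfile(member).read()
            for member in tar.getmembers()
            if member.isfile()
        }


# Hypothetical weight file content, standing in for real model weights.
blob, descriptor = build_weight_layer("model.safetensors", b"\x00\x01fake-weights")
print(descriptor["artifactType"], descriptor["size"])
print(sorted(read_layer(blob)))
```

Because the digest is computed over the tar bytes, a consumer can verify a pulled layer by recomputing the SHA-256 of the blob and comparing it with the descriptor.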