Config 2.0 unified storage description #30
Comments
@xyhuang @bitfort @dfeddema @davidjurado This really looks doable. Couple comments.
Regarding the first item: since the values here are identifiers for either directories or files, can we partially adopt a URI approach? We can introduce a scheme (the "storage:" prefix in the task I/O defaults below):

```yaml
name: example-mlcube
platform:
  storage:
    K8S_DATA:
      spec:
        kubernetes:
          pvc_name: my-pvc
    NFS_DATA:
      spec:
        nfs:
          host: 127.0.0.1
          port: 2049
          path: some/nfs/path
    workspace:
      spec:
        local:
          path: ${runtime.root}/workspace
    tmp:
      spec:
        local:
          path: ${oc.env:TMP}/mlcube/workspace/${name}
    home:
      spec:
        local:
          path: ${oc.env:HOME}/.mlcube/workspace/${name}
container:
  image: mlcommons/mnist:0.0.1
  build_context: "mnist"
  build_file: "Dockerfile"
tasks:
  download:
    io:
      - {name: data_dir, type: directory, io: output, default: "storage:NFS_DATA/data"}
      - {name: log_dir, type: directory, io: output, default: "storage:NFS_DATA/logs"}
  train:
    io:
      - {name: data_dir, type: directory, io: input, default: "storage:K8S_DATA/data"}
      - {name: parameters_file, type: file, io: input, default: "storage:K8S_DATA/parameters/default.parameters.yaml"}
      - {name: log_dir, type: directory, io: output, default: "storage:K8S_DATA/logs"}
      - {name: model_dir, type: directory, io: output, default: "storage:K8S_DATA/model"}
```

We can also use it with mlcube run ... --workspace=storage:home to keep data in the user's home directory.
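To make the URI-style idea concrete, here is a minimal sketch of how a runner might split such a value into a storage name and a relative path. Only the "storage:NAME/path" shape comes from the example above; the names `StorageRef` and `parse_storage_uri` are hypothetical and not part of MLCube.

```python
from typing import NamedTuple


class StorageRef(NamedTuple):
    """Hypothetical parsed form of a 'storage:NAME/relative/path' value."""
    name: str      # key under platform.storage, e.g. "NFS_DATA"
    rel_path: str  # path relative to that storage; empty if none was given


def parse_storage_uri(value: str) -> StorageRef:
    """Split 'storage:NAME/rel/path' into a storage name and a relative path."""
    scheme, sep, rest = value.partition(":")
    if sep != ":" or scheme != "storage":
        raise ValueError(f"not a storage reference: {value!r}")
    # Everything up to the first "/" is the storage name, the rest is the path.
    name, _, rel_path = rest.partition("/")
    if not name:
        raise ValueError(f"missing storage name in {value!r}")
    return StorageRef(name, rel_path)


ref = parse_storage_uri("storage:NFS_DATA/data")
home = parse_storage_uri("storage:home")
```

With this shape, `ref.name` selects the entry under `platform.storage` and `ref.rel_path` is interpreted relative to whatever that backend is; a value such as `storage:home` simply has an empty relative path.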
This proposes a unified storage description for config 2.0.
Today, MLCube relies on a simple "file path" approach to describe the inputs and outputs of its tasks. However, for many platforms, such as Kubernetes, a single file path is not enough, because they either have a complex storage backend or use their own layer of storage abstractions, which do not use "paths" to refer to locations in the data storage. This proposal aims to address this problem by providing a unified way of describing storage that covers both local file systems and more complex storage solutions.
A storage backend can be described in the platform section of the config, which is supplied by the user at run time. The storage description consists of two main parts: a name that is used as a reference in the tasks' I/O paths, and a platform-specific spec that provides the details of the storage backend in the target platform, so that the runner can use it to find the right location of the data.
We do not change the "path"-like descriptions of task inputs/outputs, in order to keep them simple. However, we do introduce a "variable"-like component as part of the path, so that this "variable" can serve as a reference to the corresponding storage backend, with the rest of the path treated as a path relative to that storage.
The most straightforward example of such a "variable" is "$WORKSPACE", which is currently used to refer to a specific directory in the local file system. With the new proposal, "$WORKSPACE", or any "$CUSTOM_NAME" defined by the user, can refer to an arbitrary storage backend as specified in the platform section.
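As a rough sketch of the lookup this implies, the snippet below splits a path such as "$WORKSPACE/data" into the variable name and the relative remainder, then fetches the matching spec. The helper names `split_reference` and `lookup`, the layout of the `storage` dictionary, and the spec fields in it are assumptions made for illustration; the proposal does not fix any of them.

```python
import re
from typing import Dict, Tuple

# A leading "$NAME", optionally followed by "/relative/path".
_REF = re.compile(r"^\$(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?:/(?P<rel>.*))?$")


def split_reference(path: str) -> Tuple[str, str]:
    """Return (storage name, relative path) for a '$NAME/rel/path' string."""
    match = _REF.match(path)
    if match is None:
        raise ValueError(f"path does not start with a $NAME reference: {path!r}")
    return match.group("name"), match.group("rel") or ""


def lookup(path: str, storage: Dict[str, dict]) -> Tuple[dict, str]:
    """Return the platform-specific spec for the referenced storage,
    together with the path relative to that storage."""
    name, rel = split_reference(path)
    if name not in storage:
        raise KeyError(f"storage {name!r} is not defined in the platform section")
    return storage[name]["spec"], rel


# Hypothetical contents of the user's platform section (assumed layout).
storage = {
    "WORKSPACE": {"spec": {"local": {"path": "/home/user/workspace"}}},
    "K8S_DATA": {"spec": {"kubernetes": {"pvc_name": "my-pvc"}}},
}

spec, rel = lookup("$K8S_DATA/logs", storage)
```

The runner, not this helper, decides what to do with the returned spec: a local runner might join a directory with `rel`, while a Kubernetes runner might mount the claim and use `rel` as a sub-path.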
Since the detailed spec of the storage is specified in the platform part, it can be decoupled from the shared MLCube config and only appear in the user's config. This also means that how the spec of a given storage backend is written should be agreed between a user and a runner, and is not relevant to the MLCube publisher.
While we do not have to provide a standard for these specs, we may provide some "guidelines/examples" for popular platforms so that there can be a convention for runner implementors.
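One consequence of this split is that a runner can check, before running anything, that every storage name referenced by the shared config is actually defined by the user. The sketch below is one hypothetical way to do that; the dictionary layouts (a `tasks` mapping with `io` entries carrying a `default`, and `platform.storage` in the user config) and the function names are assumptions, not a defined MLCube schema.

```python
from typing import Dict, List, Set


def referenced_storages(tasks: Dict[str, dict]) -> Set[str]:
    """Collect every NAME used as a leading '$NAME' in task I/O defaults."""
    names: Set[str] = set()
    for task in tasks.values():
        for param in task.get("io", []):
            default = param.get("default", "")
            if default.startswith("$"):
                # "$K8S_DATA/data" -> "K8S_DATA"
                names.add(default[1:].split("/", 1)[0])
    return names


def missing_storages(shared: dict, user: dict) -> List[str]:
    """Names referenced by the shared config but absent from the user's platform section."""
    defined = user.get("platform", {}).get("storage", {})
    return sorted(referenced_storages(shared.get("tasks", {})) - set(defined))


# Shared config, as a publisher might ship it (assumed layout).
shared_config = {
    "tasks": {
        "train": {
            "io": [
                {"name": "data_dir", "default": "$K8S_DATA/data"},
                {"name": "log_dir", "default": "$WORKSPACE/logs"},
            ]
        }
    }
}

# User config, supplied at run time (assumed layout).
user_config = {
    "platform": {
        "storage": {"K8S_DATA": {"spec": {"kubernetes": {"pvc_name": "my-pvc"}}}}
    }
}

print(missing_storages(shared_config, user_config))  # prints ['WORKSPACE']
```

The check only looks at names, never at the contents of a spec, which matches the idea that the spec format is a matter between the user and the runner.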
The following is an example of how the storage backend can be defined. Notice the specs in the platform section and how they are used in the tasks section. Notice also that if we give the storage the name "WORKSPACE", we can redirect our default workspace to the specified storage backend without changing the values in the task I/Os.