Docker apps
In this context, words like 'app' have many possible meanings. To avoid confusion we use these terms:
- "BOINC app" and "BOINC app version": the BOINC concepts described here.
- "Science app": a set of programs that execute a job, i.e. that process input files and produce output files.
- "Science executable": a science app in compiled form.
BOINC lets you use Docker to run science apps on volunteer hosts (Windows, Mac and Linux). To do so:
- Develop your science app in the software environment of your choice (say, particular versions of Linux and Python, with particular libraries and packages installed).
- Write a Dockerfile that builds this environment.
  Note: your Docker image must include the ps command. Most Docker Linux images do, but the Debian image does not. If you use this image you'll need to add:
  RUN apt-get update && apt-get install -y procps && rm -rf /var/lib/apt/lists/*
- Create BOINC app versions that combine the Dockerfile and your science executables with a "Docker wrapper" program (supplied by BOINC) that interfaces between Docker and the BOINC client.
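The ps requirement noted above can be checked before you deploy an image. The following is a minimal sketch, not part of BOINC: have_cmd is a hypothetical helper name, and the script assumes it is run by a POSIX shell inside the image you built.

```shell
#!/bin/sh
# Sketch: report whether the ps command is available on PATH.
# have_cmd is a hypothetical helper name, not something BOINC provides.
have_cmd () {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd ps; then
    echo "ps found"
else
    echo "ps missing: on Debian-based images, install the procps package"
fi
```

One way to run such a check is to copy the script into the image, or to run the test directly with something like docker run --rm <image> sh -c 'command -v ps', where <image> stands for your own image name.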
Your science application can then run on all major platforms (Linux, Windows, Mac OS). In that sense it's similar to BOINC's support for apps that run in VirtualBox virtual machines. However, the Docker approach has several advantages:
- Docker apps can access GPUs.
- Docker apps use much less disk space (tens of MBs rather than GBs).
- Starting a Docker container takes less time than starting a virtual machine.
The remainder of this document describes BOINC's support for Docker apps. For a simple example, see the Docker app cookbook.
The Docker wrapper (docker_wrapper) interfaces BOINC to Docker. It is the main program of Docker apps.
The Docker wrapper reads a config file, job.toml. This file, which is in TOML format, can contain the following items:
project_dir_mount = "/project"
Mounts the job's project directory at the given mount point (an absolute path) in the container.
use_gpu = true
Allow GPU access from the container.
checkpoint_interval = 3600
Specify a checkpoint interval, overriding the computing preferences.
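As an illustration, the three items above can be combined in one file. This is a sketch only: it writes the file to a scratch directory (job_dir is a name chosen for this example), and the values are the ones shown above rather than recommendations.

```shell
#!/bin/sh
# Sketch: write a job.toml that uses all three items described above.
# job_dir is a scratch directory for this example; in a real app version
# job.toml is one of the files you package for the job.
job_dir=$(mktemp -d)

cat > "$job_dir/job.toml" <<'EOF'
project_dir_mount = "/project"
use_gpu = true
checkpoint_interval = 3600
EOF

# Show the result.
cat "$job_dir/job.toml"
```

Each item is optional, so a job that needs neither GPU access nor the project directory could omit the corresponding lines.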
docker_wrapper mounts the job's slot directory at WORKDIR in the container.
So there are two ways to access an input file: have the client copy it into the slot directory, or read it in place from the project directory.
Mark the file as <copy_file/> in the input template. The BOINC client will copy the file to the slot directory (with its logical name) and the science executable can access it directly.
If your science app has large input files (100 MB+) you can avoid the space and time overhead of copying them to the slot directory by accessing the master copy in the project directory.
To do this, don't mark the file as <copy_file/>. The client will create a "link file" in the slot directory. The link file is an XML document that points to the file in the project directory; for example
<soft_link>../../projects/proj_url/infile</soft_link>
Mount the project directory in the container by adding this to job.toml:
project_dir_mount = "/project"
Your executables (in the container) must convert BOINC's link files to physical names. This is easy to do in a shell script:
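#! /bin/bash
# Print the container path of the file that a BOINC link file points to.
resolve () {
    sed 's|<soft_link>\.\./\.\./projects/[^/]*/|/project/|; s|</soft_link>||' "$1" | tr -d '\r\n'
}
# "in" is a link file in the slot directory; "out" is the output file.
./worker "$(resolve in)" out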
Here, the resolve() function takes the name of a link file and returns the path of the file in the project directory (assuming that this directory is mounted at /project).
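To see what this conversion does, it can help to run it on a sample link file. The sketch below is illustrative only: resolve_link and demo_dir are hypothetical names, the optional second argument (a mount point other than /project) is an addition for this example, and the link file contents are the example shown earlier. It assumes the mount point contains no | or & characters, since it is substituted into the sed expression.

```shell
#!/bin/sh
# resolve_link LINK_FILE [MOUNT_POINT]
# Print the in-container path that a BOINC link file points to.
# MOUNT_POINT defaults to /project and must match project_dir_mount.
resolve_link () {
    mount_point="${2:-/project}"
    sed "s|<soft_link>\.\./\.\./projects/[^/]*/|$mount_point/|; s|</soft_link>||" "$1" \
        | tr -d '\r\n'
}

# Build a sample link file like the one the client would create.
demo_dir=$(mktemp -d)
printf '<soft_link>../../projects/proj_url/infile</soft_link>\n' > "$demo_dir/in"

resolve_link "$demo_dir/in"; echo        # prints /project/infile
resolve_link "$demo_dir/in" /data; echo  # prints /data/infile
```

Using | as the sed delimiter avoids escaping every slash in the paths, and the escaped dots match only the literal ../.. prefix.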
Output files should not be marked <copy_file/>. Write them (with logical names) in the WORKDIR.
A BOINC job has
- A BOINC app version.
- A workunit.
Each of these is a collection of files. The files in a BOINC app version are code-signed. This is normally done manually, preventing hackers from using your project to distribute malware even if they are able to break into your server.
The files in a BOINC app version are cached on the client. They are deleted only when the app version has been superseded by a later version.
Workunit files are deleted after a job is finished, unless they are marked as <sticky/> in the job's input template.
The files of a Docker app can be divided between app version and workunit in two ways.
In this model, there is one BOINC app per science app. The BOINC app version for a platform contains
- Dockerfile
- docker_wrapper (compiled for that platform)
- job.toml
- science executables
and each workunit contains
- input files
To deploy a new science application you need to create a new BOINC app, and to deploy a new science application version you need to create a new BOINC app version. These both require login access to the BOINC server.
In this model, a single BOINC app handles multiple science apps. The BOINC app version for a platform contains
- docker_wrapper (compiled for that platform)
and each workunit contains
- Dockerfile
- science executables
- job.toml
- input files
The first three can be marked as sticky to minimize file transfers. The science executables are not code-signed, but this matters less since they run in a container.
If your project has a mechanism for remote (e.g. web-based) job submission, job submitters can deploy new science app versions or new science apps without login access.
Issues with this model:
- In the current BOINC architecture, each BOINC app has its own validator and assimilator. If multiple science apps are "sharing" the same BOINC app, we'll need a way to let them have different validators and assimilators. This could be built on the script-based framework.
- Science apps may have different numbers of input files, and they need to control which ones use "copy mode". This could be handled using per-batch input templates.