Docker apps

David Anderson edited this page Nov 7, 2024 · 10 revisions

Terminology

In this context, words like 'app' have many possible meanings. To avoid confusion we use these terms:

  • "BOINC app" and "BOINC app version": the BOINC concepts described here.

  • "Science app": a set of programs that execute a job, i.e. that process input files and produce output files.

  • "Science executable": a science app in compiled form.

Overview

BOINC lets you use Docker to run science apps on volunteer hosts (Windows, Mac, and Linux). To do so:

  • Develop your science app in the software environment of your choice (say, particular versions of Linux and Python, with particular libraries and packages installed).

  • Write a Dockerfile that builds this environment.

Note: your Docker image must include the ps command. Most Linux Docker images do, but the Debian image does not. If you use the Debian image, add this line to your Dockerfile:

RUN apt-get update && apt-get install -y procps && rm -rf /var/lib/apt/lists/*
  • Create BOINC app versions that combine the Dockerfile, and your science executables, with a "Docker wrapper" program (supplied by BOINC) that interfaces between Docker and the BOINC client.
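As a rough illustration of the Dockerfile step, the following shell snippet writes a minimal Dockerfile for a Debian-based environment. The base image tag (debian:stable-slim), the /app working directory, and the main.sh entry script are assumptions made for this sketch and are not prescribed by BOINC. Only the procps line comes from the note above.

```shell
#!/bin/sh
# Write a minimal Dockerfile into the current directory.
# Assumptions for illustration only: the base image tag, the /app
# working directory, and the main.sh entry script.
cat > Dockerfile <<'EOF'
FROM debian:stable-slim
# The Debian image does not ship ps, so install procps.
RUN apt-get update && apt-get install -y procps && rm -rf /var/lib/apt/lists/*
WORKDIR /app
CMD ["./main.sh"]
EOF
```

A real Dockerfile would also install whatever language runtimes and libraries the science app needs.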

Your science app can then run on all major platforms (Linux, Windows, Mac OS). In that sense, Docker support is similar to BOINC's support for apps that run in VirtualBox virtual machines. However, the Docker approach has several advantages:

  • Docker apps can access GPUs.
  • Docker apps use much less disk space (tens of MBs rather than GBs).
  • Starting a Docker container takes less time than starting a virtual machine.

The remainder of this document describes BOINC's support for Docker apps. For a simple example, see the Docker app cookbook.

The Docker wrapper

The Docker wrapper (docker_wrapper) interfaces between the BOINC client and Docker. It is the main program of a Docker app.

The Docker wrapper reads a config file, job.toml. This file, which is in TOML format, can contain the following items:

project_dir_mount = "/project"

Mounts the job's project directory at the given mount point (an absolute path) in the container.

use_gpu = true

Allows GPU access from the container.

checkpoint_interval = 3600

Specifies a checkpoint interval, overriding the computing preferences.
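For reference, a job.toml that sets all three items might be generated as in this sketch. The values are the ones shown above. Whether a given job needs all of them is an assumption of the example.

```shell
#!/bin/sh
# Write a job.toml that uses the three items described above.
# The values are illustrative; omit any item the job does not need.
cat > job.toml <<'EOF'
# Mount the job's project directory at /project in the container.
project_dir_mount = "/project"
# Allow GPU access from the container.
use_gpu = true
# Checkpoint interval, overriding the computing preferences.
checkpoint_interval = 3600
EOF
```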

Accessing input files

docker_wrapper mounts the job's slot directory at WORKDIR in the container. This gives two ways to access an input file.

Direct access

Mark the file as <copy_file/> in the input template. The BOINC client will copy the file to the slot directory (with its logical name) and the science executable can access it directly.

Indirect access

If your science app has large input files (100 MB+) you can avoid the space and time overhead of copying them to the slot directory by accessing the master copy in the project directory.

To do this, don't mark the file as <copy_file/>. The client will create a "link file" in the slot directory. The link file is an XML document that points to the file in the project directory; for example:

<soft_link>../../projects/proj_url/infile</soft_link>

Mount the project directory in the container by adding this to job.toml:

project_dir_mount = "/project"

Your executables (in the container) must convert BOINC's link files to physical names. This is easy to do in a shell script:

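#! /bin/bash

# Print the container path of the file that link file $1 points to.
# Assumes the project directory is mounted at /project.
resolve () {
    sed 's/<soft_link>\.\.\/\.\.\/projects\/[^\/]*\//\/project\//; s/<\/soft_link>//' "$1" | tr -d '\r\n'
}

# "in" and "out" are the logical names of the input and output files.
./worker "$(resolve in)" out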

Here, the resolve() function takes the name of a link file and returns the path of the file in the project directory (assuming that this directory is mounted at /project).
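If a science app receives some inputs in copy mode and others as link files, a variant of this function can handle both. The sketch below is one possible approach, not part of BOINC: the name resolve_input and the demo files are hypothetical, and it assumes the project directory is mounted at /project.

```shell
#!/bin/bash
# resolve_input NAME: print the path the science executable should open.
# If NAME is a BOINC link file, map it into the mounted project directory.
# Otherwise (copy mode) NAME is the real file, so print it unchanged.
# Assumption: the project directory is mounted at /project.
resolve_input () {
    # Inspect only the first bytes so large copied inputs are not scanned.
    if head -c 64 "$1" | grep -q '<soft_link>'; then
        sed -e 's|<soft_link>\.\./\.\./projects/[^/]*/|/project/|' \
            -e 's|</soft_link>||' "$1" | tr -d '\r\n'
    else
        printf '%s' "$1"
    fi
}

# Demo with a hypothetical link file and a hypothetical copied file.
demo_dir=$(mktemp -d)
printf '<soft_link>../../projects/proj_url/infile</soft_link>\n' > "$demo_dir/linked"
printf 'real data\n' > "$demo_dir/copied"
resolve_input "$demo_dir/linked"; echo    # prints /project/infile
resolve_input "$demo_dir/copied"; echo    # prints the path of the copied file
```

With this variant the same script works whether or not an input is marked <copy_file/>.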

Accessing output files

Output files should not be marked <copy_file/>. Write them (with logical names) in the WORKDIR.
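Taken together, direct input access and output handling mean that a science executable can simply read and write logical names in its working directory. The following sketch uses a hypothetical job that upper-cases a file named "in" into a file named "out". The temporary directory stands in for the slot directory mounted at WORKDIR.

```shell
#!/bin/sh
# Hypothetical science app: upper-case the input file "in" into "out".
# Both names are logical names in the current working directory.
run_job () {
    tr '[:lower:]' '[:upper:]' < in > out
}

# Demo: a temporary directory stands in for the slot directory.
job_dir=$(mktemp -d)
cd "$job_dir" || exit 1
printf 'hello\n' > in
run_job
cat out    # prints HELLO
```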

Packaging options

A BOINC job has

  • A BOINC app version.
  • A workunit.

Each of these is a collection of files. The files in a BOINC app version are code-signed. Signing is normally done manually, which prevents attackers from using your project to distribute malware even if they break into your server.

The files in a BOINC app version are cached on the client. They are deleted only when the app version has been superseded by a later version. Workunit files are deleted after a job is finished, unless they are marked as <sticky/> in the job's input template.

The files of a Docker app can be divided between app version and workunit in two ways.

Single-purpose BOINC app

In this model, there is one BOINC app per science app. The BOINC app version for a platform contains

  • Dockerfile
  • docker_wrapper (compiled for that platform)
  • job.toml
  • science executables

and each workunit contains

  • input files

To deploy a new science app you need to create a new BOINC app, and to deploy a new version of a science app you need to create a new BOINC app version. Both require login access to the BOINC server.

Universal BOINC app

In this model, a single BOINC app handles multiple science apps. The BOINC app version for a platform contains

  • docker_wrapper (compiled for that platform)

and each workunit contains

  • Dockerfile
  • science executables
  • job.toml
  • input files

The first three can be marked as sticky to minimize file transfers. The science executables are not code-signed, but this matters less since they run in a container.

If your project has a mechanism for remote (e.g. web-based) job submission, job submitters can deploy new science app versions or new science apps without login access.

Issues with this model:

  • In the current BOINC architecture, each BOINC app has its own validator and assimilator. If multiple science apps are "sharing" the same BOINC app, we'll need a way to let them have different validators and assimilators. This could be built on the script-based framework.

  • Science apps may have different numbers of input files, and they need to control which ones use "copy mode". This could be handled using per-batch input templates.
