
DAPHNE Deployment

Overview

The deploy/ directory contains scripts for deploying the DAPHNE system. With these scripts one can:

  • build the DAPHNE system (using build.sh),
  • package it,
  • deliver and install it on a deployment platform (e.g. an HPC cluster), and
  • utilize the resources of multiple machines/nodes.

The scripts can also be used to just try out DAPHNE on a single machine.

Once deployed, the DAPHNE system consists of multiple DistributedWorkers and a single coordinator, which is responsible for handling distributed execution.

Where to Start

  • deployDistributed.sh deploys DAPHNE manually using only SSH. When executed without parameters, it prints its help message.
  • deploy-distributed-on-slurm.sh is intended for environments that use the Slurm workload manager. When executed without parameters, it prints its help message.
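
Since both scripts print their help message when run without parameters, a quick way to explore them is the minimal sketch below. It assumes the current directory is deploy/ and that the scripts are executable; the `show_help` helper is a hypothetical name introduced here for illustration and is not part of the DAPHNE scripts.

```shell
# Hypothetical helper: run a deployment script without parameters,
# which (per the notes above) makes it print its help message.
show_help() {
    local script="$1"
    if [ -x "$script" ]; then
        "$script"          # no parameters: the script prints its help message
    else
        echo "not found or not executable: $script" >&2
        return 1
    fi
}

# Assumption: run from inside deploy/. A missing script is reported, not fatal.
for s in ./deployDistributed.sh ./deploy-distributed-on-slurm.sh; do
    show_help "$s" || true
done
```

The existence check only guards against running the loop from the wrong directory; invoking either script directly without parameters has the same effect.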

Deployment Scheme

The DAPHNE deployment scheme encompasses the following:

  • A compilation node (where the DAPHNE system is compiled)
  • A deployment platform (e.g. an HPC cluster with SLURM support)
    • Login node (or another type of access)
      • HPC Task Submission interface (e.g. SLURM)
    • Compute Node(s)
      • DAPHNE coordinator
      • DAPHNE DistributedWorkers
                    DAPHNE Deployment Scheme

+--------------------------------------------------------------------------------------+
|                                                                                      |
|   +------------------+                                                               |
|   | Compilation node |                                                               |
|   |                  |                                                               |
|   +------------------+                                                               |
|       |                                                                              |
|       |                                                                              |
|       | (SSH connection)                                                             |
|       |                                                                              |
|       |                                                                              |
| +----------------------------------------------------------------------------------+ |
| | Deployment Platform (e.g. an HPC with SLURM support)                             | |
| |                                                                                  | |
| |  +------------------------------+                                                | |
| |  | Access/Submission/Login Node |                                                | |
| |  |                              |                                                | |
| |  +------------------------------+                                                | |
| |      |                                                                           | |
| |      |                                                                           | |
| |      |   Network connections, e.g. Infiniband, to e.g. SLURM interfaces,         | |
| |      |   used also for communications between MT and DWs.                        | |
| |      |-------------------------------------------------------------------+       | |
| |      |                                         |                         |       | |
| |  +--------------------------+     +--------------------------+     +-----------+ | |
| |  | Node 1                   |     | Node 2                   |     | Node n    | | |
| |  | - Resources              | ... |                          | ... |           | | |
| |  |   - CPU/GPU/FPGA         |     | CPU/GPU/FPGAs            |     | Resources | | |
| |  | - Running Tasks          |     |   (e.g. 128+)            |     |           | | |
| |  |   - `coordinator`        |     | {DistributedWorker (DW)} |     | DWs       | | |
| |  |   - (optional: more DWs) |     |   (e.g. DWs 1..128)      |     |           | | |
| |  +--------------------------+     +--------------------------+     +-----------+ | |
| |                                                                                  | |
| +----------------------------------------------------------------------------------+ |
|                                                                                      |
+--------------------------------------------------------------------------------------+
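
The role layout in the diagram can be sketched as a small shell function. This is only an illustration under stated assumptions: `assign_roles` and the node names are hypothetical, and the sketch ignores the optional extra DistributedWorkers that the diagram allows on Node 1.

```shell
# Hypothetical helper: map an ordered list of compute nodes onto the roles
# shown in the deployment scheme. The first node hosts the coordinator;
# every remaining node hosts DistributedWorkers.
assign_roles() {
    local first=1 node
    for node in "$@"; do
        if [ "$first" -eq 1 ]; then
            echo "$node coordinator"
            first=0
        else
            echo "$node DistributedWorker"
        fi
    done
}

# Node names are placeholders, not real hosts.
assign_roles node1 node2 node3
```

Called with no arguments the function prints nothing, which matches a deployment that has no compute nodes assigned yet.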

Deployment scripts

This directory includes a set of bash scripts providing support for:

  • packaging/virtualization of the deployment (installation) package,
  • containerized packaging,
  • virtualized installation,
  • managed deployment,
  • deployment of the `daphne` executable,
  • starting and managing DAPHNE processes within containerized environments (remotely scheduling and executing SLURM tasks), and
  • stopping and cleaning of a deployment.

List of Files in this Directory

  1. This short README file, which explains the directory structure and points to more documentation at Deploy.
  2. A script that builds the "daphne.sif" Singularity image from the Docker image daphneeu/daphne-dev.
  3. The deploy-distributed-on-slurm script, which lets the user deploy DAPHNE with SLURM.
  4. The deployDistributed script, which builds DAPHNE and sends it to remote machines manually over SSH (no tools like Slurm needed).
  5. example-time.daphne, a DAPHNE example script that prints the running time of a simple operation.
  6. The Singularity image configuration file.

More Documentation

  1. Documentation about deployment, including tutorial-like explanations and examples of how to package, deploy in a distributed fashion, manage, and execute workloads using DAPHNE.
  2. Getting started guide
  3. Building the DAPHNE System