RoFL supports orchestrating servers on AWS out of the box to make experiments easier to reproduce. This document describes how to run experiments using Ansible.
- `cd` into the `ansible` folder
- Install pipenv: `pip install pipenv`
- Set up pipenv: run `pipenv install` in the root folder
- Run `pipenv shell` in the root folder
- First run:
  `ansible-playbook analysis.yml -i inventory/analysis -e "exp=demo run=new"`
- Continue a run (with run id):
  `ansible-playbook analysis.yml -i inventory/analysis -e "exp=demo run=1611332286"`
- Start the microbenchmark, check for a fixed amount of time whether the benchmark has finished, then fetch the results:
  `ansible-playbook microbench.yml -i inventory`
- Only start the microbenchmark (with specific `fp` and `frac`):
  `ansible-playbook microbench.yml -i inventory --tags "start" -e "fp=16 frac=8"`
- When the microbenchmark is running, wait until it has finished and then fetch the results:
  `ansible-playbook microbench.yml -i inventory --tags "result"`
- You can add `--ssh-common-args='-o StrictHostKeyChecking=no'` as an argument so that you don't have to type `yes` when connecting to a newly created EC2 instance.
Each experiment can consist of multiple configurations that are run one after another, defined in the `experiments` key in the config file after `base_experiment`.
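As a sketch, such a config file might be laid out as follows. Only the `experiments` and `base_experiment` keys come from the description above; every parameter name and value inside them is a hypothetical placeholder:

```yaml
# Hypothetical layout; only the base_experiment/experiments keys are from the text.
base_experiment:          # settings shared by all configurations
  clients: 48             # placeholder parameters
  rounds: 10

experiments:              # configurations, run one after the other
  - name: config_a
    quantization_bits: 8
  - name: config_b
    quantization_bits: 16
```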
To start a new experiment, pass `run=new`:
`ansible-playbook e2ebench.yml -i inventory --ssh-common-args='-o StrictHostKeyChecking=no' -e "exp=mnist_e2e run=new"`
This will set up the required machines and the configurations for the experiments. After the first configuration has finished, invoke the same command but this time with the run id of the current experiment:
`ansible-playbook e2ebench.yml -i inventory --ssh-common-args='-o StrictHostKeyChecking=no' -e "exp=mnist_e2e run=<RUN_ID>"`
This will retrieve the results of the first configuration of the experiment.
To launch the next configuration of the experiment, invoke the same command again with the `<RUN_ID>`.
Note: The run id can be found in the `experiment_results` directory and is currently the timestamp of when the experiment was started.
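Since the run id is a Unix timestamp, you can turn it into a human-readable date to identify a run, e.g. with GNU `date` (on macOS, use `date -u -r 1611332286` instead):

```shell
# Convert a run id (Unix timestamp) to a human-readable UTC date (GNU date).
date -u -d @1611332286 +"%Y-%m-%d %H:%M:%S"
```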
- Most job configuration parameters are in the `.yml` config files (e.g. `experiments/mnist_basic.yml`) under the `job` key.
- Make sure the number of clients in the FL setup is divisible by the number of client machines. If this is not the case, the client-to-machine assignment algorithm does not work properly. In the future, this should be easy to fix to allow for an unbalanced division.
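The divisibility constraint above can be checked before launching a run. This is a minimal sketch with hypothetical example numbers (48 clients, 6 machines); substitute your own experiment's values:

```shell
#!/bin/sh
# Sanity check: clients must split evenly across client machines.
num_clients=48     # example value
num_machines=6     # example value

if [ $((num_clients % num_machines)) -ne 0 ]; then
  echo "error: $num_clients clients do not divide evenly over $num_machines machines" >&2
  exit 1
fi
echo "$((num_clients / num_machines)) clients per machine"   # prints "8 clients per machine"
```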
- Other configuration parameters, such as the machine type and optimization (e.g., skylake), can be found in `group_vars/all/main.yml`.
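For orientation, that file might contain entries along these lines. The key names below are guesses; only the notion of a machine type and the skylake optimization are from the text above:

```yaml
# Hypothetical sketch of group_vars/all/main.yml; key names are placeholders.
instance_type: c5.2xlarge   # AWS machine type (placeholder value)
optimization: skylake       # CPU optimization target mentioned above
```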