Added writeups, contact, links
gokart23 committed Mar 30, 2019
1 parent d353350 commit 27c9e03
Showing 13 changed files with 259 additions and 20 deletions.
1 change: 0 additions & 1 deletion .gitignore
@@ -11,6 +11,5 @@ models/checkpoints/
 models/metrics/*.csv
 models/sent140/embs.json
 _build/
-_static/
 _templates/
 autodoc/
3 changes: 3 additions & 0 deletions docs/source/_static/customheader.css
@@ -0,0 +1,3 @@
.wy-side-nav-search, .wy-nav-top {
    background: #0b750a;
}
Binary file added docs/source/_static/images/femnist_75_thresh.png
Binary file added docs/source/_static/images/leaf_rep_sent140.png
Binary file added docs/source/_static/images/shake_small_weighted_test_acc.png
Binary file added docs/source/_static/images/shake_small_weighted_training_loss.png
9 changes: 0 additions & 9 deletions docs/source/authors.rst

This file was deleted.

14 changes: 14 additions & 0 deletions docs/source/contact.rst
@@ -0,0 +1,14 @@
Contact
=======

::

Sebastian Caldas
PhD student, Machine Learning Department
Carnegie Mellon University
Email: [first-letter-first-name][last-name]@cmu.edu

Virginia Smith
Assistant Professor, Electrical and Computer Engineering Department
Carnegie Mellon University
Email: [last-name][first-letter-first-name]@cmu.edu
4 changes: 2 additions & 2 deletions docs/source/index.rst
@@ -38,7 +38,7 @@ See the source code repository on GitHub: https://github.com/TalwalkarLab/leaf
:hidden:
:caption: Additional Information

-   authors
+   contact
citations

`Getting started <install/get_leaf.html>`_
@@ -57,7 +57,7 @@ A set of examples illustrating the use of different models and datasets with LEAF
The exact API of all functions and classes, as given in the
docstring.

-`Authors <authors.html>`_
+`Contact <contact.html>`_
-------------------------

Contact information of LEAF authors
72 changes: 72 additions & 0 deletions docs/source/tutorials/femnist-md.md
@@ -0,0 +1,72 @@
# Systems Resource Requirement Analysis

In this experiment, we reproduce the systems analysis experiment conducted in the [LEAF paper](https://arxiv.org/abs/1812.01097).

Specifically, we identify the systems budget (in terms of compute, measured in FLOPs, and network bandwidth)
required when training with minibatch SGD vs. `FedAvg`, using the LEAF framework.

For this example, we shall use the FEMNIST dataset to perform an image classification task using a
2-layer convolutional neural network.

# Experiment Setup and Execution

For this experiment, we describe how to use the LEAF framework to execute minibatch SGD with 3 clients
per round and a 10% batch size.

## Quickstart script

For ease of use, we provide a script that runs the experiment under several settings
of minibatch SGD and `FedAvg`; it may be executed as:

```bash
leaf/ $> ./femnist.sh <result-output-dir>
```

This script executes the instructions below using both minibatch SGD and `FedAvg` for different configurations of clients per round, batch size, and local epochs, reproducibly generating the data partitions and results observed by the authors during analysis.

## Dataset fetching and pre-processing

LEAF contains powerful scripts for fetching data and converting it into a JSON format for easy utilization.
These scripts can also subsample the dataset and split it into training and testing sets.

For our experiment, as a first step, we shall use 5% of the dataset in an 80-20 train/test split. The following command shows
how this can be accomplished (the `--smplseed` and `--spltseed` flags enable reproducible generation of the dataset):

```bash
leaf/data/femnist/ $> ./preprocess.sh -s iid --sf 0.05 -k 0 -t sample --smplseed 1549786595 --spltseed 1549786796
```

After running this script, the `data/femnist/data` directory should contain `train/` and `test/` directories.
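
Each split is stored as JSON. As a quick sanity check, the generated files can be inspected from the shell; the snippet below is a minimal sketch that assumes the standard LEAF layout, in which each JSON file carries `users`, `num_samples`, and `user_data` keys (the shard file names themselves are generated by the script):

```bash
# Peek at the keys and user count of the first generated training shard
leaf/data/femnist/ $> python -c "import json, glob; d = json.load(open(sorted(glob.glob('data/train/*.json'))[0])); print(list(d.keys()), 'users:', len(d['users']))"
```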

## Model Execution

Now that we have our data, we can execute our model! For this experiment, the model file is stored
at `models/femnist/cnn.py`. In order to train this model using minibatch SGD with 3 clients per round
and a 10% batch size, we execute the following command:

```bash
leaf/models $> python main.py -dataset femnist -model cnn -lr 0.06 --minibatch 0.1 --clients-per-round 3 --num-rounds 2000
```
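
Note that `main.py` runs `FedAvg` whenever `--minibatch` is not specified. A `FedAvg` counterpart of the run above would therefore look roughly as follows (a sketch; the round and local-epoch counts here are illustrative choices, not necessarily the configurations baked into `femnist.sh`):

```bash
leaf/models $> python main.py -dataset femnist -model cnn -lr 0.06 --clients-per-round 3 --num-rounds 2000 --num_epochs 1
```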

## Metrics Collection

Executing the above command will write out statistical and systems metrics to `leaf/models/metrics/stat_metrics.csv` and `leaf/models/metrics/sys_metrics.csv`, respectively. Since these files are overwritten on every run, we __highly recommend__ storing the generated metrics files at a different location.
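
For example, a minimal way to snapshot the metrics between runs (the destination directory name here is just an illustrative choice):

```bash
# Copy the freshly written metrics out of the way before the next run
leaf/models $> mkdir -p ../results/femnist_minibatch_c3
leaf/models $> cp metrics/stat_metrics.csv metrics/sys_metrics.csv ../results/femnist_minibatch_c3/
```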

To experiment with a different configuration, re-run the main model script with different flags. The plots shown below can be generated using the `plots.py` script in the repository root.

# Results and Analysis

For an accuracy threshold of 75%, `FedAvg` shows a better systems profile than minibatch SGD in terms of the communication
vs. local-computation trade-off, though we note that, in general, methods may vary along both of these
dimensions, and it is thus important to consider both aspects depending on the problem at hand.

<div style="text-align:center" markdown="1">

![](../_static/images/femnist_75_thresh.png "Systems profile of different methods (75% accuracy threshold)")

</div>

# More Information

More information about the framework, challenges and experiments can be found in the [LEAF paper](https://arxiv.org/abs/1812.01097).
33 changes: 33 additions & 0 deletions docs/source/tutorials/index.rst
@@ -3,7 +3,40 @@ General Examples

General-purpose and introductory examples to the LEAF framework.

`Twitter Sentiment Analysis <sent140-md.html>`_
-----------------------------------------------

Investigate the effect of varying the minimum number of training samples per user
on model accuracy when training with the `FedAvg` algorithm.

.. toctree::
:maxdepth: 1
:hidden:

sent140-md

`Convergence Properties of FedAvg <shakespeare-md.html>`_
---------------------------------------------------------

Investigate the convergence behaviour of the `FedAvg` algorithm as the number of
local epochs is varied.

.. toctree::
:maxdepth: 1
:hidden:

shakespeare-md

`Systems Resource Requirement Analysis <femnist-md.html>`_
-----------------------------------------------------------

Investigate the systems budget (in terms of compute, measured in FLOPs, and network bandwidth)
required to achieve an accuracy of 75% on the FEMNIST dataset
using both minibatch SGD and `FedAvg`.

.. toctree::
:maxdepth: 1
:hidden:

femnist-md

72 changes: 64 additions & 8 deletions docs/source/tutorials/sent140-md.md
@@ -1,22 +1,78 @@
# Twitter Sentiment Analysis

-## Significance of experiments
In this experiment, we reproduce the statistical analysis experiment conducted in the [LEAF paper](https://arxiv.org/abs/1812.01097). Specifically, we investigate the effect of varying the minimum number of
training samples per user on model accuracy when training with the `FedAvg` algorithm,
using the LEAF framework.

-- Statistical test - vary number of users to see effect on accuracy/performance
For this example, we shall use the Sentiment140 dataset (containing 1.6 million tweets),
and we shall train a 2-layer LSTM model with cross-entropy loss and pre-trained GloVe embeddings.

# Experiment Setup and Execution

## Quickstart script

For ease of use, we provide a script that runs the experiment for several
min-sample counts; it may be executed as:

```bash
leaf/ $> ./sent140.sh <result-output-dir>
```

This script executes the instructions below for min-sample counts of 3, 10, 30, and 100, reproducibly generating the data partitions and results observed by the authors during analysis.

## Pre-requisites

-- **Dataset**: Generate data with `k` samples per user, run (`-t sample` required for running baseline impl).
Since this experiment requires pre-trained word embeddings, we recommend running the
`models/sent140/get_embs.sh` script, which fetches 300-dimensional pre-trained GloVe vectors:
```bash
leaf/models/sent140/ $> ./get_embs.sh
```
After extraction, the embeddings are stored in `models/sent140/embs.json`.

## Dataset fetching and pre-processing

LEAF contains powerful scripts for fetching data and converting it into a JSON format for easy utilization.
These scripts can also subsample the dataset and split it into training and testing sets.

For our experiment, as a first step, we shall use 50% of the dataset in an 80-20 train/test split,
and we shall discard all users with fewer than 3 tweets (the quickstart script varies this threshold via the `-k` flag). The following command shows
how this can be accomplished (the `--spltseed` flag enables reproducible generation of the dataset):

```bash
leaf/data/sent140/ $> ./preprocess.sh --sf 0.5 -t sample --tf 0.8 -k 3 --spltseed 1549775860
```

-- **GloVe Embeddings**: Setup glove embeddings as a json file (required even for BoW logistic regression since defines vocab dict) - VOCAB_DIR variable in bag_log_reg, sent140/get_embs.sh fetches and extracts embeddings to correct location
After running this script, the `data/sent140/data` directory should contain `train/` and `test/` directories.

## Model Execution

-- Run (2 clients for 10 rounds, converges to accuracy: 0.609676, 10th percentile: 0.25, 90th percentile 1 for k=100). Uses FedAvg for distributed learning (since --minibatch isn’t specified)
Now that we have our data, we can execute our model! For this experiment, the model file is stored
at `models/sent140/stacked_lstm.py`. In order to train this model using `FedAvg` with 2 clients per round for 10 rounds,
we execute the following command:

```bash
leaf/models $> python3 main.py -dataset sent140 -model stacked_lstm -lr 0.0003 --clients-per-round 2 --num-rounds 10
```

Alternatively, passing `-t small` in place of the latter two flags provides the same functionality (as defined in the `models/baseline_constants.py` file).
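
Under that equivalence, the run above could thus also be launched as follows (a sketch; we assume `-t small` maps to the same clients-per-round and round-count values via `models/baseline_constants.py`):

```bash
leaf/models $> python3 main.py -dataset sent140 -model stacked_lstm -lr 0.0003 -t small
```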

## Metrics Collection

Executing the above command will write out statistical and systems metrics to `leaf/models/metrics/stat_metrics.csv` and `leaf/models/metrics/sys_metrics.csv`, respectively. Since these files are overwritten on every run, we __highly recommend__ storing the generated metrics files at a different location.

To experiment with a different min-sample setting, re-run the preprocessing script with a different `-k` flag. The plots shown below can be generated using the `plots.py` script in the repository root.
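
For instance, a partition that keeps only users with at least 30 tweets (one of the settings covered by the quickstart script) could be generated as shown below. Note that, depending on how the preprocessing scripts cache intermediate results, the previously generated contents of `data/sent140/data` may need to be removed first:

```bash
leaf/data/sent140/ $> ./preprocess.sh --sf 0.5 -t sample --tf 0.8 -k 30 --spltseed 1549775860
```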

# Results and Analysis

Upon performing this experiment, we see that, while median performance degrades only slightly even for data-deficient users (i.e., k = 3), the 25th percentile (the bottom of the box in the box plot below) degrades dramatically.

<div style="text-align:center" markdown="1">

-- Metrics written out to metrics/stat_metrics.csv and metrics/sys_metrics.csv (configurable via main.py:L20,21)
![](../_static/images/leaf_rep_sent140.png "Sentiment140 Results")

-### Quickstart script
</div>

-In the root of the LEAF directory, execute `./sent140.sh`
# More Information

More information about the framework, challenges and experiments can be found in the [LEAF paper](https://arxiv.org/abs/1812.01097).
71 changes: 71 additions & 0 deletions docs/source/tutorials/shakespeare-md.md
@@ -0,0 +1,71 @@
# Convergence Behaviour of FederatedAveraging

In this experiment, we reproduce the convergence analysis experiment conducted in the [LEAF paper](https://arxiv.org/abs/1812.01097). Specifically, we investigate the convergence behaviour of the `FedAvg` algorithm
as the number of local epochs is varied, using the LEAF framework.

For this example, we shall use the Shakespeare dataset to perform a character-prediction task;
that is, given a line spoken by a character in a play, we shall try to predict the next character (letter) in the text.
For this experiment, we shall train a 2-layer LSTM model with cross-entropy loss, using randomly initialized embeddings.

# Experiment Setup and Execution

## Quickstart script

For ease of use, we provide a script that runs the experiment for different
numbers of local epochs; it may be executed as:

```bash
leaf/ $> ./shakespeare.sh <result-output-dir>
```

This script executes the instructions below for local-epoch values of 1 and 20,
reproducibly generating the data partitions and results observed by the authors during analysis.

## Dataset fetching and pre-processing

LEAF contains powerful scripts for fetching data and converting it into a JSON format for easy utilization.
These scripts can also subsample the dataset and split it into training and testing sets.

For our experiment, as a first step, we shall use 5% of the dataset in a 90-10 train/test split,
and we shall discard all users with fewer than 64 samples. The following command shows
how this can be accomplished (the `--smplseed` and `--spltseed` flags enable reproducible generation of the dataset):

```bash
leaf/data/shakespeare/ $> ./preprocess.sh --sf 0.05 -t sample --tf 0.9 -k 64 --smplseed 1550262838 --spltseed 1550262839
```

After running this script, the `data/shakespeare/data` directory should contain `train/` and `test/` directories.

## Model Execution

Now that we have our data, we can execute our model! For this experiment, the model file is stored
at `models/shakespeare/stacked_lstm.py`. In order to train this model using `FedAvg` with 10 clients per round
and 1 local epoch per round, we execute the following command:

```bash
leaf/models $> python main.py -dataset shakespeare -model stacked_lstm --num-rounds 81 --clients-per-round 10 --num_epochs 1 -lr 0.8
```

## Metrics Collection

Executing the above command will write out statistical and systems metrics to `leaf/models/metrics/stat_metrics.csv` and `leaf/models/metrics/sys_metrics.csv`, respectively. Since these files are overwritten on every run, we __highly recommend__ storing the generated metrics files at a different location.

To experiment with a different number of local epochs, re-run the main model script with a different `--num_epochs` flag. The plots shown below can be generated using the `plots.py` script in the repository root.
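
For instance, the 20-local-epoch setting covered by the quickstart script corresponds to a command along the following lines (a sketch; whether the remaining hyperparameters, such as the learning rate, are held fixed across settings is a choice we make here for illustration):

```bash
leaf/models $> python main.py -dataset shakespeare -model stacked_lstm --num-rounds 81 --clients-per-round 10 --num_epochs 20 -lr 0.8
```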

# Results and Analysis

From the generated data, we compute the aggregate weighted training loss and training accuracy
(weighting by client sample counts is important to correct for the imbalance in the number of samples
across clients). We also note that `FedAvg` diverges as the number of local epochs increases.

<div style="text-align:center" markdown="1">

![](../_static/images/shake_small_weighted_test_acc.png "Weighted Test Accuracy on Shakespeare dataset using FedAvg")
![](../_static/images/shake_small_weighted_training_loss.png "Weighted Training Loss on Shakespeare dataset using FedAvg")

</div>

# More Information

More information about the framework, challenges and experiments can be found in the [LEAF paper](https://arxiv.org/abs/1812.01097).
