Added writeups, contact, links
gokart23 committed Mar 30, 2019
1 parent d353350 commit 27c9e03
Showing 13 changed files with 259 additions and 20 deletions.
1 change: 0 additions & 1 deletion .gitignore
@@ -11,6 +11,5 @@ models/checkpoints/
 models/metrics/*.csv
 models/sent140/embs.json
 _build/
-_static/
 _templates/
 autodoc/
3 changes: 3 additions & 0 deletions docs/source/_static/customheader.css
@@ -0,0 +1,3 @@
.wy-side-nav-search, .wy-nav-top {
    background: #0b750a;
}
Binary file added docs/source/_static/images/femnist_75_thresh.png
Binary file added docs/source/_static/images/leaf_rep_sent140.png
Binary file added docs/source/_static/images/shake_small_weighted_test_acc.png
Binary file added docs/source/_static/images/shake_small_weighted_training_loss.png
9 changes: 0 additions & 9 deletions docs/source/authors.rst

This file was deleted.

14 changes: 14 additions & 0 deletions docs/source/contact.rst
@@ -0,0 +1,14 @@
Contact
=======

::

Sebastian Caldas
PhD student, Machine Learning Department
Carnegie Mellon University
Email: [first-letter-first-name][last-name]@cmu.edu

Virginia Smith
Assistant Professor, Electrical and Computer Engineering Department
Carnegie Mellon University
Email: [last-name][first-letter-first-name]@cmu.edu
4 changes: 2 additions & 2 deletions docs/source/index.rst
@@ -38,7 +38,7 @@ See the source code repository on GitHub: https://github.com/TalwalkarLab/leaf
:hidden:
:caption: Additional Information

-   authors
+   contact
citations

`Getting started <install/get_leaf.html>`_
@@ -57,7 +57,7 @@ A set of examples illustrating the use of different models and datasets with LEAF
The exact API of all functions and classes, as given in the
docstring.

-`Authors <authors.html>`_
+`Contact <contact.html>`_
-------------------------

Contact information of LEAF authors
72 changes: 72 additions & 0 deletions docs/source/tutorials/femnist-md.md
@@ -0,0 +1,72 @@
# Systems Resource Requirement Analysis

In this experiment, we reproduce the systems analysis experiment conducted in the [LEAF paper](https://arxiv.org/abs/1812.01097).

Specifically, we identify the systems budget (in terms of compute, measured in FLOPs, and network bandwidth)
required when training with minibatch SGD vs. `FedAvg`, using the LEAF framework.

For this example, we shall use the FEMNIST dataset to perform an image classification task using a
2-layer convolutional neural network.

# Experiment Setup and Execution

For this experiment, we describe how to use the LEAF framework to execute minibatch SGD with 3 clients
per round and a 10% batch size.

## Quickstart script

For ease of use, we provide a script that runs the experiment under several settings
of minibatch SGD and `FedAvg`; it may be executed as:

```bash
leaf/ $> ./femnist.sh <result-output-dir>
```

This script executes the instructions below using both minibatch SGD and `FedAvg` for different configurations of clients per round, batch size, and local epochs, reproducibly generating the data partitions and results observed by the authors during analysis.

## Dataset fetching and pre-processing

LEAF contains powerful scripts for fetching data and converting it into a JSON format for easy utilization.
These scripts can also subsample the dataset and split it into training and testing sets.

For our experiment, as a first step, we shall use 5% of the dataset in an 80-20 train/test split. The following command shows
how this can be accomplished (the `--smplseed` and `--spltseed` flags enable reproducible generation of the dataset):

```bash
leaf/data/femnist/ $> ./preprocess.sh -s iid --sf 0.05 -k 0 -t sample --smplseed 1549786595 --spltseed 1549786796
```

After running this script, the `data/femnist/data` directory should contain `train/` and `test/` directories.
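
Each split is stored as JSON. As a quick sanity check, the generated files can be inspected from the shell; the snippet below is a minimal sketch that assumes the standard LEAF layout, in which each JSON file carries `users`, `num_samples`, and `user_data` keys (the shard file names themselves are generated by the script):

```bash
# Peek at the keys and user count of the first generated training shard
leaf/data/femnist/ $> python -c "import json, glob; d = json.load(open(sorted(glob.glob('data/train/*.json'))[0])); print(list(d.keys()), 'users:', len(d['users']))"
```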

## Model Execution

Now that we have our data, we can execute our model! For this experiment, the model file is stored
at `models/femnist/cnn.py`. In order to train this model using minibatch SGD with 3 clients per round
and a 10% batch size, we execute the following command:

```bash
leaf/models $> python main.py -dataset femnist -model cnn -lr 0.06 --minibatch 0.1 --clients-per-round 3 --num-rounds 2000
```
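
Note that `main.py` runs `FedAvg` whenever `--minibatch` is not specified. A `FedAvg` counterpart of the run above would therefore look roughly as follows (a sketch; the round and local-epoch counts here are illustrative choices, not necessarily the configurations baked into `femnist.sh`):

```bash
leaf/models $> python main.py -dataset femnist -model cnn -lr 0.06 --clients-per-round 3 --num-rounds 2000 --num_epochs 1
```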

## Metrics Collection

Executing the above command will write out statistical and systems metrics to `leaf/models/metrics/stat_metrics.csv` and `leaf/models/metrics/sys_metrics.csv`, respectively. Since these files are overwritten on every run, we __highly recommend__ storing the generated metrics files at a different location.
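
For example, a minimal way to snapshot the metrics between runs (the destination directory name here is just an illustrative choice):

```bash
# Copy the freshly written metrics out of the way before the next run
leaf/models $> mkdir -p ../results/femnist_minibatch_c3
leaf/models $> cp metrics/stat_metrics.csv metrics/sys_metrics.csv ../results/femnist_minibatch_c3/
```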

To experiment with a different configuration, re-run the main model script with different flags. The plots shown below can be generated using the `plots.py` script in the repository root.

# Results and Analysis

For an accuracy threshold of 75%, `FedAvg` shows a better systems profile than minibatch SGD in terms of the communication
vs. local-computation trade-off, though we note that, in general, methods may vary along both of these
dimensions, and it is thus important to consider both aspects depending on the problem at hand.

<div style="text-align:center" markdown="1">

![](../_static/images/femnist_75_thresh.png "Systems profile of different methods (75% accuracy threshold)")

</div>

# More Information

More information about the framework, challenges and experiments can be found in the [LEAF paper](https://arxiv.org/abs/1812.01097).
33 changes: 33 additions & 0 deletions docs/source/tutorials/index.rst
@@ -3,7 +3,40 @@ General Examples

General-purpose and introductory examples to the LEAF framework.

`Twitter Sentiment Analysis <sent140-md.html>`_
-----------------------------------------------

Investigate the effect of varying the minimum number of training samples per user
on model accuracy when training with the `FedAvg` algorithm.

.. toctree::
:maxdepth: 1
:hidden:

sent140-md

`Convergence Properties of FedAvg <shakespeare-md.html>`_
---------------------------------------------------------

Investigate the convergence behaviour of the `FedAvg` algorithm as the number of
local epochs is varied.

.. toctree::
:maxdepth: 1
:hidden:

shakespeare-md

`Systems Resource Requirement Analysis <femnist-md.html>`_
-----------------------------------------------------------

Investigate the systems budget (in terms of compute, measured in FLOPs, and network bandwidth)
required to achieve an accuracy of 75% on the FEMNIST dataset
using both minibatch SGD and `FedAvg`.

.. toctree::
:maxdepth: 1
:hidden:

femnist-md

72 changes: 64 additions & 8 deletions docs/source/tutorials/sent140-md.md
@@ -1,22 +1,78 @@
# Twitter Sentiment Analysis

-## Significance of experiments
In this experiment, we reproduce the statistical analysis experiment conducted in the [LEAF paper](https://arxiv.org/abs/1812.01097). Specifically, we investigate the effect of varying the minimum number of
training samples per user on model accuracy when training with the `FedAvg` algorithm,
using the LEAF framework.

-- Statistical test - vary number of users to see effect on accuracy/performance
For this example, we shall use the Sentiment140 dataset (containing 1.6 million tweets),
and we shall train a 2-layer LSTM model with cross-entropy loss and pre-trained GloVe embeddings.

# Experiment Setup and Execution

## Quickstart script

For ease of use, we provide a script that runs the experiment for several
min-sample counts; it may be executed as:

```bash
leaf/ $> ./sent140.sh <result-output-dir>
```

This script executes the instructions below for min-sample counts of 3, 10, 30, and 100, reproducibly generating the data partitions and results observed by the authors during analysis.

## Pre-requisites

-- **Dataset**: Generate data with `k` samples per user, run (`-t sample` required for running baseline impl).
Since this experiment requires pre-trained word embeddings, we recommend running the
`models/sent140/get_embs.sh` script, which fetches 300-dimensional pre-trained GloVe vectors:
```bash
leaf/models/sent140/ $> ./get_embs.sh
```
After extraction, the embeddings are stored in `models/sent140/embs.json`.

## Dataset fetching and pre-processing

LEAF contains powerful scripts for fetching data and converting it into a JSON format for easy utilization.
These scripts can also subsample the dataset and split it into training and testing sets.

For our experiment, as a first step, we shall use 50% of the dataset in an 80-20 train/test split,
and we shall discard all users with fewer than 3 tweets (the quickstart script varies this threshold via the `-k` flag). The following command shows
how this can be accomplished (the `--spltseed` flag enables reproducible generation of the dataset):

```bash
leaf/data/sent140/ $> ./preprocess.sh --sf 0.5 -t sample --tf 0.8 -k 3 --spltseed 1549775860
```

-- **GloVe Embeddings**: Setup glove embeddings as a json file (required even for BoW logistic regression since defines vocab dict) - VOCAB_DIR variable in bag_log_reg, sent140/get_embs.sh fetches and extracts embeddings to correct location
After running this script, the `data/sent140/data` directory should contain `train/` and `test/` directories.

## Model Execution

-- Run (2 clients for 10 rounds, converges to accuracy: 0.609676, 10th percentile: 0.25, 90th percentile 1 for k=100). Uses FedAvg for distributed learning (since --minibatch isn’t specified)
Now that we have our data, we can execute our model! For this experiment, the model file is stored
at `models/sent140/stacked_lstm.py`. In order to train this model using `FedAvg` with 2 clients per round for 10 rounds,
we execute the following command:

```bash
leaf/models $> python3 main.py -dataset sent140 -model stacked_lstm -lr 0.0003 --clients-per-round 2 --num-rounds 10
```

Alternatively, passing `-t small` in place of the latter two flags provides the same functionality (as defined in the `models/baseline_constants.py` file).
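
Under that equivalence, the run above could thus also be launched as follows (a sketch; we assume `-t small` maps to the same clients-per-round and round-count values via `models/baseline_constants.py`):

```bash
leaf/models $> python3 main.py -dataset sent140 -model stacked_lstm -lr 0.0003 -t small
```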

## Metrics Collection

Executing the above command will write out statistical and systems metrics to `leaf/models/metrics/stat_metrics.csv` and `leaf/models/metrics/sys_metrics.csv`, respectively. Since these files are overwritten on every run, we __highly recommend__ storing the generated metrics files at a different location.

To experiment with a different min-sample setting, re-run the preprocessing script with a different `-k` flag. The plots shown below can be generated using the `plots.py` script in the repository root.
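
For instance, a partition that keeps only users with at least 30 tweets (one of the settings covered by the quickstart script) could be generated as shown below. Note that, depending on how the preprocessing scripts cache intermediate results, the previously generated contents of `data/sent140/data` may need to be removed first:

```bash
leaf/data/sent140/ $> ./preprocess.sh --sf 0.5 -t sample --tf 0.8 -k 30 --spltseed 1549775860
```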

# Results and Analysis

Upon performing this experiment, we see that, while median performance degrades only slightly even for data-deficient users (i.e., k = 3), the 25th percentile (the bottom of the box in the box plot below) degrades dramatically.

<div style="text-align:center" markdown="1">

-- Metrics written out to metrics/stat_metrics.csv and metrics/sys_metrics.csv (configurable via main.py:L20,21)
![](../_static/images/leaf_rep_sent140.png "Sentiment140 Results")

-### Quickstart script
</div>

-In the root of the LEAF directory, execute `./sent140.sh`
# More Information

More information about the framework, challenges and experiments can be found in the [LEAF paper](https://arxiv.org/abs/1812.01097).
71 changes: 71 additions & 0 deletions docs/source/tutorials/shakespeare-md.md
@@ -0,0 +1,71 @@
# Convergence Behaviour of FederatedAveraging

In this experiment, we reproduce the convergence analysis experiment conducted in the [LEAF paper](https://arxiv.org/abs/1812.01097). Specifically, we investigate the convergence behaviour of the `FedAvg` algorithm
as the number of local epochs is varied, using the LEAF framework.

For this example, we shall use the Shakespeare dataset to perform a character-prediction task;
that is, given a line spoken by a character in a play, we shall try to predict the next character (letter) in the text.
For this experiment, we shall train a 2-layer LSTM model with cross-entropy loss, using randomly initialized embeddings.

# Experiment Setup and Execution

## Quickstart script

For ease of use, we provide a script that runs the experiment for different
numbers of local epochs; it may be executed as:

```bash
leaf/ $> ./shakespeare.sh <result-output-dir>
```

This script executes the instructions below for local-epoch values of 1 and 20,
reproducibly generating the data partitions and results observed by the authors during analysis.

## Dataset fetching and pre-processing

LEAF contains powerful scripts for fetching data and converting it into a JSON format for easy utilization.
These scripts can also subsample the dataset and split it into training and testing sets.

For our experiment, as a first step, we shall use 5% of the dataset in a 90-10 train/test split,
and we shall discard all users with fewer than 64 samples. The following command shows
how this can be accomplished (the `--smplseed` and `--spltseed` flags enable reproducible generation of the dataset):

```bash
leaf/data/shakespeare/ $> ./preprocess.sh --sf 0.05 -t sample --tf 0.9 -k 64 --smplseed 1550262838 --spltseed 1550262839
```

After running this script, the `data/shakespeare/data` directory should contain `train/` and `test/` directories.

## Model Execution

Now that we have our data, we can execute our model! For this experiment, the model file is stored
at `models/shakespeare/stacked_lstm.py`. In order to train this model using `FedAvg` with 10 clients per round
and 1 local epoch per round, we execute the following command:

```bash
leaf/models $> python main.py -dataset shakespeare -model stacked_lstm --num-rounds 81 --clients-per-round 10 --num_epochs 1 -lr 0.8
```

## Metrics Collection

Executing the above command will write out statistical and systems metrics to `leaf/models/metrics/stat_metrics.csv` and `leaf/models/metrics/sys_metrics.csv`, respectively. Since these files are overwritten on every run, we __highly recommend__ storing the generated metrics files at a different location.

To experiment with a different number of local epochs, re-run the main model script with a different `--num_epochs` flag. The plots shown below can be generated using the `plots.py` script in the repository root.
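
For instance, the 20-local-epoch setting covered by the quickstart script corresponds to a command along the following lines (a sketch; whether the remaining hyperparameters, such as the learning rate, are held fixed across settings is a choice we make here for illustration):

```bash
leaf/models $> python main.py -dataset shakespeare -model stacked_lstm --num-rounds 81 --clients-per-round 10 --num_epochs 20 -lr 0.8
```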

# Results and Analysis

From the generated data, we compute the aggregate weighted training loss and training accuracy
(weighting by client sample counts is important to correct for the imbalance in the number of samples
across clients). We also note that `FedAvg` diverges as the number of local epochs increases.

<div style="text-align:center" markdown="1">

![](../_static/images/shake_small_weighted_test_acc.png "Weighted Test Accuracy on Shakespeare dataset using FedAvg")
![](../_static/images/shake_small_weighted_training_loss.png "Weighted Training Loss on Shakespeare dataset using FedAvg")

</div>

# More Information

More information about the framework, challenges and experiments can be found in the [LEAF paper](https://arxiv.org/abs/1812.01097).
