forked from TalwalkarLab/leaf
Showing 13 changed files with 259 additions and 20 deletions.
```
@@ -11,6 +11,5 @@ models/checkpoints/
models/metrics/*.csv
models/sent140/embs.json
_build/
_static/
_templates/
autodoc/
```
```
@@ -0,0 +1,3 @@
.wy-side-nav-search, .wy-nav-top {
  background: #0b750a;
}
```
This file was deleted.
```rst
Contact
=======

::

    Sebastian Caldas
    PhD student, Machine Learning Department
    Carnegie Mellon University
    Email: [first-letter-first-name][last-name]@cmu.edu

    Virginia Smith
    Assistant Professor, Electrical and Computer Engineering Department
    Carnegie Mellon University
    Email: [last-name][first-letter-first-name]@cmu.edu
```
# Systems Resource Requirement Analysis

In this experiment, we reproduce the systems analysis experiment conducted in the [LEAF paper](https://arxiv.org/abs/1812.01097).

Specifically, we identify the systems budget (in terms of compute, measured in number of FLOPs, and network bandwidth) required when training with minibatch SGD vs. `FedAvg`, using the LEAF framework.

For this example, we use the FEMNIST dataset to perform an image classification task with a 2-layer convolutional neural network.

# Experiment Setup and Execution

For this experiment, we describe how to use the LEAF framework to execute minibatch SGD for 3 clients per round with a 10% batch size.
## Quickstart script

For ease of use, we provide a script that executes the experiment under different settings for both minibatch SGD and `FedAvg`. It may be run as:

```bash
leaf/ $> ./femnist.sh <result-output-dir>
```

This script executes the instructions below for different configurations of clients per round, batch size, and local epochs, reproducibly generating the data partitions and results observed by the authors during analysis.
## Dataset fetching and pre-processing

LEAF contains powerful scripts for fetching data and converting it into JSON format for easy use. These scripts can also subsample the dataset and split it into training and testing sets.

For our experiment, as a first step, we use 5% of the dataset in an 80-20 train/test split. The following command accomplishes this (the `--smplseed` and `--spltseed` flags enable reproducible generation of the dataset):

```bash
leaf/data/femnist/ $> ./preprocess.sh -s iid --sf 0.05 -k 0 -t sample --smplseed 1549786595 --spltseed 1549786796
```

After running this script, the `data/femnist/data` directory should contain `train/` and `test/` directories.
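After preprocessing, it can help to sanity-check the generated partition. The snippet below is a minimal sketch that assumes each JSON shard follows LEAF's layout of `users`, `num_samples`, and `user_data` keys; verify this against a generated file before relying on it.

```python
import json
from pathlib import Path

def summarize_split(split_dir):
    """Tally users and samples across all JSON shards in a split.

    Assumes each shard follows the LEAF layout:
    {"users": [...], "num_samples": [...], "user_data": {...}}
    """
    users, samples = 0, 0
    for shard in Path(split_dir).glob("*.json"):
        with open(shard) as f:
            data = json.load(f)
        users += len(data["users"])
        samples += sum(data["num_samples"])
    return users, samples

print(summarize_split("data/femnist/data/train"))
```

With a 5% sample of FEMNIST, the exact counts depend on the sampling seeds, so treat the output as a consistency check rather than a fixed number.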
## Model Execution

Now that we have our data, we can train our model. For this experiment, the model file is stored at `models/femnist/cnn.py`. To train this model using minibatch SGD with 3 clients per round and a 10% batch size, we execute the following command:

```bash
leaf/models $> python main.py -dataset femnist -model cnn -lr 0.06 --minibatch 0.1 --clients-per-round 3 --num-rounds 2000
```
## Metrics Collection

Executing the above command writes system and statistical metrics to `leaf/models/metrics/stat_metrics.csv` and `leaf/models/metrics/sys_metrics.csv`. Since these files are overwritten on every run, we __highly recommend__ storing the generated metrics files at a different location.

To experiment with a different configuration, re-run the main model script with different flags. The plots shown below can be generated using the `plots.py` file in the repo root.
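The systems budget itself can be tallied from `sys_metrics.csv`. The sketch below assumes the metrics writer emits per-client columns named `local_computations`, `bytes_read`, and `bytes_written`; check the header of your generated file and adjust the names if they differ.

```python
import csv

def systems_budget(sys_metrics_path):
    """Sum FLOPs and network traffic over every client/round row.

    The column names used here are an assumption about the LEAF
    metrics CSV; verify them against the file's header.
    """
    flops, traffic = 0, 0
    with open(sys_metrics_path, newline="") as f:
        for row in csv.DictReader(f):
            flops += int(row["local_computations"])
            traffic += int(row["bytes_read"]) + int(row["bytes_written"])
    return flops, traffic
```

Comparing these totals at a fixed accuracy threshold across minibatch SGD and `FedAvg` runs yields the communication vs. local computation trade-off analyzed in this experiment.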
# Results and Analysis

For an accuracy threshold of 75%, we see an improved systems profile for `FedAvg` when it comes to the communication vs. local computation trade-off. We note, however, that methods may in general vary across these two dimensions, and it is thus important to consider both aspects depending on the problem at hand.

<div style="text-align:center" markdown="1">

![](../_static/images/femnist_75_thresh.png "Systems profile of different methods (75% accuracy threshold)")

</div>

# More Information

More information about the framework, challenges, and experiments can be found in the [LEAF paper](https://arxiv.org/abs/1812.01097).
# Twitter Sentiment Analysis

In this experiment, we reproduce the statistical analysis experiment conducted in the [LEAF paper](https://arxiv.org/abs/1812.01097). Specifically, we investigate the effect of varying the minimum number of samples per user (for training) on model accuracy when training with the `FedAvg` algorithm, using the LEAF framework.

For this example, we use the Sentiment140 dataset (containing 1.6 million tweets) and train a 2-layer LSTM model with cross-entropy loss, using pre-trained GloVe embeddings.
# Experiment Setup and Execution

## Quickstart script

For ease of use, we provide a script that executes the experiment for different min-sample counts. It may be run as:

```bash
leaf/ $> ./sent140.sh <result-output-dir>
```

This script executes the instructions below for min-sample counts of 3, 10, 30, and 100, reproducibly generating the data partitions and results observed by the authors during analysis.
## Pre-requisites

Since this experiment requires pre-trained word embeddings, we recommend running the `models/sent140/get_embs.sh` script, which fetches 300-dimensional pretrained GloVe vectors:

```bash
leaf/models/sent140/ $> ./get_embs.sh
```

After extraction, the embeddings are stored in `models/sent140/embs.json`. Note that this file is required even for the bag-of-words logistic regression baseline, since it defines the vocabulary dictionary (see the `VOCAB_DIR` variable in `bag_log_reg`).
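Once fetched, the embeddings can be loaded for inspection or for building a vocabulary. The sketch below assumes `embs.json` stores a word list and an aligned embedding matrix under the keys `vocab` and `emba`; these key names are an assumption, so inspect the generated file if yours differ.

```python
import json

def load_embeddings(path="models/sent140/embs.json"):
    """Load the GloVe vocabulary and vectors from embs.json.

    Key names ("vocab", "emba") are assumptions about the file
    produced by get_embs.sh; verify them against the actual JSON.
    """
    with open(path) as f:
        embs = json.load(f)
    word_to_idx = {w: i for i, w in enumerate(embs["vocab"])}
    vectors = embs["emba"]  # row i is the vector for vocab word i
    return word_to_idx, vectors
```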
## Dataset fetching and pre-processing

LEAF contains powerful scripts for fetching data and converting it into JSON format for easy use. These scripts can also subsample the dataset and split it into training and testing sets.

For our experiment, as a first step, we use 50% of the dataset in an 80-20 train/test split, and we discard all users with fewer than 3 tweets. The following command accomplishes this (the `--spltseed` flag enables reproducible generation of the dataset):

```bash
leaf/data/sent140/ $> ./preprocess.sh --sf 0.5 -t sample --tf 0.8 -k 3 --spltseed 1549775860
```

After running this script, the `data/sent140/data` directory should contain `train/` and `test/` directories.
## Model Execution

Now that we have our data, we can train our model. For this experiment, the model file is stored at `models/sent140/stacked_lstm.py`. To train this model using `FedAvg` with 2 clients per round for 10 rounds, we execute the following command:

```bash
leaf/models $> python3 main.py -dataset sent140 -model stacked_lstm -lr 0.0003 --clients-per-round 2 --num-rounds 10
```

Since `--minibatch` is not specified, training uses `FedAvg`. For k = 100, this run converges to an accuracy of 0.609676 (10th percentile: 0.25, 90th percentile: 1).

Alternatively, passing `-t small` in place of the latter two flags provides the same functionality (as defined in the `models/baseline_constants.py` file).
## Metrics Collection

Executing the above command writes system and statistical metrics to `leaf/models/metrics/stat_metrics.csv` and `leaf/models/metrics/sys_metrics.csv` (the output paths are configurable in `main.py`). Since these files are overwritten on every run, we __highly recommend__ storing the generated metrics files at a different location.

To experiment with a different min-sample setting, re-run the preprocessing script with a different `-k` flag. The plots shown below can be generated using the `plots.py` file in the repo root.
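The per-user accuracy percentiles discussed in this experiment can be recovered from `stat_metrics.csv`. This sketch assumes the CSV carries `set`, `round_number`, and `accuracy` columns per client; those names are an assumption, so check the header of your file.

```python
import csv
from statistics import median

def accuracy_percentiles(stat_metrics_path, final_round):
    """10th percentile, median, and 90th percentile of per-user test
    accuracy at a given round. Column names are assumptions about
    the LEAF metrics CSV; verify against the header."""
    accs = []
    with open(stat_metrics_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["set"] == "test" and int(row["round_number"]) == final_round:
                accs.append(float(row["accuracy"]))
    accs.sort()
    pick = lambda p: accs[min(len(accs) - 1, int(p / 100 * len(accs)))]
    return pick(10), median(accs), pick(90)
```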
# Results and Analysis

Upon performing this experiment, we see that while median performance degrades only slightly for data-deficient users (i.e., k = 3), the 25th percentile (bottom of the box) degrades dramatically.

<div style="text-align:center" markdown="1">

![](../_static/images/leaf_rep_sent140.png "Sentiment140 Results")

</div>

# More Information

More information about the framework, challenges, and experiments can be found in the [LEAF paper](https://arxiv.org/abs/1812.01097).
# Convergence Behaviour of FederatedAveraging

In this experiment, we reproduce the convergence analysis experiment conducted in the [LEAF paper](https://arxiv.org/abs/1812.01097). Specifically, we investigate the convergence behaviour of the `FedAvg` algorithm when varying the number of local epochs, using the LEAF framework.

For this example, we use the Shakespeare dataset to perform a next-character prediction task; that is, given a line spoken by a character in one of the plays, we try to predict the next character (letter) in the text. For this experiment, we train a 2-layer LSTM model with cross-entropy loss, using randomly initialized embeddings.
# Experiment Setup and Execution

## Quickstart script

For ease of use, we provide a script that executes the experiment for different numbers of local epochs. It may be run as:

```bash
leaf/ $> ./shakespeare.sh <result-output-dir>
```

This script executes the instructions below for local epoch values of 1 and 20, reproducibly generating the data partitions and results observed by the authors during analysis.
## Dataset fetching and pre-processing

LEAF contains powerful scripts for fetching data and converting it into JSON format for easy use. These scripts can also subsample the dataset and split it into training and testing sets.

For our experiment, as a first step, we use 5% of the dataset in a 90-10 train/test split, and we discard all users with fewer than 64 samples. The following command accomplishes this (the `--smplseed` and `--spltseed` flags enable reproducible generation of the dataset):

```bash
leaf/data/shakespeare/ $> ./preprocess.sh --sf 0.05 -t sample --tf 0.9 -k 64 --smplseed 1550262838 --spltseed 1550262839
```

After running this script, the `data/shakespeare/data` directory should contain `train/` and `test/` directories.
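To sanity-check the `-k 64` filter, we can count each user's combined train and test samples (the per-split counts alone will be smaller, since the 90-10 split divides each user's data). This is a minimal sketch assuming LEAF's JSON shard layout, with aligned `users` and `num_samples` lists:

```python
import json
from collections import Counter
from pathlib import Path

def samples_per_user(data_dir):
    """Combined train+test sample count per user.

    Assumes each shard follows LEAF's layout, with aligned
    "users" and "num_samples" lists.
    """
    totals = Counter()
    for split in ("train", "test"):
        for shard in Path(data_dir, split).glob("*.json"):
            with open(shard) as f:
                d = json.load(f)
            for user, n in zip(d["users"], d["num_samples"]):
                totals[user] += n
    return totals

# After preprocessing, every remaining user should hold at least 64 samples:
# assert min(samples_per_user("data/shakespeare/data").values()) >= 64
```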
## Model Execution

Now that we have our data, we can train our model. For this experiment, the model file is stored at `models/shakespeare/stacked_lstm.py`. To train this model using `FedAvg` with 10 clients per round and 1 local epoch per round, we execute the following command:

```bash
leaf/models $> python main.py -dataset shakespeare -model stacked_lstm --num-rounds 81 --clients-per-round 10 --num_epochs 1 -lr 0.8
```
## Metrics Collection

Executing the above command writes system and statistical metrics to `leaf/models/metrics/stat_metrics.csv` and `leaf/models/metrics/sys_metrics.csv`. Since these files are overwritten on every run, we __highly recommend__ storing the generated metrics files at a different location.

To experiment with a different number of local epochs, re-run the main model script with a different `--num_epochs` flag. The plots shown below can be generated using the `plots.py` file in the repo root.
# Results and Analysis

From the generated data, we computed the aggregate weighted training loss and training accuracy (note that weighting is important to correct for imbalance in the amount of data across clients). We also note that `FedAvg` diverges as the number of local epochs increases.
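The weighted aggregation used here amounts to averaging per-client metrics with each client weighted by its sample count. A minimal sketch:

```python
def weighted_metric(values, num_samples):
    """Average per-client metric values, weighting each client by its
    number of samples so that large and small clients contribute
    proportionally to the aggregate."""
    total = sum(num_samples)
    return sum(v * n for v, n in zip(values, num_samples)) / total

# Two clients: 90% accuracy on 10 samples, 50% accuracy on 90 samples.
# An unweighted mean would report 0.70; the weighted aggregate is 0.54.
print(weighted_metric([0.9, 0.5], [10, 90]))
```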
<div style="text-align:center" markdown="1">

![](../_static/images/shake_small_weighted_test_acc.png "Weighted Test Accuracy on Shakespeare dataset using FedAvg")
![](../_static/images/shake_small_weighted_training_loss.png "Weighted Training Loss on Shakespeare dataset using FedAvg")

</div>

# More Information

More information about the framework, challenges, and experiments can be found in the [LEAF paper](https://arxiv.org/abs/1812.01097).