fix(docs): Fix typos in baseline/* #3043

Merged 5 commits on Mar 1, 2024
2 changes: 1 addition & 1 deletion baselines/depthfl/README.md
@@ -124,7 +124,7 @@ python -m depthfl.main --multirun exclusive_learning=true model_size=1,2,3,4

**Table 2**

-100% (a), 75%(b), 50%(c), 25% (d) cases are exclusive learning scenario. 100% (a) exclusive learning means, the global model and every local model are equal to the smallest local model, and 100% clients participate in learning. Likewise, 25% (d) exclusive learning means, the global model and every local model are equal to the larget local model, and only 25% clients participate in learning.
+100% (a), 75%(b), 50%(c), 25% (d) cases are exclusive learning scenario. 100% (a) exclusive learning means, the global model and every local model are equal to the smallest local model, and 100% clients participate in learning. Likewise, 25% (d) exclusive learning means, the global model and every local model are equal to the largest local model, and only 25% clients participate in learning.

| Scaling Method | Dataset | Global Model | 100% (a) | 75% (b) | 50% (c) | 25% (d) |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
6 changes: 3 additions & 3 deletions baselines/doc/source/how-to-use-baselines.rst
@@ -33,7 +33,7 @@ Setting up your machine

Common to all baselines is `Poetry <https://python-poetry.org/docs/>`_, a tool to manage Python dependencies. Baselines also make use of `Pyenv <https://github.com/pyenv/pyenv>`_. You'll need to install both on your system before running a baseline. What follows is a step-by-step guide on getting :code:`pyenv` and :code:`Poetry` installed on your system.

-Let's begin by installing :code:`pyenv`. We'll be following the standard procedure. Please refere to the `pyenv docs <https://github.com/pyenv/pyenv#installation>`_ for alternative ways of installing it.
+Let's begin by installing :code:`pyenv`. We'll be following the standard procedure. Please refer to the `pyenv docs <https://github.com/pyenv/pyenv#installation>`_ for alternative ways of installing it.

.. code-block:: bash

@@ -49,7 +49,7 @@ Let's begin by installing :code:`pyenv`. We'll be following the standard procedure
command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"

-Verify your installtion by opening a new terminal and
+Verify your installation by opening a new terminal and

.. code-block:: bash

@@ -63,7 +63,7 @@ Then you can proceed and install any version of Python. Most baselines currently

pyenv install 3.10.6
# this will take a little while
-# once done, you should see that that version is avaialble
+# once done, you should see that that version is available
pyenv versions
# system
# * 3.10.6 # <-- you just installed this
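For readers following the setup section above, the same verification can be scripted; this is a minimal Python sketch (purely illustrative, the docs themselves verify from the shell) that confirms both tools are on PATH:

```python
# Check that pyenv and Poetry are discoverable on PATH after the
# installation steps described above.
import shutil

for tool in ("pyenv", "poetry"):
    path = shutil.which(tool)
    print(f"{tool}: {path if path else 'not found -- check your PATH'}")
```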
2 changes: 1 addition & 1 deletion baselines/fedavgm/README.md
@@ -104,7 +104,7 @@ poetry shell
```

### Google Colab
-If you want to setup the environemnt on Google Colab, please executed the script `conf-colab.sh`, just use the Colab terminal and the following:
+If you want to setup the environment on Google Colab, please executed the script `conf-colab.sh`, just use the Colab terminal and the following:

```bash
chmod +x conf-colab.sh
2 changes: 1 addition & 1 deletion baselines/fedbn/README.md
@@ -41,7 +41,7 @@ A more detailed explanation of the datasets is given in the following table.

| | MNIST | MNIST-M | SVHN | USPS | SynthDigits |
| ------------------ | ------------------ | -------------------------------------------------------- | ------------------------- | ------------------------------------------------------------ | -------------------------------------------------------------------------------- |
-| data type | handwritten digits | MNIST modification randomly colored with colored patches | Street view house numbers | handwritten digits from envelopes by the U.S. Postal Service | Syntehtic digits Windows TM font varying the orientation, blur and stroke colors |
+| data type | handwritten digits | MNIST modification randomly colored with colored patches | Street view house numbers | handwritten digits from envelopes by the U.S. Postal Service | Synthetic digits Windows TM font varying the orientation, blur and stroke colors |
| color | greyscale | RGB | RGB | greyscale | RGB |
| pixelsize | 28x28 | 28 x 28 | 32 x32 | 16 x16 | 32 x32 |
| labels | 0-9 | 0-9 | 1-10 | 0-9 | 1-10 |
4 changes: 2 additions & 2 deletions baselines/fedmlb/README.md
@@ -222,7 +222,7 @@ You can override settings directly from the command line in this way:
```bash
python -m fedmlb.main clients_per_round=10 # this will run using 10 clients per round instead of 5 clients as the default config

-# this will select the dataset partitioned with 0.6 concentration paramater instead of 0.3 as the default config
+# this will select the dataset partitioned with 0.6 concentration parameter instead of 0.3 as the default config
python -m fedmlb.main dataset_config.alpha_dirichlet=0.6
```

@@ -249,7 +249,7 @@ python -m fedmlb.main dataset_config.alpha_dirichlet=0.6 total_clients=500 clien
```

#### Tiny-Imagenet
-For Tiny-ImageNet, as in the orginal paper, batch size of local updates should be set
+For Tiny-ImageNet, as in the original paper, batch size of local updates should be set
to 100 in settings with 100 clients and to 20 in settings with 500 clients;
this is equal to set the amount of local_updates to 50 (as the default) --
so no change to batch size is required --, in fact
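A quick sanity check of those batch sizes, assuming Tiny-ImageNet's 100,000 training images are split evenly across clients (that figure and the even split are assumptions; the epoch count below is derived from them, not taken from the paper):

```python
# Relate the quoted batch sizes to the fixed budget of 50 local updates.
TRAIN_IMAGES = 100_000   # Tiny-ImageNet training set size (assumption)
LOCAL_UPDATES = 50       # default mentioned in the hunk above

for num_clients, batch_size in [(100, 100), (500, 20)]:
    samples_per_client = TRAIN_IMAGES // num_clients
    steps_per_epoch = samples_per_client // batch_size   # 10 in both settings
    local_epochs = LOCAL_UPDATES // steps_per_epoch      # 5 in both settings
    print(f"{num_clients} clients: {samples_per_client} samples/client, "
          f"{steps_per_epoch} steps/epoch, {local_epochs} local epochs")
```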
2 changes: 1 addition & 1 deletion baselines/fedpara/README.md
@@ -88,7 +88,7 @@ As for the parameters ratio ($\gamma$) we use the following model sizes. As in t

- The Jacobian correction was not incorporated into our implementation, primarily due to the lack of explicit instructions in the paper regarding the specific implementation of the dual update principle mentioned in the Jacobian correction section.

-- It was observed that data generation is crutial for model convergence
+- It was observed that data generation is crucial for model convergence

## Environment Setup
To construct the Python environment follow these steps:
6 changes: 3 additions & 3 deletions baselines/fedper/README.md
@@ -19,7 +19,7 @@ dataset: [CIFAR-10, FLICKR-AES]

**What’s implemented:** The code in this directory replicates the experiments in _Federated Learning with Personalization Layers_ (Arivazhagan et al., 2019) for CIFAR10 and FLICKR-AES datasets, which proposed the `FedPer` model. Specifically, it replicates the results found in figures 2, 4, 7, and 8 in their paper. __Note__ that there is typo in the caption of Figure 4 in the article, it should be CIFAR10 and __not__ CIFAR100.

-**Datasets:** CIFAR10 from PyTorch's Torchvision and FLICKR-AES. FLICKR-AES was proposed as dataset in _Personalized Image Aesthetics_ (Ren et al., 2017) and can be downloaded using a link provided on thier [GitHub](https://github.com/alanspike/personalizedImageAesthetics). One must first download FLICKR-AES-001.zip (5.76GB), extract all inside and place in baseline/FedPer/datasets. To this location, also download the other 2 related files: (1) FLICKR-AES_image_labeled_by_each_worker.csv, and (2) FLICKR-AES_image_score.txt. Images are also scaled to 224x224 for both datasets. This is not explicitly stated in the paper but seems to be boosting performance. Also, for FLICKR dataset, it is stated in the paper that they use data from clients with more than 60 and less than 290 rated images. This amounts to circa 60 clients and we randomly select 30 out of these (as in paper). Therefore, the results might differ somewhat but only slighly. Since the pre-processing steps in the paper are somewhat obscure, the metric values in the plots below may differ slightly, but not the overall results and findings.
+**Datasets:** CIFAR10 from PyTorch's Torchvision and FLICKR-AES. FLICKR-AES was proposed as dataset in _Personalized Image Aesthetics_ (Ren et al., 2017) and can be downloaded using a link provided on their [GitHub](https://github.com/alanspike/personalizedImageAesthetics). One must first download FLICKR-AES-001.zip (5.76GB), extract all inside and place in baseline/FedPer/datasets. To this location, also download the other 2 related files: (1) FLICKR-AES_image_labeled_by_each_worker.csv, and (2) FLICKR-AES_image_score.txt. Images are also scaled to 224x224 for both datasets. This is not explicitly stated in the paper but seems to be boosting performance. Also, for FLICKR dataset, it is stated in the paper that they use data from clients with more than 60 and less than 290 rated images. This amounts to circa 60 clients and we randomly select 30 out of these (as in paper). Therefore, the results might differ somewhat but only slightly. Since the pre-processing steps in the paper are somewhat obscure, the metric values in the plots below may differ slightly, but not the overall results and findings.

```bash
# These steps are not needed if you are only interested in CIFAR-10
@@ -61,7 +61,7 @@ It's worth mentioning that GPU memory for each client is ~7.5GB. When training o

Please see how models are implemented using a so called model_manager and model_split class since FedPer uses head and base layers in a neural network. These classes are defined in the models.py file and thereafter called when building new models in the directory /implemented_models. Please, extend and add new models as you wish.

-**Dataset:** CIFAR10, FLICKR-AES. CIFAR10 will be partitioned based on number of classes for data that each client shall recieve e.g. 4 allocated classes could be [1, 3, 5, 9]. FLICKR-AES is an unbalanced dataset, so there we only apply random sampling.
+**Dataset:** CIFAR10, FLICKR-AES. CIFAR10 will be partitioned based on number of classes for data that each client shall receive e.g. 4 allocated classes could be [1, 3, 5, 9]. FLICKR-AES is an unbalanced dataset, so there we only apply random sampling.
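A minimal sketch of the class-based allocation described in the line above; the function name and sampling details are illustrative, not the baseline's actual partitioning code:

```python
import random

def allocate_classes(num_clients, classes_per_client, num_classes=10, seed=0):
    """Give each client a random subset of CIFAR10 classes, e.g. [1, 3, 5, 9]."""
    rng = random.Random(seed)
    return {
        cid: sorted(rng.sample(range(num_classes), classes_per_client))
        for cid in range(num_clients)
    }

print(allocate_classes(num_clients=3, classes_per_client=4))
# e.g. {0: [0, 4, 6, 9], 1: [2, 3, 5, 8], 2: [1, 5, 7, 9]}
```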

**Training Hyperparameters:** The hyperparameters can be found in conf/base.yaml file which is the configuration file for the main script.

Expand All @@ -77,7 +77,7 @@ Please see how models are implemented using a so called model_manager and model_
| algorithm | fedavg|

**Stateful Clients:**
-In this Baseline (FedPer), we must store the state of the local client head while aggregation of body parameters happen at the server. Flower is currently making this possible but for the time being, we reside to storing client _head_ state in a folder called client_states. We store the values after each fit and evaluate function carried out on each client, and call for the state before executing these funcitons. Moreover, the state of a unique client is accessed using the client ID.
+In this Baseline (FedPer), we must store the state of the local client head while aggregation of body parameters happen at the server. Flower is currently making this possible but for the time being, we reside to storing client _head_ state in a folder called client_states. We store the values after each fit and evaluate function carried out on each client, and call for the state before executing these functions. Moreover, the state of a unique client is accessed using the client ID.

> NOTE: This is a work-around so that the local head parameters are not reset before each fit and evaluate. Nevertheless, it can come to change with future releases.
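A minimal sketch of that work-around, assuming PyTorch modules; the function names and the per-client `.pt` file layout are assumptions for illustration, not FedPer's actual code:

```python
from pathlib import Path

import torch

STATE_DIR = Path("client_states")  # folder name taken from the text above
STATE_DIR.mkdir(exist_ok=True)

def save_head(cid: str, head: torch.nn.Module) -> None:
    # Persist the personalized head after each fit/evaluate on this client.
    torch.save(head.state_dict(), STATE_DIR / f"{cid}.pt")

def load_head(cid: str, head: torch.nn.Module) -> None:
    # Restore the head before fit/evaluate, keyed by client ID.
    path = STATE_DIR / f"{cid}.pt"
    if path.exists():
        head.load_state_dict(torch.load(path))
```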

2 changes: 1 addition & 1 deletion baselines/fedprox/README.md
@@ -91,7 +91,7 @@ To run using FedAvg:
# This is done so to match the experimental setup in the FedProx paper
python -m fedprox.main --config-name fedavg

-# this config can also be overriden from the CLI
+# this config can also be overridden from the CLI
```

## Expected results
6 changes: 3 additions & 3 deletions baselines/fedstar/README.md
@@ -44,11 +44,11 @@ Next, you'll need to download the datasets. In the case of SpeechCommands, some
# Make the shell script executable
chmod +x setup_datasets.sh

-# The below script will download the datasets and create a directory structure requir to run this experiment.
+# The below script will download the datasets and create a directory structure require to run this experiment.
./setup_datasets.sh

# If you want to run the SpeechCommands experiment, pre-process the dataset
-# This will genereate a few training example from the _silence_ category
+# This will generate a few training example from the _silence_ category
python -m fedstar.dataset_preparation
# Please note the above will make following changes:
# * Add new files to datasets/speech_commands/Data/Train/_silence_
@@ -64,7 +64,7 @@ python -m fedstar.dataset_preparation

```python
# For Eg:- We have a system with two GPUs with 8GB and 4GB VRAM.
-# The modified varaible will looks like below.
+# The modified variable will looks like below.
gpu_free_mem = [8000,4000]
```

2 changes: 1 addition & 1 deletion baselines/fedvssl/README.md
@@ -152,7 +152,7 @@ To run using `FedAvg`:
# This is done so to match the experimental setup in the paper
python -m fedvssl.main strategy.fedavg=true

-# This config can also be overriden.
+# This config can also be overridden.
```

Running any of the above will create a directory structure in the form of `outputs/<DATE>/<TIME>/fedvssl_results` to save the global checkpoints and the local clients' training logs.
4 changes: 2 additions & 2 deletions baselines/fedwav2vec2/README.md
@@ -59,7 +59,7 @@ For more details, please refer to the relevant section in the paper.
| `sb_config` | `fedwav2vec2/conf/sb_config/w2v2.yaml` | Speechbrain config file for architecture model. Please refer to [SpeechBrain](https://github.com/speechbrain/speechbrain) for more information |
| `rounds` | `100` | Indicate the number of Federated Learning (FL) rounds|
| `local_epochs` | `20` | Specify the number of training epochs at the client side |
-| `total_clients` | `1943` | Size of client pool, with a maxium set at 1943 clients|
+| `total_clients` | `1943` | Size of client pool, with a maximum set at 1943 clients|
| `server_cid` | `19999` | ID of the server to distinguish from the client's ID |
| `server_device` | `cuda` | You can choose between `cpu` or `cuda` for centralised evaluation, but it is recommended to use `cuda`|
| `parallel_backend` | `false` | Multi-gpus training. Only active if you have more than 1 gpu per client |
@@ -109,7 +109,7 @@ python -m fedwav2vec2.dataset_preparation
# Run with default arguments (one client per GPU)
python -m fedwav2vec2.main

-# if you have a large GPU (32GB+) you migth want to fit two per GPU
+# if you have a large GPU (32GB+) you might want to fit two per GPU
python -m fedwav2vec2.main client_resources.num_gpus=0.5

# the global model can be saved at the end of each round if you specify a checkpoint path
@@ -80,7 +80,7 @@ def gen_client_fn(
Parameters
----------
device : torch.device
-The device on which the the client will train on and test on.
+The device on which the client will train on and test on.
iid : bool
The way to partition the data for each client, i.e. whether the data
should be independent and identically distributed between the clients
2 changes: 1 addition & 1 deletion baselines/heterofl/README.md
@@ -95,7 +95,7 @@ To run HeteroFL experiments in poetry activated environment:
# The main experiment implemented in your baseline using default hyperparameters (that should be setup in the Hydra configs)
# should run (including dataset download and necessary partitioning) by executing the command:

-python -m heterofl.main # Which runs the heterofl with arguments availbale in heterfl/conf/base.yaml
+python -m heterofl.main # Which runs the heterofl with arguments available in heterfl/conf/base.yaml

# We could override the settings that were specified in base.yaml using the command-line-arguments
# Here's an example for changing the dataset name, non-iid and model
2 changes: 1 addition & 1 deletion baselines/hfedxgboost/hfedxgboost/client.py
@@ -214,7 +214,7 @@ def fit(self, ins: FitIns) -> FitRes:
aggregated_trees = ins.parameters[1] # type: ignore # noqa: E501 # pylint: disable=line-too-long

if isinstance(aggregated_trees, list):
print("Client " + self.cid + ": recieved", len(aggregated_trees), "trees")
print("Client " + self.cid + ": received", len(aggregated_trees), "trees")
else:
print("Client " + self.cid + ": only had its own tree")
trainloader: Any = single_tree_preds_from_each_client(
2 changes: 1 addition & 1 deletion baselines/moon/README.md
@@ -105,7 +105,7 @@ python -m moon.main --config-name cifar100_fedprox

## Expected Results

-You can find the output logs of a single run in this [link](https://drive.google.com/drive/folders/1YZEU2NcHWEHVyuJMlc1QvBSAvNMjH-aR?usp=share_link). After running the above commands, you can see the accuracy list at the end of the ouput, which is the test accuracy of the global model. For example, in one running, for CIFAR-10 with MOON, the accuracy after running 100 rounds is 0.7071.
+You can find the output logs of a single run in this [link](https://drive.google.com/drive/folders/1YZEU2NcHWEHVyuJMlc1QvBSAvNMjH-aR?usp=share_link). After running the above commands, you can see the accuracy list at the end of the output, which is the test accuracy of the global model. For example, in one running, for CIFAR-10 with MOON, the accuracy after running 100 rounds is 0.7071.

For CIFAR-10 with FedProx, the accuracy after running 100 rounds is 0.6852. For CIFAR100 with MOON, the accuracy after running 100 rounds is 0.6636. For CIFAR100 with FedProx, the accuracy after running 100 rounds is 0.6494. The results are summarized below:

4 changes: 2 additions & 2 deletions baselines/niid_bench/README.md
@@ -21,7 +21,7 @@ algorithms: [FedAvg, SCAFFOLD, FedProx, FedNova]

**Datasets:** MNIST, CIFAR10, and Fashion-mnist from PyTorch's Torchvision

-**Hardware Setup:** These experiments were run on a linux server with 56 CPU threads with 250 GB Ram. There are 105 configurations to run per seed and at any time 7 configurations have been run parallely. The experiments required close to 12 hrs to finish for one seed. Nevertheless, to run a subset of configurations, such as only one FL protocol across all datasets and splits, a machine with 4-8 threads and 16 GB memory can run in reasonable time.
+**Hardware Setup:** These experiments were run on a linux server with 56 CPU threads with 250 GB Ram. There are 105 configurations to run per seed and at any time 7 configurations have been run parallelly. The experiments required close to 12 hrs to finish for one seed. Nevertheless, to run a subset of configurations, such as only one FL protocol across all datasets and splits, a machine with 4-8 threads and 16 GB memory can run in reasonable time.

**Contributors:** Aashish Kolluri, PhD Candidate, National University of Singapore

@@ -59,7 +59,7 @@ For FedProx algorithm the proximal parameter is tuned from values {0.001, 0.01,
## Environment Setup

```bash
-# Setup the base poetry enviroment from the niid_bench directory
+# Setup the base poetry environment from the niid_bench directory
# Set python version
pyenv local 3.10.6
# Tell poetry to use python 3.10