Update README.md
emirceyani committed Jun 8, 2021
1 parent a15a2aa commit 9ae1667
Showing 1 changed file with 225 additions and 2 deletions.
227 changes: 225 additions & 2 deletions README.md
# SpreadGNN: Serverless Multi-task Federated Learning for Graph Neural Networks

This repository is the official implementation of SpreadGNN: Serverless Multi-task Federated Learning for Graph Neural Networks.

## 1. Introduction


Graph Neural Networks (GNNs) are the first-choice methods for graph machine learning problems thanks to their ability to learn state-of-the-art representations from graph-structured data. However, centralizing a massive amount of real-world graph data for GNN training is prohibitive due to user-side privacy concerns, regulation restrictions, and commercial competition. Federated Learning is the de facto standard for collaborative training of machine learning models over many distributed edge devices without the need for centralization. Nevertheless, training graph neural networks in a federated setting is vaguely defined and brings statistical and systems challenges. This work proposes SpreadGNN, a novel multi-task federated training framework that, for the first time in the literature, can operate in the presence of partial labels and in the absence of a central server. SpreadGNN extends federated multi-task learning to realistic serverless settings for GNNs, and utilizes a novel optimization algorithm with a convergence guarantee, Decentralized Periodic Averaging SGD (DPA-SGD), to solve decentralized multi-task learning problems. We empirically demonstrate the efficacy of our framework on a variety of non-I.I.D. distributed graph-level molecular property prediction datasets with partial labels. Our results show that SpreadGNN outperforms GNN models trained over a central server-dependent federated learning system, even in constrained topologies.


## 2. Installation


```bash
conda create -n spreadgnn python=3.7
conda activate spreadgnn
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia
conda install -c anaconda mpi4py grpcio
conda install scikit-learn numpy h5py setproctitle networkx
pip install -r requirements.txt
cd FedML; git submodule init; git submodule update; cd ../;
pip install -r FedML/requirements.txt
```
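
After installation, a quick import check can confirm that the environment is usable. The sketch below is not part of this repository: the helper name `missing_modules` is hypothetical, and the module list assumes the usual import names for the packages installed above (for example, `grpcio` imports as `grpc` and `scikit-learn` as `sklearn`).

```python
import importlib.util

# Import names for the packages installed above (assumed mapping:
# pytorch -> torch, grpcio -> grpc, scikit-learn -> sklearn).
REQUIRED_MODULES = [
    "torch", "torchvision", "torchaudio", "mpi4py", "grpc",
    "sklearn", "numpy", "h5py", "setproctitle", "networkx",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Run it inside the activated `spreadgnn` environment; any name it reports can be installed before launching the experiments.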


## 3. Data Preparation
For each dataset you want to use, run the `.sh` script located in that dataset's folder.
For more datasets, visit http://moleculenet.ai/


## 4. Experiments


### Distributed/Federated Molecule Property Classification experiments
```
sh run_fedavg_distributed_pytorch.sh 6 1 1 1 graphsage homo 150 1 1 0.0015 256 256 0.3 256 256 sider "./../../../data/sider/" 0
# Run in the background
nohup sh run_fedavg_distributed_pytorch.sh 6 1 1 1 graphsage homo 150 1 1 0.0015 256 256 0.3 256 256 sider "./../../../data/sider/" 0 > ./fedavg-graphsage.log 2>&1 &
```

### Distributed/Federated Molecule Property Regression experiments
```
sh run_fedavg_distributed_reg.sh 6 1 1 1 graphsage homo 150 1 1 0.0015 256 256 0.3 256 256 freesolv "./../../../data/freesolv/" 0
# Run in the background
nohup sh run_fedavg_distributed_reg.sh 6 1 1 1 graphsage homo 150 1 1 0.0015 256 256 0.3 256 256 freesolv "./../../../data/freesolv/" 0 > ./fedavg-graphsage.log 2>&1 &
```

#### Arguments for Distributed/Federated Training
The distributed/federated experiment scripts take the following arguments, in order. Note that there are additional parameters for this setting.
```
CLIENT_NUM=$1 -> Number of clients in dist/fed setting
WORKER_NUM=$2 -> Number of workers
SERVER_NUM=$3 -> Number of servers
GPU_NUM_PER_SERVER=$4 -> GPU number per server
MODEL=$5 -> Model name
DISTRIBUTION=$6 -> Dataset distribution. homo for IID splitting. hetero for non-IID splitting.
ROUND=$7 -> Number of Distributed/Federated Learning Rounds
EPOCH=$8 -> Number of epochs to train clients' local models
BATCH_SIZE=$9 -> Batch size
LR=${10} -> Learning rate
SAGE_DIM=${11} -> Dimensionality of GraphSAGE embedding
NODE_DIM=${12} -> Dimensionality of node embeddings
SAGE_DR=${13} -> Dropout rate applied between GraphSAGE Layers
READ_DIM=${14} -> Dimensionality of readout embedding
GRAPH_DIM=${15} -> Dimensionality of graph embedding
DATASET=${16} -> Dataset name (Please check data folder to see all available datasets)
DATA_DIR=${17} -> Dataset directory
CI=${18}
```
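
Because the arguments are positional, it is easy to pass a value in the wrong slot. The following sketch builds the command line from named values so that the order is checked in one place. The helper `build_command` and the constant `FEDAVG_ARG_ORDER` are hypothetical and not part of this repository; the argument names and example values are copied from the list and commands above.

```python
import shlex

# Positional argument order, copied from the list above.
FEDAVG_ARG_ORDER = [
    "CLIENT_NUM", "WORKER_NUM", "SERVER_NUM", "GPU_NUM_PER_SERVER", "MODEL",
    "DISTRIBUTION", "ROUND", "EPOCH", "BATCH_SIZE", "LR", "SAGE_DIM",
    "NODE_DIM", "SAGE_DR", "READ_DIM", "GRAPH_DIM", "DATASET", "DATA_DIR", "CI",
]


def build_command(script, values, order=FEDAVG_ARG_ORDER):
    """Build the shell command for `script` from a name -> value mapping."""
    missing = [name for name in order if name not in values]
    if missing:
        raise KeyError(f"missing arguments: {missing}")
    # Quote each value so that paths with special characters stay intact.
    args = [shlex.quote(str(values[name])) for name in order]
    return " ".join(["sh", script] + args)


# Values taken from the SIDER classification example above.
sider_run = {
    "CLIENT_NUM": 6, "WORKER_NUM": 1, "SERVER_NUM": 1, "GPU_NUM_PER_SERVER": 1,
    "MODEL": "graphsage", "DISTRIBUTION": "homo", "ROUND": 150, "EPOCH": 1,
    "BATCH_SIZE": 1, "LR": 0.0015, "SAGE_DIM": 256, "NODE_DIM": 256,
    "SAGE_DR": 0.3, "READ_DIM": 256, "GRAPH_DIM": 256, "DATASET": "sider",
    "DATA_DIR": "./../../../data/sider/", "CI": 0,
}
command = build_command("run_fedavg_distributed_pytorch.sh", sider_run)
print(command)
```

The same helper can be reused for the other scripts by passing a different `order` list.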

### FedGMTL Classification experiments

```
sh run_fedgmtl.sh 8 8 1 1 graphsage hetero 0.5 70 1 1 0.0015 0.3 1 0 64 64 0.3 64 64 1 sider "./../../../data/sider/" 1 0
```

### FedGMTL Regression experiments

```
sh run_fedgmtl_reg.sh 8 8 1 1 graphsage hetero 0.5 70 1 1 0.0015 0.3 1 0 64 64 0.3 64 64 1 qm8 "./../../../data/qm8/" 1 0
```

#### Arguments for FedGMTL
The FedGMTL experiment scripts take the following arguments, in order. Note that there are additional parameters for this setting.
```
CLIENT_NUM=$1 -> Number of clients in dist/fed setting
WORKER_NUM=$2 -> Number of workers
SERVER_NUM=$3 -> Number of servers
GPU_NUM_PER_SERVER=$4 -> GPU number per server
MODEL=$5 -> Model name
DISTRIBUTION=$6 -> Dataset distribution. homo for IID splitting. hetero for non-IID splitting.
PARTITION_ALPHA=$7 -> Alpha parameter for Dirichlet distribution
ROUND=$8 -> Number of Distributed/Federated Learning Rounds
EPOCH=$9 -> Number of epochs to train clients' local models
BATCH_SIZE=${10} -> Batch size
LR=${11} -> Learning rate
TASK_W=${12} -> Task-Relationship regularizer weight
TASK_W_DECAY=${13} -> Decay for Task-Relationship regularizer
WD=${14} -> Weight Decay Coefficient
HIDDEN_DIM=${15} -> Dimensionality of GNN Hidden Layer
NODE_DIM=${16} -> Dimensionality of Node embeddings
DR=${17} -> Dropout rate applied between GraphSAGE Layers
READ_DIM=${18} -> Dimensionality of readout embedding
GRAPH_DIM=${19} -> Dimensionality of graph embedding
MASK_TYPE=${20} -> Mask scenario (0,1,2)
DATASET=${21} -> Dataset name
DATA_DIR=${22} -> Directory
CI=${23}
```
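
To give an intuition for `DISTRIBUTION=hetero` and `PARTITION_ALPHA`, here is a minimal sketch of a Dirichlet-based split of sample indices across clients. It is a generic illustration and an assumption about the general technique, not the partitioning code used in this repository, which may also take labels into account. The function name `dirichlet_partition` is hypothetical. Smaller values of `alpha` tend to produce more uneven client sizes.

```python
import random


def dirichlet_partition(num_samples, num_clients, alpha, seed=0):
    """Split sample indices across clients with Dirichlet(alpha) proportions."""
    rng = random.Random(seed)
    # A Dirichlet draw is a vector of independent Gamma(alpha, 1) draws,
    # normalized so that the entries sum to one.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
    total = sum(gammas)
    proportions = [g / total for g in gammas]

    indices = list(range(num_samples))
    rng.shuffle(indices)

    # Turn cumulative proportions into cut points over the shuffled indices.
    partitions, start, cumulative = [], 0, 0.0
    for client, p in enumerate(proportions):
        cumulative += p
        if client == num_clients - 1:
            end = num_samples  # the last client takes whatever remains
        else:
            end = round(cumulative * num_samples)
        partitions.append(indices[start:end])
        start = end
    return partitions


# Example: 1000 samples, 8 clients, alpha = 0.5 as in the commands above.
parts = dirichlet_partition(num_samples=1000, num_clients=8, alpha=0.5)
print([len(p) for p in parts])
```

Every index is assigned to exactly one client, so the client datasets are disjoint and together cover the full dataset.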

### SpreadGNN Classification experiments

```
sh run_spreadgnn.sh 8 8 1 1 graphsage hetero 0.5 70 1 1 0.0015 0.3 1 0 64 64 0.3 64 64 1 sider "./../../../data/sider/" 1 0
```

### SpreadGNN Regression experiments

```
sh run_spreadgnn_reg.sh 8 8 1 1 graphsage hetero 0.5 70 1 1 0.0015 0.3 1 0 64 64 0.3 64 64 1 qm8 "./../../../data/qm8/" 1 0
```

#### Arguments for SpreadGNN
The SpreadGNN experiment scripts take the following arguments, in order. Note that there are additional parameters for this setting.
```
CLIENT_NUM=$1 -> Number of clients in dist/fed setting
WORKER_NUM=$2 -> Number of workers
SERVER_NUM=$3 -> Number of servers
GPU_NUM_PER_SERVER=$4 -> GPU number per server
MODEL=$5 -> Model name
DISTRIBUTION=$6 -> Dataset distribution. homo for IID splitting. hetero for non-IID splitting.
PARTITION_ALPHA=$7 -> Alpha parameter for Dirichlet distribution
ROUND=$8 -> Number of Distributed/Federated Learning Rounds
EPOCH=$9 -> Number of epochs to train clients' local models
BATCH_SIZE=${10} -> Batch size
LR=${11} -> Learning rate
TASK_W=${12} -> Task-Relationship regularizer weight
TASK_W_DECAY=${13} -> Decay for Task-Relationship regularizer
WD=${14} -> Weight Decay Coefficient
HIDDEN_DIM=${15} -> Dimensionality of GNN Hidden Layer
NODE_DIM=${16} -> Dimensionality of Node embeddings
DR=${17} -> Dropout rate applied between GraphSAGE Layers
READ_DIM=${18} -> Dimensionality of readout embedding
GRAPH_DIM=${19} -> Dimensionality of graph embedding
MASK_TYPE=${20} -> Mask scenario (0,1,2)
DATASET=${21} -> Dataset name
DATA_DIR=${22} -> Directory
PERIOD=${23} -> Communication Period for Parameter Exchange
CI=${24}
```
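
The `PERIOD` argument sets how often workers exchange parameters. The toy sketch below illustrates the general idea of decentralized periodic averaging: every worker takes local gradient steps and, every `period` steps, averages its parameter with those of its neighbors, without any central server. This is not the DPA-SGD implementation in this repository. The scalar parameter, the quadratic loss, the ring topology, the uniform averaging weights, and all function names are assumptions chosen to keep the example short.

```python
def local_sgd_step(w, target, lr):
    """One gradient step on the toy loss 0.5 * (w - target) ** 2."""
    return w - lr * (w - target)


def periodic_averaging(targets, neighbors, period, steps, lr=0.1):
    """Run local steps on every worker; average with neighbors every `period` steps."""
    weights = [0.0 for _ in targets]
    for step in range(1, steps + 1):
        # Local update: each worker only uses its own data (its own target).
        weights = [local_sgd_step(w, t, lr) for w, t in zip(weights, targets)]
        if step % period == 0:
            # Each worker averages its own value with its neighbors' values,
            # all read from the same pre-averaging snapshot.
            snapshot = list(weights)
            weights = [
                sum(snapshot[j] for j in [i] + neighbors[i]) / (1 + len(neighbors[i]))
                for i in range(len(snapshot))
            ]
    return weights


# Four workers on a ring: each one talks only to its two neighbors.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
final = periodic_averaging(targets=[1.0, 2.0, 3.0, 4.0], neighbors=ring, period=5, steps=200)
print(final)
```

In this toy setting the workers' values end up close to the average of the targets even though no worker sees all of them. A larger `period` means less communication but more drift between workers between averaging steps.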


## 5. Code Structure of SpreadGNN

- `FedML`: a soft repository link generated using `git submodule add https://github.com/FedML-AI/FedML`.

- `data`: provides data download scripts and stores the downloaded datasets.

- `data_preprocessing`: data loaders.

- `model`: advanced molecular ML models.

- `trainer`: define your own `trainer.py` by inheriting from the base class in `FedML/fedml-core/trainer/fedavg_trainer.py`. Some tasks can share the same trainer.

- `experiments/distributed`:
  1. `experiments` is the entry point for training. It contains experiments on different platforms.
  2. Every experiment integrates four building blocks: `FedML` (federated optimizers), `data_preprocessing`, `model`, and `trainer`.


## 6. Update FedML Submodule
```
cd FedML
git checkout master && git pull
cd ..
git add FedML
git commit -m "updating submodule FedML to latest"
git push
```


## 7. Citation
Please cite our FedML paper if it helps your research.
You can refer to this work in your paper as follows: "We develop our experiments based on FedML".
```
@misc{he2020fedml,
title={FedML: A Research Library and Benchmark for Federated Machine Learning},
author={Chaoyang He and Songze Li and Jinhyun So and Xiao Zeng and Mi Zhang and Hongyi Wang and Xiaoyang Wang and Praneeth Vepakomma and Abhishek Singh and Hang Qiu and Xinghua Zhu and Jianzong Wang and Li Shen and Peilin Zhao and Yan Kang and Yang Liu and Ramesh Raskar and Qiang Yang and Murali Annavaram and Salman Avestimehr},
year={2020},
eprint={2007.13518},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{he2021fedgraphnn,
title={FedGraphNN: A Federated Learning System and Benchmark for Graph Neural Networks},
author={Chaoyang He and Keshav Balasubramanian and Emir Ceyani and Yu Rong and Peilin Zhao and Junzhou Huang and Murali Annavaram and Salman Avestimehr},
year={2021},
eprint={2104.07145},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{he2021spreadgnn,
title={SpreadGNN: Serverless Multi-task Federated Learning for Graph Neural Networks},
author={Chaoyang He and Emir Ceyani and Keshav Balasubramanian and Murali Annavaram and Salman Avestimehr},
year={2021},
eprint={2106.02743},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
