Personal mobility trajectories/traces can benefit a number of practical applications, e.g., pandemic control, transportation system management, user analysis, and product recommendation. For example, the Google COVID-19 Community Mobility Reports[1] show daily movement trends and have been used to predict the pandemic's impact on communities[2] such as travel agencies and retail enterprises. On the other hand, such mobility traces are privacy-critical to the users: they contain, or can be used to infer, highly private personal information such as home/work addresses and activity patterns. Therefore, how to effectively utilize this data with a high degree of privacy preservation, while still benefiting real-world applications, remains challenging.
In this project, we apply the Federated Learning (FL)[3] framework to the transportation mode prediction task under a privacy-preserving service-level requirement.
As a distributed deep-learning (DL) training framework, Federated Learning enables model training on users' local devices without uploading their private data, greatly enhancing privacy preservation while maintaining convergence accuracy comparable to centralized training. Applying FL to personal-mobility use scenarios therefore has three major benefits:
### (1) High Privacy-Preserving Capability: The personal mobility data stays on the users' local devices and does not need to be sent to a central server, greatly reducing the risk of personal data leakage;
### (2) Implementation Efficiency: Since no raw data is transmitted to the central server, both the communication cost and the effort of encrypting transmitted data are saved, yielding higher implementation efficiency;
### (3) Flexible User Participation: The distributed training capability of FL allows a scalable number of users to flexibly participate in the training process, contributing to and enhancing the overall application performance.
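The core FL idea above, local training plus server-side parameter averaging (FedAvg), can be illustrated with a toy one-parameter model. All names below are illustrative stand-ins; the repo's real training loop lives in main.py:

```python
# Minimal FedAvg sketch: each client trains on data that never leaves
# the client; the server only ever sees model parameters.

def local_update(w, local_data, lr=0.1):
    """Toy 'training': one gradient step of a 1-D least-squares model
    y = w * x. Stands in for the per-node SGD epochs run by main.py."""
    grad = sum(2 * (w * x - y) * x for x, y in local_data) / len(local_data)
    return w - lr * grad

def fedavg(client_weights, client_sizes):
    """Server-side aggregation: average parameters weighted by the
    number of local samples held by each client."""
    total = sum(client_sizes)
    return sum(w * n for w, n in zip(client_weights, client_sizes)) / total

# Two simulated clients; the raw (x, y) pairs stay in these local lists.
clients = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w_global = 0.0
for _ in range(50):  # communication rounds
    local_ws = [local_update(w_global, data) for data in clients]
    w_global = fedavg(local_ws, [len(d) for d in clients])
print(round(w_global, 2))  # converges toward the true slope 2.0
```

The weighted average is what distinguishes FedAvg from a plain mean: clients holding more samples pull the global model proportionally harder.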
The project is implemented in PyTorch and tested under the following environments:
Ubuntu 16.04
NVIDIA Driver == 440.64
CUDA == 10.2
PyTorch == 1.5.0
torchvision == 0.6.0
Scikit-learn == 0.23.2
Tensorboard is recommended but not required to visualize the training log.
- Geolife Dataset: Raw Data
- Preprocessed trajectory dataset (in numpy format)

After downloading the preprocessed data, place images.npy & labels.npy into the \data folder.
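As a quick sanity check after placing the files, the two arrays can be loaded with NumPy and checked for consistency. The shapes below are synthetic stand-ins written by the script itself, since the actual preprocessed format is not documented here; the real downloaded arrays will differ:

```python
import os
import numpy as np

os.makedirs("data", exist_ok=True)
# Synthetic stand-ins with assumed shapes (trajectory "images" plus one
# transportation-mode label per sample); the real files replace these.
np.save("data/images.npy", np.zeros((8, 1, 28, 28), dtype=np.float32))
np.save("data/labels.npy", np.arange(8) % 4)

images = np.load("data/images.npy")
labels = np.load("data/labels.npy")
assert len(images) == len(labels), "each trajectory image needs a label"
print(images.shape, np.unique(labels))
```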
Baseline: Centralized training.
In this case, all training data is sent to and stored on the central node, which conducts centralized training.
python main.py --lr 0.1 --node 1
Federated Training: Simulate federated training.
In this case, the training data is split across 2, 4, or 8 nodes, which conduct federated learning with FedAvg.
python main.py --lr 0.1 --nodes 2 --bs 32 # Fed Learning with 2 nodes.
python main.py --lr 0.1 --nodes 4 --bs 32 # Fed Learning with 4 nodes.
python main.py --lr 0.1 --nodes 8 --bs 32 # Fed Learning with 8 nodes.
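A hedged sketch of what the --nodes simulation plausibly does under the hood: shard the sample indices across nodes, then merge the per-node models with a FedAvg weighted average over their parameter dicts. The function names here are illustrative, not the actual main.py API, and plain NumPy arrays stand in for PyTorch state_dict tensors:

```python
import numpy as np

def split_nodes(n_samples, nodes, seed=0):
    """Evenly shard sample indices across the simulated nodes."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    return np.array_split(idx, nodes)

def average_states(states, sizes):
    """FedAvg merge: per-parameter average of the node models,
    weighted by how many samples each node trained on."""
    total = sum(sizes)
    return {k: sum(s[k] * n for s, n in zip(states, sizes)) / total
            for k in states[0]}

shards = split_nodes(100, 4)
# Four dummy node models whose single parameter "w" is 0, 1, 2, 3:
states = [{"w": np.full(3, float(i))} for i in range(4)]
merged = average_states(states, [len(s) for s in shards])
print(merged["w"])  # equal 25-sample shards -> plain mean 1.5
```

With PyTorch models, the same averaging would be applied key-by-key to the tensors returned by `model.state_dict()`.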
Evaluation: Evaluate trained models. By default, trained models are saved in the \checkpoint folder.
Run the following commands to evaluate the model performance.
python eval.py --model ckpt_1 # model name.
python eval.py --model ckpt_2
python eval.py --model ckpt_4
python eval.py --model ckpt_8
Released Models: We have released our centralized and FL-trained models in the \checkpoint folder, including 4 models:
ckpt_1node_67.52.pth, corresponding to the centralized training model;
ckpt_2node_64.46.pth, corresponding to FL training with 2 clients;
ckpt_4node_66.88.pth, corresponding to FL training with 4 clients;
ckpt_8node_67.13.pth, corresponding to FL training with 8 clients.
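The number in each checkpoint name appears to be that model's test accuracy in percent. A minimal sketch of how such a figure is computed from predictions, using hypothetical labels rather than eval.py's actual outputs:

```python
import numpy as np

# Hypothetical predicted vs. ground-truth transportation-mode labels:
preds = np.array([0, 1, 2, 2, 1, 0, 3, 3])
truth = np.array([0, 1, 1, 2, 1, 0, 3, 2])

accuracy = 100.0 * (preds == truth).mean()  # fraction correct, in %
print(f"{accuracy:.2f}")  # 75.00
```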
If you have any questions, please reach out to the author (email: fyu2@gmu.edu).