
Federated Learning system for Transportation Mode Prediction based on Personal Mobility Data

Background

Personal mobility trajectories/traces can benefit a number of practical applications, e.g., pandemic control, transportation system management, user analysis, and product recommendation. For example, the Google COVID-19 Community Mobility Reports[1] capture daily population movement trends and have been used to accurately predict the impact on community businesses[2] such as travel agencies and retail enterprises. On the other hand, such mobility traces are also privacy-critical: they contain, or can be used to infer, highly private personal information such as home/work addresses and activity patterns. How to effectively utilize this data for real-world applications while preserving a high degree of privacy therefore remains challenging.

Project Overview

In this project, we propose to apply the Federated Learning (FL)[3] framework to the transportation mode prediction task under a privacy-preserving service-level requirement.

As a distributed deep-learning (DL) training framework, Federated Learning enables model training on local devices without uploading users' private data, greatly enhancing privacy preservation while maintaining convergence accuracy similar to centralized training. Applying FL to personal-mobility-data use cases therefore has three major benefits:

(1) High Privacy-Preserving Capability: The personal mobility data stays on the users' local devices and never needs to be sent to a central server, greatly reducing the risk of personal data leakage.

(2) Implementation Efficiency: Since the raw data is not transmitted to a central server, both the communication cost and the effort of encrypting data in transit are saved, yielding higher implementation efficiency.

(3) Flexible User Participation: The distributed training capability of FL allows a scalable number of users to flexibly participate in training, each contributing to and enhancing overall application performance.
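At the core of the federated training used in this project is FedAvg, which averages locally trained model weights into a global model each round. The following is only a minimal sketch of that aggregation step, assuming hypothetical names (fedavg_aggregate, client_state_dicts, client_weights); it is not the implementation in main.py.

```python
import copy
import torch

def fedavg_aggregate(global_model, client_state_dicts, client_weights=None):
    """Average client model parameters into the global model (FedAvg step).

    client_state_dicts: list of state_dicts returned by locally trained clients.
    client_weights: optional per-client weights (e.g., proportional to local
    dataset sizes); uniform averaging is used if omitted.
    """
    num_clients = len(client_state_dicts)
    if client_weights is None:
        client_weights = [1.0 / num_clients] * num_clients

    avg_state = copy.deepcopy(client_state_dicts[0])
    for key in avg_state.keys():
        # Weighted sum of the clients' tensors for this parameter/buffer.
        avg_state[key] = sum(
            w * sd[key].float() for w, sd in zip(client_weights, client_state_dicts)
        )
    global_model.load_state_dict(avg_state)
    return global_model
```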


How to use the repo.

Environment Setup.

The project is implemented in PyTorch and was tested under the following environment:

Ubuntu 16.04
NVIDIA Driver == 440.64
CUDA == 10.2
PyTorch == 1.5.0
torchvision == 0.6.0
Scikit-learn == 0.23.2
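A quick way to confirm your environment matches the versions above (this check script is not part of the repo, just a convenience):

```python
import torch
import torchvision
import sklearn

print("PyTorch:", torch.__version__)            # expected 1.5.0
print("torchvision:", torchvision.__version__)  # expected 0.6.0
print("scikit-learn:", sklearn.__version__)     # expected 0.23.2
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("CUDA version:", torch.version.cuda)  # expected 10.2
```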

TensorBoard is recommended, but not required, to visualize the training logs.
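If you do use TensorBoard, PyTorch 1.5 ships a writer in torch.utils.tensorboard. A minimal sketch with a hypothetical log directory and tag names (not the logging code used in this repo):

```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/fed_experiment")  # hypothetical log dir

# Inside the training loop, log scalars per epoch/round, e.g.:
# writer.add_scalar("train/loss", loss_value, epoch)
# writer.add_scalar("test/accuracy", acc_value, epoch)

writer.close()
# Then visualize with: tensorboard --logdir runs
```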

Data Preparation.

  1. Geolife Dataset: Raw Data

  2. Preprocessed trajectory dataset (in NumPy format):

    Trajectory Data
    Labels

After downloading the preprocessed data, place images.npy and labels.npy into the ./data folder.
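After placing the files, you can sanity-check the arrays with NumPy. A minimal sketch; the array shapes and class encoding are not documented here, so inspect them yourself:

```python
import numpy as np

images = np.load("data/images.npy")   # preprocessed trajectory data
labels = np.load("data/labels.npy")   # transportation mode labels

print("trajectory array:", images.shape, images.dtype)
print("label array:", labels.shape, labels.dtype)
print("classes:", np.unique(labels))
```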

Command Lines.

Baseline: Centralized training.

In this case, all training data is gathered on the central node, which performs standard centralized training.

python main.py --lr 0.1 --nodes 1

Federated Training: Simulate federated training.

In this case, the training data is split across 2, 4, or 8 nodes, which perform federated learning with FedAvg.

python main.py --lr 0.1 --nodes 2 --bs 32 # Fed Learning with 2 nodes.
python main.py --lr 0.1 --nodes 4 --bs 32 # Fed Learning with 4 nodes.
python main.py --lr 0.1 --nodes 8 --bs 32 # Fed Learning with 8 nodes.
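One common way to simulate the per-node split is to partition the dataset uniformly at random with torch.utils.data.random_split. The sketch below illustrates that idea under assumed file paths and an equal-shard scheme; it is not necessarily the exact partitioning used in main.py:

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader, random_split

num_nodes = 4   # matches the --nodes argument
batch_size = 32  # matches the --bs argument

images = torch.from_numpy(np.load("data/images.npy")).float()
labels = torch.from_numpy(np.load("data/labels.npy")).long()
dataset = TensorDataset(images, labels)

# Split the dataset into (nearly) equal shards, one per simulated node.
shard_sizes = [len(dataset) // num_nodes] * num_nodes
shard_sizes[-1] += len(dataset) - sum(shard_sizes)  # absorb the remainder
shards = random_split(dataset, shard_sizes)

node_loaders = [DataLoader(s, batch_size=batch_size, shuffle=True) for s in shards]
```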

Evaluation: Evaluate trained models. By default, trained models are saved in the ./checkpoint folder.

Run the following commands to evaluate the model performance.

python eval.py --model ckpt_1 # model name.
python eval.py --model ckpt_2
python eval.py --model ckpt_4
python eval.py --model ckpt_8
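In outline, evaluation loads a saved checkpoint and reports accuracy on the test split. A rough sketch of that flow; the checkpoint format, model class, and test loader here are assumptions, not the actual code in eval.py:

```python
import torch

def evaluate(model, test_loader,
             device="cuda" if torch.cuda.is_available() else "cpu"):
    """Compute top-1 accuracy of a trained model on the test set."""
    model.to(device).eval()
    correct, total = 0, 0
    with torch.no_grad():
        for inputs, targets in test_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            preds = model(inputs).argmax(dim=1)
            correct += (preds == targets).sum().item()
            total += targets.size(0)
    return 100.0 * correct / total

# Assumed usage: load a released checkpoint from ./checkpoint and evaluate it.
# state = torch.load("checkpoint/ckpt_4node_66.88.pth", map_location="cpu")
# model.load_state_dict(state)  # or a nested key, depending on how it was saved
# print("accuracy: %.2f%%" % evaluate(model, test_loader))
```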

Released Models: We have released our centrally trained and FL-trained models in the ./checkpoint folder, including 4 models:

  1. ckpt_1node_67.52.pth, corresponding to the centralized training model;
  2. ckpt_2node_64.46.pth, corresponding to FL training with 2 clients;
  3. ckpt_4node_66.88.pth, corresponding to FL training with 4 clients;
  4. ckpt_8node_67.13.pth, corresponding to FL training with 8 clients.

Questions

If you have any questions, please reach out to the author (email: fyu2@gmu.edu).
