[NeurIPS2023] A Framework for Semi-Supervised Federated Object Detection 🚌

💡 Introduction

This official repository contains the implementation and dataset setup for the research paper "Navigating Data Heterogeneity in Federated Learning: A Semi-Supervised Federated Object Detection", presented at NeurIPS 2023 [Link], by Taehyeon Kim, Eric Lin, Junu Lee, Christian Lau, and Vaikkunth Mugunthan.

🎤 TL;DR: We introduce a Semi-Supervised Federated Object Detection (SSFOD) framework for the setting where the server holds labeled data and all clients hold only unlabeled data. We believe this generic framework could be valuable for applications such as autonomous driving and CCTV systems!

🤔 Getting Started

✅ Requirements

This codebase is written for Python 3 (developed with Python 3.7.6).
To install the necessary Python packages, run pip install -r requirements.txt.

🚗 Dataset

💻 1. Download

BDD100K [Link]

Description: The BDD100K dataset features 100,000 driving videos from various U.S. locations, covering diverse weather conditions.
Utilization: We selected 20,000 data points from this dataset, focusing on cloudy, rainy, overcast, and snowy conditions.
Purpose: This setup helps in investigating the impact of data heterogeneity on our framework and assessing its robustness in realistic conditions.
Download: bash ./data/download/download_bdd.sh
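
The selection step described above can be pictured with a minimal sketch. It is not the repository's code: the record layout (a dict with a "weather" key), the function name select_by_weather, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def select_by_weather(records, weathers, n_total, seed=0):
    """Return up to n_total records whose 'weather' value is in `weathers`.

    `records` is assumed to be a list of dicts with a 'weather' key;
    this layout is hypothetical and may differ from the real annotations.
    """
    rng = random.Random(seed)
    pool = [r for r in records if r.get("weather") in weathers]
    rng.shuffle(pool)  # shuffle so the cut-off below is not order-dependent
    return pool[:n_total]


# Toy records standing in for per-image annotations.
toy_weathers = ["cloudy", "rainy", "overcast", "snowy", "clear", "foggy"]
records = [
    {"name": f"img_{i}.jpg", "weather": toy_weathers[i % len(toy_weathers)]}
    for i in range(30)
]

wanted = {"cloudy", "rainy", "overcast", "snowy"}
subset = select_by_weather(records, wanted, n_total=12)
print(Counter(r["weather"] for r in subset))
```

With the real dataset, n_total would be 20,000 and the records would come from the downloaded annotation files.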

SODA10M [Link]

Description: The SODA10M dataset offers a diverse range of geographies, weather conditions, and object categories.
Utilization: In an IID setup, 20,000 labeled data points are distributed among one server and three clients. For a more realistic setup, the labeled data points are kept on the server, while 100,000 unlabeled data points are distributed across the clients.
Purpose: This configuration allows for performance evaluation under varying weather conditions like clear, overcast, and rainy, demonstrating the resilience and robustness of our approach.
Download: bash ./data/download/download_soda10m.sh
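
As a rough illustration of the IID setup described above, the sketch below shuffles a list of sample IDs and deals them into four shards (one server, three clients). Equal shard sizes and the helper name iid_split are assumptions; the repository's scripts may split the data differently.

```python
import random


def iid_split(items, n_parts, seed=0):
    """Shuffle items and deal them round-robin into n_parts shards."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    # Slice with a stride so every shard is a uniform random sample.
    return [shuffled[i::n_parts] for i in range(n_parts)]


# Hypothetical IDs for the 20,000 labeled samples.
image_ids = list(range(20000))
server, *clients = iid_split(image_ids, n_parts=4)
print(len(server), [len(c) for c in clients])
```

Because each shard is drawn from the same shuffled pool, every participant sees roughly the same mix of conditions.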

🏭 2. Setup FL Environment

Our setup scripts are designed to prepare the datasets and training environment for various configurations of federated learning and semi-supervised learning. Here's a breakdown of what each script does:

Centralized Training

setup_bdd_centralized.sh & setup_soda_centralized.sh
- Description: These scripts configure the environment for centralized training where all data is stored in a single data source.
- Usage: Employ these scripts when you intend to train your model in a fully supervised manner with all available labeled data.

Semi-Supervised Learning (SSL)

setup_{dataset}_1ds_ssl_{iid/noniid}.sh
- Description: For semi-supervised learning scenarios, these scripts set up a single data source on the server with both labeled and unlabeled data.
- Variants:
  - IID: setup_bdd_1ds_ssl_iid.sh & setup_soda_1ds_ssl_iid.sh for setups where the data distribution is independent and identically distributed (IID) across labeled and unlabeled data.
  - Non-IID: setup_bdd_1ds_ssl_noniid.sh & setup_soda_1ds_ssl_noniid.sh for setups where there is heterogeneity in weather conditions between labeled and unlabeled data.
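
The difference between the two variants can be sketched as follows. This is an illustrative assumption about what such a split might look like, not the logic of the setup scripts: the function name, the "weather" key, and the labeled_frac parameter are all hypothetical.

```python
import random


def split_labeled_unlabeled(records, labeled_weathers=None, labeled_frac=0.25, seed=0):
    """Split records into (labeled, unlabeled) pools.

    labeled_weathers=None -> IID: a random labeled_frac of all records is labeled.
    labeled_weathers=set  -> non-IID: only those weather conditions are labeled,
                             so the two pools differ in weather distribution.
    """
    if labeled_weathers is None:
        rng = random.Random(seed)
        shuffled = list(records)
        rng.shuffle(shuffled)
        k = int(len(shuffled) * labeled_frac)
        return shuffled[:k], shuffled[k:]
    labeled = [r for r in records if r["weather"] in labeled_weathers]
    unlabeled = [r for r in records if r["weather"] not in labeled_weathers]
    return labeled, unlabeled


toy = [{"id": i, "weather": w} for i, w in enumerate(["cloudy", "rainy", "overcast", "snowy"] * 10)]

iid_l, iid_u = split_labeled_unlabeled(toy)
non_l, non_u = split_labeled_unlabeled(toy, labeled_weathers={"cloudy"})
print(len(iid_l), len(iid_u), len(non_l), len(non_u))
```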

Semi-Supervised Federated Learning

setup_bdd_{4ds/100ds}_ssfl_noniid.sh
- Description: These scripts are tailored for federated learning with a semi-supervised approach, dealing with non-IID data (weather-condition heterogeneity) across multiple data sources.
- Usage: Use these for simulating a federated learning environment with data heterogeneity.
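
One way to picture weather-condition heterogeneity across data sources is to route each weather condition to a different client. The sketch below does this under stated assumptions: the weather_to_client mapping, the client names, and the "weather" key are hypothetical, and the actual scripts may mix conditions per client rather than separating them completely.

```python
from collections import defaultdict


def assign_clients_by_weather(records, weather_to_client):
    """Group records into per-client shards according to their weather.

    Records whose weather is not in the mapping are skipped.
    """
    shards = defaultdict(list)
    for record in records:
        client = weather_to_client.get(record["weather"])
        if client is not None:
            shards[client].append(record)
    return dict(shards)


# Hypothetical mapping: each client only ever sees one weather condition.
weather_to_client = {"rainy": "client_0", "overcast": "client_1", "snowy": "client_2"}
unlabeled = [
    {"id": i, "weather": w}
    for i, w in enumerate(["rainy", "overcast", "snowy", "cloudy"] * 5)
]

shards = assign_clients_by_weather(unlabeled, weather_to_client)
for client, shard in sorted(shards.items()):
    print(client, len(shard), {r["weather"] for r in shard})
```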

Data Source Specific Scripts

setup_bdd_ssfl_iid_{4/100}datasources.sh
- Description: Scripts specific to the BDD100K dataset that configure SSFL training across multiple IID data sources.
- Usage: Choose based on the number of data sources (4 or 100) you wish to simulate.

Etc: Percentages for Labeled Dataset

setup_bdd_ssl_{iid/non-iid}_20perc.sh
- Description: The 20perc suffix indicates the percentage of the labeled dataset that is used (here, 20%). You can change this percentage in the bash script.
- Usage: Select IID for homogeneous and Non-IID for heterogeneous weather conditions across labeled data.
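
The effect of the percentage setting can be sketched as a simple random subsample of the labeled pool. The helper name subsample_labeled and its arguments are assumptions for illustration; the bash scripts may implement the reduction differently.

```python
import random


def subsample_labeled(labeled, percent, seed=0):
    """Keep a random `percent` (0-100] of the labeled samples."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    rng = random.Random(seed)
    k = round(len(labeled) * percent / 100)
    return rng.sample(list(labeled), k)


labeled_ids = list(range(1000))
kept = subsample_labeled(labeled_ids, percent=20)
print(len(kept))
```

Fixing the seed keeps the labeled subset identical across runs, which makes comparisons between configurations easier.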

Our scripts are named following the convention: setup_{dataset name}_{# of data sources}_{algorithm}_{iid or non-iid}.sh for clarity and ease of use. For more detailed instructions on how to use each script, please refer to the individual script headers.
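
The naming convention can be expressed as a small helper. This is only a sketch: the function setup_script_name is hypothetical, the "ds" suffix and the "noniid" spelling are taken from the file names listed above, and a few listed scripts (for example setup_bdd_centralized.sh) do not fit this pattern.

```python
def setup_script_name(dataset, n_sources, algorithm, distribution):
    """Compose a script name as setup_{dataset}_{n}ds_{algorithm}_{distribution}.sh."""
    return f"setup_{dataset}_{n_sources}ds_{algorithm}_{distribution}.sh"


print(setup_script_name("bdd", 1, "ssl", "iid"))
print(setup_script_name("bdd", 4, "ssfl", "noniid"))
```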

💡 Code Implementation

Still in progress. To be uploaded!! 🏃🏻🏃🏻🏃🏻
