Data-Market-via-Adaptive-Sampling

Introduction

This repository provides the code and instructions to reproduce the experimental results in the following paper:

Boxin Zhao, Boxiang Lyu, Raul Castro Fernandez, Mladen Kolar. Addressing Budget Allocation and Revenue Allocation in Data Market Environment Using an Adaptive Sampling Algorithm. International Conference on Machine Learning (ICML), 2023.

Please refer to the paper for more details.

Preparation

Run data_load.py. The code will automatically download the dataset for you.

python data_load.py

Run rand_seed_generator.py to generate the list of random seeds.

python rand_seed_generator.py
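
The contents of rand_seed_generator.py are not reproduced here. As a rough illustration of the idea only, the sketch below draws a fixed list of distinct seeds, pickles it, and looks one up by index. The function names, the master seed, and the assumption that the pickle holds a plain Python list are all hypothetical and may differ from what the repository does.

```python
import pickle
import random


def generate_seeds(n_seeds, master_seed=0):
    """Draw n_seeds distinct integer seeds, reproducibly, from a master seed."""
    rng = random.Random(master_seed)
    # Sampling without replacement guarantees no two runs share a seed.
    return rng.sample(range(2**31), n_seeds)


def save_seeds(seeds, path):
    """Pickle the seed list so every training script reads the same list."""
    with open(path, "wb") as f:
        pickle.dump(seeds, f)


def load_seed(path, index):
    """Return the seed stored at position index in the pickled list."""
    with open(path, "rb") as f:
        seeds = pickle.load(f)
    return seeds[index]
```

Generating the list once and sharing it across scripts means that run index 3, for example, uses the same seed for every method, which keeps comparisons between methods paired.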

Budget Allocation and Revenue Allocation

Run train_DSV.py, train_FedAvg_FedDSV.py, train_FedAvg_OSMD.py, and train_FedAvg_uniform.py individually. Before each run, manually set the dataset_name variable to one of ['MNIST', 'KMNIST', 'FMNIST', 'CIFAR10']. You also need to supply the index into the list of random seeds, as in the example below; valid indices are 0 through 9.

python train_FedAvg_uniform.py 0
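
Running four scripts with ten seed indices each means 40 invocations per dataset. A small driver can save typing; the following is a minimal sketch, assuming each script takes the seed index as its only command-line argument, as in the example above. The helper names build_commands and run_all are hypothetical and not part of the repository.

```python
import subprocess
import sys

SCRIPTS = [
    "train_DSV.py",
    "train_FedAvg_FedDSV.py",
    "train_FedAvg_OSMD.py",
    "train_FedAvg_uniform.py",
]


def build_commands(scripts=SCRIPTS, n_seeds=10):
    """Return one command per (script, seed index) pair."""
    return [
        [sys.executable, script, str(seed_index)]
        for script in scripts
        for seed_index in range(n_seeds)
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it sequentially unless dry_run is set."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the batch at the first failing run.
            subprocess.run(cmd, check=True)
```

Calling run_all(build_commands()) only prints the commands; pass dry_run=False to execute them. Because dataset_name is set by hand, one pass of this loop covers a single dataset, so the variable has to be changed and the loop rerun for each of the remaining datasets.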

After running all methods on all four datasets with all 10 random seeds, run result_reorg.py to reorganize the results, and then run visualize_revenue_budget.py to get the plots of budget allocation and revenue allocation.

python result_reorg.py
python visualize_revenue_budget.py

Time Analysis

Change to the time_analysis directory.
Run data_load_time.py.

python data_load_time.py

Run train_DSV_time.py, train_FedAvg_FedDSV_time.py, train_FedAvg_OSMD_time.py, and train_FedAvg_uniform_time.py individually. Before each run, manually set the dataset_name variable to one of ['MNIST', 'KMNIST', 'FMNIST', 'CIFAR10'], and set the number of providers through the variable n_providers. In the paper, we use n_providers = 50, 100, 200, and 400. Finally, supply the index into the list of random seeds, as in the example below; valid indices are 0 through 9.

python train_FedAvg_uniform_time.py 5
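
Since dataset_name and n_providers are set by hand, it is easy to lose track of which configurations have already been run. The sketch below enumerates the full grid described above so it can serve as a checklist. The method labels are shorthand derived from the script names, and the completed list is a hypothetical record the user maintains; neither comes from the repository.

```python
from itertools import product

METHODS = ["DSV", "FedAvg_FedDSV", "FedAvg_OSMD", "FedAvg_uniform"]
DATASETS = ["MNIST", "KMNIST", "FMNIST", "CIFAR10"]
N_PROVIDERS = [50, 100, 200, 400]
SEED_INDICES = range(10)


def time_analysis_grid():
    """Every (method, dataset, n_providers, seed index) combination."""
    return list(product(METHODS, DATASETS, N_PROVIDERS, SEED_INDICES))


def pending_runs(grid, completed):
    """Configurations in grid that do not yet appear in completed."""
    done = set(completed)
    return [cfg for cfg in grid if cfg not in done]
```

The grid has 4 x 4 x 4 x 10 = 640 entries, which gives a sense of the total compute the time analysis requires.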

After running all methods on all four datasets with all provider counts (50, 100, 200, and 400) and all 10 random seeds, run result_reorg.py to reorganize the results, and then run visualize_time.py to get the plots of the time analysis.

python result_reorg.py
python visualize_time.py

Mixture Linear Regression

Change to the mixture_regression directory.
Run rand_seed_generator.py to generate the list of random seeds.

python rand_seed_generator.py

Run data_generate.py to generate the data.

python data_generate.py

Run train_DSV_mr.py, train_FedAvg_FedDSV_mr.py, train_FedAvg_OSMD_mr.py, and train_FedAvg_uniform_mr.py individually. You need to supply the index into the list of random seeds, as in the example below; valid indices are 0 through 99.

python train_FedAvg_uniform_mr.py 87

After running all methods with all random seeds, run result_reorg.py to reorganize the results, and then run visualize_mr.py to get the plots for the mixture linear regression experiment.

python result_reorg.py
python visualize_mr.py
