Towards Algorithmic Fairness by means of Instance-level Data Re-weighting based on Shapley Values

Adrian Arnaiz-Rodríguez, Nuria Oliver - (ELLIS Alicante) - DMLR @ ICLR'24

Fair Data Valuation using Shapley Values for Algorithmic Fairness Re-weighting

@inproceedings{arnaiz2024fairshap,
title={Towards Algorithmic Fairness by means of Instance-level Data Re-weighting based on Shapley Values},
author={Arnaiz-Rodriguez, Adrian and Oliver, Nuria},
booktitle={ICLR 2024 Workshop on Data-centric Machine Learning Research (DMLR): Harnessing Momentum for Science},
year={2024},
url={https://openreview.net/forum?id=ivf1QaxEGQ}
}

Implementation

Numba-accelerated implementation:

from fairSV.fair_shapley_sklearn import get_SV_matrix_numba_memory, get_sv_arrays

protected_attributes_dict = {
    'values': A_array,  # array of protected attribute values
    'privileged_protected_attribute': int(priv_attr),  # id of the privileged group
    'unprivileged_protected_attribute': int(unpriv_attr),  # id of the unprivileged group
    'favorable_label': int(fav_lab),  # id of the favorable label
    'unfavorable_label': int(unfav_lab),  # id of the unfavorable label
}


SV = get_SV_matrix_numba_memory(X_train, X_valid, y_train, y_valid)  # SV matrix
svs_acc, svs_eop, _, _ = get_sv_arrays(SV, y_valid, protected_attributes_dict, 'all')

SV corresponds to $\mathbf{\Phi}$ in the paper. svs_acc, svs_eop, ... are the data valuations $\phi(\cdot)$ for the corresponding value functions.
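To make the relation between the matrix and the per-sample valuations concrete, here is a minimal NumPy sketch. It is not the code behind get_sv_arrays. It assumes that the matrix has one row per training sample and one column per validation sample, that the accuracy valuation is the row-wise mean, and that the equal-opportunity valuation is a difference of group-conditioned means over favorable-label validation samples. The function name, its arguments, and the sign convention (unprivileged minus privileged) are hypothetical and chosen only for illustration.

```python
import numpy as np

def aggregate_valuations(SV, y_valid, a_valid, priv, unpriv, fav):
    """Illustrative aggregation of a (n_train, n_valid) matrix into per-sample values.

    Assumption: SV[i, j] is the contribution of training sample i to
    validation sample j. This is a sketch, not the repository's API.
    """
    SV = np.asarray(SV, dtype=float)
    y_valid = np.asarray(y_valid)
    a_valid = np.asarray(a_valid)

    # Accuracy-style valuation: average contribution over all validation samples.
    phi_acc = SV.mean(axis=1)

    # TPR-style valuations: average only over favorable-label validation
    # samples, split by protected group.
    pos = y_valid == fav
    phi_tpr_priv = SV[:, pos & (a_valid == priv)].mean(axis=1)
    phi_tpr_unpriv = SV[:, pos & (a_valid == unpriv)].mean(axis=1)

    # Equal-opportunity-style valuation (sign convention is an assumption).
    phi_eop = phi_tpr_unpriv - phi_tpr_priv
    return phi_acc, phi_eop

# Toy example: 3 training samples, 4 validation samples.
SV_toy = [[0.2, 0.0, 0.4, 0.2],
          [0.1, 0.3, 0.1, 0.1],
          [0.0, 0.0, 0.5, 0.3]]
phi_acc, phi_eop = aggregate_valuations(
    SV_toy, y_valid=[1, 1, 1, 0], a_valid=[0, 1, 1, 0], priv=1, unpriv=0, fav=1
)
```

Each group needs at least one favorable-label validation sample, otherwise the corresponding mean is taken over an empty slice.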

Then use them for your preferred goal: bias mitigation through re-weighting, data generation, exploratory data analysis, data minimization, data acquisition policies, etc.

Re-weighting

from sklearn.ensemble import GradientBoostingClassifier

weights = svs_eop  # choose the preferred valuation (e.g. svs_acc or svs_eop)
# Use your preferred normalization. Here we min-max scale the values and then
# rescale them so the weights have mean 1 (they sum to the number of training samples).
weights = (weights - weights.min()) / (weights.max() - weights.min())
N = len(y_train)  # number of training samples
weights *= N / weights.sum()

model = GradientBoostingClassifier(random_state=seed)
model.fit(X_train, y_train, sample_weight=weights)
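The normalization above can be wrapped in a small helper. The sketch below is a hypothetical convenience function, not part of fairSV. It applies the same min-max scaling and mean-1 rescaling, and adds a guard for the degenerate case where all valuations are equal, which would otherwise divide by zero.

```python
import numpy as np

def normalize_to_mean_one(values):
    """Min-max scale valuations, then rescale so the weights average to 1."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All valuations are identical: fall back to uniform weights.
        return np.ones_like(values)
    scaled = (values - values.min()) / span
    # After this step the weights sum to the number of samples.
    return scaled * (len(values) / scaled.sum())

toy_weights = normalize_to_mean_one([-0.2, 0.1, 0.4])
```

Note that with min-max scaling the lowest-valued sample always receives weight 0, so it is effectively dropped from training. If that is not desired, a different normalization may be preferable.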

Data Analysis

Distribution of $\phi(\cdot)$ values and weights (Section 4.1, experiment with images)
Embedding space exploration (Section 4.1, experiment with images)
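As a starting point for inspecting how valuations or weights are distributed, the following sketch computes summary statistics per group. It is only an illustration and does not reproduce the analysis from the paper. The function name and the idea of grouping by a protected attribute (or by label) are assumptions.

```python
import numpy as np

def summarize_by_group(values, groups):
    """Return count, mean, median, min and max of `values` for each group id."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    summary = {}
    for g in np.unique(groups):
        v = values[groups == g]
        summary[g.item()] = {
            "n": int(v.size),
            "mean": float(v.mean()),
            "median": float(np.median(v)),
            "min": float(v.min()),
            "max": float(v.max()),
        }
    return summary

toy_summary = summarize_by_group([0.0, 1.0, 2.0, 1.0], [0, 0, 1, 1])
```

The same arrays can be passed to a plotting library to draw per-group histograms.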

Acknowledgments

Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Health and Digital Executive Agency (HaDEA). EU - HE ELIAS -- Grant Agreement 101120237. Funded also by Intel corporation, a nominal grant received at the ELLIS Unit Alicante Foundation from the Regional Government of Valencia in Spain (Convenio Singular signed with Generalitat Valenciana, Conselleria de Innovación, Industria, Comercio y Turismo, Dirección General de Innovación) and a grant by the Banc Sabadell Foundation.
