Automated decision-making systems can potentially introduce biases, raising ethical concerns. This has led to the development of numerous bias mitigation techniques. However, the selection of a fairness-aware model for a specific dataset often involves a process of trial and error, as it is not always feasible to predict in advance whether the mitigation measures provided by the model will meet the user's requirements, or what impact these measures will have on other model metrics such as accuracy and run time.
Existing fairness toolkits lack a comprehensive benchmarking framework. To bridge this gap, we present FairnessEval, a framework specifically designed to evaluate fairness in Machine Learning models. FairnessEval streamlines dataset preparation, fairness evaluation, and result presentation, while also offering customization options. In this demonstration, we highlight the functionality of FairnessEval in the selection and validation of fairness-aware models. We compare various approaches and simulate deployment scenarios to showcase the effectiveness of FairnessEval.
Open a terminal in the directory where you want to install FairnessEval and run the following commands:
git clone https://github.com/softlab-unimore/fairnesseval.git
pip install -e fairnesseval
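To check that the editable install is visible to your Python environment, you can import the package. This is just a quick sanity check, not part of the official setup:

```python
# Sanity check: the import should resolve to the cloned repository.
import fairnesseval

print(fairnesseval.__file__)  # should point inside the fairnesseval clone
```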
A demonstration web application for FairnessEval is available.
To run the demonstration, navigate to the streamlit folder and start the Streamlit server:
cd fairnesseval
cd streamlit
streamlit run page_welcome.py
A page should open on your localhost, where you can start using FairnessEval. Here you can find more information about the Streamlit UI and examples of how to use it.
Here you can find a demo notebook with working examples.
You can interact with the notebook and run the library with your own experiments.
Here you can find a quick start guide to the fairnesseval API with working examples.
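As a rough sketch of programmatic use, an experiment is described with a configuration dictionary like the ones shown later in this document and handed to the library's experiment runner. The module and function names below (`fairnesseval.run`, `launch_experiment_by_config`) are assumptions for illustration; use the entry point shown in the quick start guide.

```python
# Minimal API sketch. The configuration keys mirror the examples in this
# document; the runner module/function names are assumptions -- see the
# quick start guide for the actual entry point.
from fairnesseval import run  # assumed module name

experiment_conf = {
    'experiment_id': 'quickstart_demo',
    'dataset_names': ['your_dataset'],      # placeholder dataset name
    'model_names': ['LogisticRegression'],
    'train_fractions': [0.8],
    'random_seed': 42,
    'results_path': 'demo_results',
}

run.launch_experiment_by_config(experiment_conf)  # assumed entry point name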
| Parameter | Description |
|---|---|
| `experiment_id` | ID of the experiment to run. Required. |
| `dataset_name` | List of dataset names. Required. |
| `model_name` | List of model names. Required. |
| `results_path` | Path to save results. |
| `train_fractions` | List of fractions to be used for training. |
| `random_seeds` | List of random seeds to use; all internal random states are seeded from these values. For each random seed, a new train/test split is performed. |
| `metrics` | Metric set to be used for evaluation. Available metric set names are `default` and `conversion_to_binary_sensitive_attribute`. To use custom metrics, add a new key to `metrics_code_map` in `fairnesseval.metrics.py`. |
| `preprocessing` | Preprocessing function to be used. Available preprocessing functions are `conversion_to_binary_sensitive_attribute`, `binary_split_by_mean_y`, and `default`. To add a new preprocessing function, add a new key to `preprocessing_function_map` in `fairnesseval.utils_prepare_data.py`. |
| `split_strategy` | Splitting strategy. Available split strategies are `stratified_train_test_split` and `StratifiedKFold`. |
| `train_test_fold` | List of fold indices to run with k-fold splitting. |
| `model_params` | Dict with model hyperparameter names as keys and lists of values to iterate over as values. When multiple parameter lists are specified, the cross product is used to generate all combinations to test. |
| `debug` | Debug mode. If set, the program stops at the first exception. |
| `base_model_grid_params` | Base model hyperparameters for grid search. Dict (or List[Dict]) with parameter names as keys and lists of values to iterate over as values. This object is passed as-is to GridSearch to perform the grid search. |
The table above summarizes the available experiment parameters and their descriptions.
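To make the `model_params` cross-product behavior concrete, the sketch below reproduces it with the standard library. This is purely illustrative and not FairnessEval's internal code; the hyperparameter names are examples.

```python
# Illustration of the model_params cross-product semantics described above.
from itertools import product

model_params = {'C': [0.1, 1, 10], 'penalty': ['l1', 'l2']}

keys = list(model_params)
combinations = [dict(zip(keys, values)) for values in product(*model_params.values())]

# 6 configurations are generated:
# {'C': 0.1, 'penalty': 'l1'}, {'C': 0.1, 'penalty': 'l2'}, {'C': 1, 'penalty': 'l1'}, ...
for conf in combinations:
    print(conf)
```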
To add a new custom model to the FairnessEval framework, follow these steps:
Create a new wrapper class for your custom model in the wrappers module. This class should implement the necessary methods for fitting and predicting with your model. The wrapper must accept the random_state parameter.
# In fairnesseval/models/wrappers.py
class CustomModelWrapper:
    def __init__(self, random_state, **kwargs):  # random_state is required here
        # Initialize your model with any required parameters.
        # Passing random_state to the underlying model is not required, but recommended.
        self.model = YourCustomModel(random_state=random_state, **kwargs)

    def fit(self, X, y):  # or fit(self, X, y, sensitive_features) if your model requires sensitive features
        # Fit the model to the data.
        pass

    def predict(self, X):  # or predict(self, X, sensitive_features) if your model requires sensitive features
        # Predict using the model.
        pass
Add your custom model to the additional_models_dict in the wrapper.py file. This dictionary maps the model name to the corresponding wrapper class.
# In fairnesseval/models/models.py
from functools import partial

import sklearn.dummy
import sklearn.linear_model

from fairnesseval.models.wrappers import CustomModelWrapper  # the wrapper defined above

additional_models_dict = {
    'most_frequent': partial(sklearn.dummy.DummyClassifier, strategy="most_frequent"),
    'LogisticRegression': sklearn.linear_model.LogisticRegression,
    'CustomModel': CustomModelWrapper,  # <-- Add your custom model here
}
You can now use your custom model in the experiment configuration by specifying its name (CustomModel) in the model_names list.
experiment_conf = {
    'experiment_id': 'custom_experiment',
    'dataset_names': ['your_dataset'],
    'model_names': ['CustomModel'],
    'random_seed': 42,
    'model_params': {'param1': [value1], 'param2': [value2]},  # lists of hyperparameter values for your model
    'train_fractions': [0.8],
    'results_path': 'path/to/results',
    'params': ['--debug']
}
By following these steps, you can integrate a new custom model into the FairnessEval framework and use it for your experiments.
Adding a New Dataset

To add a new dataset to the FairnessEval framework, follow these steps:
- Prepare the Dataset
Ensure your dataset is in CSV format. The second-to-last column should be the target variable, and the last column should be the sensitive attribute.
- Save the Dataset
Save the CSV file in the datasets folder. Use the naming convention '[dataset_name].csv' for the file name.
- Load the Dataset
The framework will automatically detect the dataset based on the file name and structure.
- Use the Dataset
You can now use your new dataset in the experiment configuration by specifying its name in the dataset_names list (include the '.csv' extension in the name).
experiment_conf = {
'experiment_id': 'new_dataset_experiment',
'dataset_names': ['new_dataset.csv'], # <-- Add your new dataset here
'model_names': ['LogisticRegression'],
'random_seed': 42,
'model_params': {'C': [0.1, 1, 10]},
'train_fractions': [0.8],
'results_path': 'path/to/results',
'params': ['--debug']
}
By following these steps, you can integrate a new dataset into the FairnessEval framework and use it for your experiments.
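As an illustration of the expected CSV layout (features first, target second-to-last, sensitive attribute last), here is a minimal pandas sketch. The file paths and the column names ('income' as target, 'sex' as sensitive attribute) are placeholders.

```python
# Illustrative only: reorder a DataFrame so the target is the second-to-last
# column and the sensitive attribute is the last, then save it to datasets/.
import pandas as pd

df = pd.read_csv('raw_data.csv')  # placeholder input file
feature_cols = [c for c in df.columns if c not in ('income', 'sex')]  # placeholder column names
df = df[feature_cols + ['income', 'sex']]  # target second-to-last, sensitive attribute last
df.to_csv('datasets/new_dataset.csv', index=False)  # '[dataset_name].csv' in the datasets folder
```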
The experiment configurations used in the article 'Fair Classification with a Scalable Reduction Approach' are available in utils_experiment_parameters.py.