SAGED: A Holistic Bias-Benchmarking Pipeline for Language Models with Customisable Fairness Calibration
Authors: Xin Guan, Nathaniel Demchak, Saloni Gupta, Ze Wang, Ediz Ertekin Jr., Adriano Koshiyama, Emre Kazim, Zekun Wu
Conference: COLING 2025 Main Conference
DOI: https://doi.org/10.48550/arXiv.2409.11149
SAGED(-Bias) is the first holistic bias-benchmarking pipeline for large language models. It addresses limitations of existing benchmarks such as narrow scope, data contamination, and a lack of fairness calibration. The pipeline consists of five core stages:
- Scraping Materials: Collects and processes benchmark data from various sources.
- Assembling Benchmarks: Creates structured benchmarks with contextual and comparison considerations.
- Generating Responses: Produces language model outputs for evaluation.
- Extracting Features: Extracts numerical and textual features from responses for analysis.
- Diagnosing Bias: Applies various disparity metrics with baseline comparisons (illustrated in the sketch below).
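
To make the last two stages concrete, here is a minimal, self-contained sketch of a baseline comparison over extracted feature scores. The group names, scores, and baseline value are invented for illustration, and the difference and ratio metrics are generic examples of disparity measures, not SAGED's own implementation.

```python
from statistics import mean

# Toy feature scores (e.g., sentiment in [0, 1]) extracted from model
# responses, grouped by the concept being benchmarked. Values are
# illustrative, not real benchmark data.
scores = {
    "group_a": [0.72, 0.65, 0.80, 0.70],
    "group_b": [0.48, 0.55, 0.50, 0.61],
}
baseline = 0.66  # e.g., mean score of a neutral reference corpus

for group, values in scores.items():
    group_mean = mean(values)
    # Two simple disparity views against the baseline:
    # an absolute difference and an impact-style ratio.
    print(f"{group}: mean={group_mean:.3f} "
          f"diff={group_mean - baseline:+.3f} "
          f"ratio={group_mean / baseline:.3f}")
```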
Install the package from PyPI:

```bash
pip install sagedbias
```
The package exposes the full pipeline as well as the components for each stage:

```python
from saged import Pipeline                              # end-to-end pipeline runner
from saged import Scraper, KeywordFinder, SourceFinder  # stage 1: scraping materials
from saged import PromptAssembler                       # stage 2: assembling benchmarks
from saged import FeatureExtractor                      # stage 4: extracting features
from saged import DisparityDiagnoser                    # stage 5: diagnosing bias
```
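
For orientation, the sketch below wires these components together in stage order. It is only a hypothetical illustration: every method name and argument (`find`, `scrape`, `assemble`, `extract`, `diagnose`, and the `my_model` stand-in for the generation stage) is an assumption made for this sketch, not SAGED's documented API; consult the package docstrings for the real signatures.

```python
from saged import (Scraper, KeywordFinder, SourceFinder,
                   PromptAssembler, FeatureExtractor, DisparityDiagnoser)

# NOTE: all method names and arguments below are assumptions for
# illustration, not SAGED's documented API.

def my_model(prompt: str) -> str:
    """Stand-in for the language model under test."""
    return "..."  # replace with a real model call

# Stage 1: scrape materials for the target concept.
keywords = KeywordFinder(concept="nationality").find()      # assumed call
sources = SourceFinder(keywords=keywords).find()            # assumed call
materials = Scraper(sources=sources).scrape()               # assumed call

# Stage 2: assemble contextualised benchmark prompts.
prompts = PromptAssembler(materials=materials).assemble()   # assumed call

# Stage 3: generate responses from the model under test.
responses = [my_model(p) for p in prompts]

# Stage 4: extract comparable features (e.g., sentiment) from responses.
features = FeatureExtractor(responses=responses).extract()  # assumed call

# Stage 5: diagnose disparities against a chosen baseline.
report = DisparityDiagnoser(features=features).diagnose()   # assumed call
print(report)
```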
If you use SAGED in your work, please cite the following paper:
```bibtex
@inproceedings{guan2025saged,
  title     = {SAGED: A Holistic Bias-Benchmarking Pipeline for Language Models with Customisable Fairness Calibration},
  author    = {Xin Guan and Nathaniel Demchak and Saloni Gupta and Ze Wang and Ediz Ertekin Jr. and Adriano Koshiyama and Emre Kazim and Zekun Wu},
  booktitle = {Proceedings of the 31st International Conference on Computational Linguistics (COLING 2025)},
  year      = {2025},
  doi       = {10.48550/arXiv.2409.11149}
}
```
SAGED-bias is released under the MIT License.