Simple data balancing baselines for worst-group-accuracy benchmarks.

nalzok/BalancingGroups

 
 

TTLSA

Code to replicate the experimental results from our TTLSA paper. Based on the facebookresearch/BalancingGroups repo by FAIR.

Replicating the main results

Installing dependencies

The easiest way to get a working environment for this repo is to create a conda environment with the following command:

conda create --name ttlsa --file conda-spec/spec-file.txt

If conda is not available, please install the dependencies listed in the requirements.txt file.

Download, extract, and generate metadata for datasets

This script downloads and extracts the datasets, then formats their metadata so that it works with the rest of the code out of the box.

python setup_datasets.py --download --data_path data celeba waterbirds civilcomments multinli

Launch jobs

To reproduce the experiments in the paper:

make train

Aggregate results

The aggregate module (run with python3 -m aggregate, as in the commands below) can generate the main plots and tables from the paper. It can be called while the experiments are still running.

# worst group accuracy
python3 -m aggregate --path paper --selector1 min --selector2 min --split te
# average accuracy
python3 -m aggregate --path paper --selector1 avg --selector2 avg --split te
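
For readers unfamiliar with the two metrics named in the comments above, the following is a minimal sketch of how worst-group accuracy and average accuracy are commonly computed. It is an illustration only, not the aggregate module's implementation: the function names, the toy data, and the use of integer group ids are assumptions made here, and how --selector1 and --selector2 map onto these quantities is not documented in this README. In benchmarks of this kind a group is typically a combination of class label and a spurious attribute.

```python
from collections import defaultdict


def group_accuracies(y_true, y_pred, groups):
    """Return {group_id: accuracy} for every group id present in `groups`."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}


def worst_group_accuracy(y_true, y_pred, groups):
    """Accuracy of the group on which the model performs worst."""
    return min(group_accuracies(y_true, y_pred, groups).values())


def average_accuracy(y_true, y_pred):
    """Plain accuracy over all examples, ignoring group membership."""
    return sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: three groups, the smallest one is always misclassified.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 1, 1]
groups = [0, 0, 0, 0, 1, 1, 2, 2]

print(group_accuracies(y_true, y_pred, groups))
print(worst_group_accuracy(y_true, y_pred, groups))
print(average_accuracy(y_true, y_pred))
```

On this toy data the average accuracy stays high while the worst-group accuracy collapses, which is why the two numbers are reported separately.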

License

This source code is released under the CC-BY-NC license, included here.
