Dissecting Out-of-Distribution Detection and Open-Set Recognition: A Critical Analysis of Methods and Benchmarks (IJCV 2024)
Hongjun Wang1,
Sagar Vaze2,
Kai Han1
1Visual AI Lab, The University of Hong Kong
2Visual Geometry Group, University of Oxford
Abstract: Detecting test-time distribution shift has emerged as a key capability for safely deployed machine learning models, with the question being tackled under various guises in recent years. In this paper, we aim to provide a consolidated view of the two largest sub-fields within the community: out-of-distribution (OOD) detection and open-set recognition (OSR). In particular, we aim to provide rigorous empirical analysis of different methods across settings and provide actionable takeaways for practitioners and researchers. Concretely, we make the following contributions: (i) We perform rigorous cross-evaluation between state-of-the-art methods in the OOD detection and OSR settings and identify a strong correlation between the performances of methods for them; (ii) We propose a new, large-scale benchmark setting which we suggest better disentangles the problem tackled by OOD detection and OSR, re-evaluating state-of-the-art OOD detection and OSR methods in this setting; (iii) We surprisingly find that the best performing method on standard benchmarks (Outlier Exposure) struggles when tested at scale, while scoring rules which are sensitive to the deep feature magnitude consistently show promise; and (iv) We conduct empirical analysis to explain these phenomena and highlight directions for future research.
Installation:
pip install -r requirements.txt
A number of datasets are used for OOD and OSR cross-benchmarking:
- Standard Benchmarks: SVHN, CIFAR-10/100, TinyImageNet, Textures, LSUN, LSUN-R, iSUN, Places
- Proposed SSB Benchmarks: ImageNet-21K-P, ImageNet-C, ImageNet-R, CUB, Stanford Cars, FGVC-Aircraft
For TinyImageNet, you also need to run create_val_img_folder in data/tinyimagenet.py to create a directory with the test data.
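A minimal sketch of how this might be invoked (the exact signature of create_val_img_folder is an assumption here; check data/tinyimagenet.py for the actual interface and make sure the TinyImageNet root path is set first):

# Sketch only: the real function may take arguments such as the dataset root.
from data.tinyimagenet import create_val_img_folder
create_val_img_folder()  # reorganises the val images into per-class folders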
Open-set Splits: For the proposed open-set benchmarks, the directory data/open_set_splits contains the proposed class splits as .pkl files. For the FGVC datasets, the files also include information on which open-set classes are most similar to which closed-set classes.
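As a rough illustration, a split file can be inspected as follows (the file name and dictionary keys below are illustrative assumptions; inspect the .pkl files themselves for the exact structure):

import pickle

# 'cub_osr_splits.pkl' is used here only as an example file name.
with open('data/open_set_splits/cub_osr_splits.pkl', 'rb') as f:
    splits = pickle.load(f)

print(splits.keys())  # e.g. closed-set ('known') and open-set ('unknown') class lists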
In config.py, please set the paths to the datasets and to the pre-trained models (for the SSB experiments).
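As a sketch, the entries to edit might look like the following (the variable names are illustrative; use the placeholders that already exist in config.py):

# Illustrative only -- follow the variable names actually defined in config.py.
cifar_10_root = '/path/to/datasets/cifar10'
imagenet_root = '/path/to/datasets/imagenet'
pretrained_weights_dir = '/path/to/pretrained_models'  # needed for the SSB experiments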
In train_configs.yaml, we provide the default training configurations for the different datasets. For datasets not included in the list, either of the provided configurations can be used.
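If you want to read a configuration programmatically, a simple (assumed) way to do so is with PyYAML; the key names depend on the actual contents of train_configs.yaml:

import yaml

with open('train_configs.yaml') as f:
    train_configs = yaml.safe_load(f)

print(train_configs.keys())  # lists the datasets with default configurations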
Train models: This trains models with different training methods (e.g. CE, ARPL/ARPL+CS, OE or GODIN). To train models on a specified dataset, run the script under the path below:
bash bash_scripts/new_bash/train/xxx.sh
Please change PYTHON (the path to the Python interpreter) in the corresponding bash_scripts scripts to fit your own environment.
Evaluating models: This evaluates different scoring rules (e.g. MSP, MLS, Energy, etc.). After you obtain a model checkpoint (whether downloaded or trained by yourself), set DIRS to the path of the model you would like to evaluate. To evaluate all of the scoring rules, run the scripts under the path below:
bash bash_scripts/new_bash/eval/xxx.sh
By default, the script will sweep over all the scoring rules listed in OOD_METHODS; a minimal sketch of the most common scoring rules is given below.
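For reference, the most common post-hoc scoring rules reduce to simple functions of a model's logits. The following is a minimal PyTorch sketch of MSP, MLS and the Energy score, independent of the repository's actual implementation:

import torch
import torch.nn.functional as F

def msp_score(logits):
    # Maximum Softmax Probability: highest predicted class probability.
    return F.softmax(logits, dim=-1).max(dim=-1).values

def mls_score(logits):
    # Maximum Logit Score: highest raw logit.
    return logits.max(dim=-1).values

def energy_score(logits, T=1.0):
    # Negative energy, T * logsumexp(logits / T); higher means more in-distribution.
    return T * torch.logsumexp(logits / T, dim=-1)

In all three cases, a higher score indicates that the input is more likely to be in-distribution (closed-set); thresholding the score yields the OOD / open-set decision.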
If you find this repo useful for your research, please consider citing our paper:
@article{wang2024dissect,
author = {Wang, Hongjun and Vaze, Sagar and Han, Kai},
title = {Dissecting Out-of-Distribution Detection and Open-Set Recognition: A Critical Analysis of Methods and Benchmarks},
journal = {International Journal of Computer Vision (IJCV)},
year = {2024}
}