The evaluation scripts can be used to assess the performance of a method on the AutoVI dataset. Given a directory with anomaly maps, the scripts compute the area under the sPRO curve for anomaly localization.
The dataset can be found at the following address: https://doi.org/10.5281/zenodo.10459003
This code is adapted from the code made available by MVTec GmbH at https://www.mvtec.com/company/research/datasets/mvtec-loco
Our evaluation scripts require a Python 3.7 installation as well as the following packages:
- numpy
- pillow
- tqdm
For Linux, we provide a conda environment file. It can be used to create a new conda environment with all required packages readily installed:

```
conda env create --name autovi_eval --file=conda_environment.yml
conda activate autovi_eval
```
The evaluation script requires an anomaly map to be present for each test sample in our dataset in `.png` format.
Anomaly maps must contain real-valued anomaly scores, and their size must match that of the corresponding dataset images.
Anomaly maps must all share the same base directory and adhere to the following folder structure:

```
<anomaly_maps_dir>/<object_name>/test/<defect_name>/<image_id>.png
```
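For illustration, the minimal sketch below (hypothetical; `save_anomaly_map` is not part of the evaluation code) writes a float-valued anomaly map into the expected layout. It quantizes the scores to 16-bit integers, which should be safe under the assumption that the AUC-based metrics depend only on the ordering of the scores:

```python
# Hypothetical helper, not part of the evaluation code: writes one anomaly
# map into the folder layout expected by the evaluation scripts.
import os

import numpy as np
from PIL import Image


def save_anomaly_map(scores, anomaly_maps_dir, object_name, defect_name, image_id):
    """Quantize a float anomaly map to 16-bit grayscale and save it as .png.

    Assumption: the AUC-based metrics depend only on the ordering of the
    scores, so a monotone rescaling like this quantization leaves them
    unchanged.
    """
    out_dir = os.path.join(anomaly_maps_dir, object_name, "test", defect_name)
    os.makedirs(out_dir, exist_ok=True)

    # Map scores monotonically onto the uint16 range [0, 65535].
    lo, hi = float(scores.min()), float(scores.max())
    if hi > lo:
        quantized = ((scores - lo) / (hi - lo) * 65535.0).astype(np.uint16)
    else:
        quantized = np.zeros(scores.shape, dtype=np.uint16)

    Image.fromarray(quantized).save(os.path.join(out_dir, f"{image_id}.png"))
```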
To evaluate a single experiment on one of the dataset objects, the script `evaluate_experiment.py` can be used.
It requires the following user arguments:

- `object_name`: Name of the dataset object to be evaluated.
- `dataset_base_dir`: Base directory that contains the AutoVI dataset.
- `anomaly_maps_dir`: Base directory that contains the corresponding anomaly maps.
- `output_dir`: Directory to store evaluation results as `.json` files.
A possible example call to this script would be:

```
python evaluate_experiment.py \
    --object_name pushpins \
    --dataset_base_dir 'path/to/dataset/' \
    --anomaly_maps_dir 'path/to/anomaly_maps/' \
    --output_dir 'metrics/'
```
The evaluation script computes the area under the sPRO curve up to a limited false positive rate, as described in our paper. The integration limits are specified by the variable `MAX_FPRS`.
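To illustrate the idea, here is a minimal sketch (not the implementation used by the scripts; `auc_up_to_fpr` and the normalization choice are assumptions) of computing the area under a curve up to an FPR limit via trapezoidal integration, normalized by the limit:

```python
# Illustrative sketch only: area under a curve (e.g. sPRO vs. FPR)
# up to a maximum false positive rate, normalized by that limit.
import numpy as np


def auc_up_to_fpr(fprs, spros, max_fpr):
    """Trapezoidal area under spros(fprs) on [0, max_fpr], divided by max_fpr.

    Assumes `fprs` is sorted in ascending order and starts at 0.
    """
    fprs = np.asarray(fprs, dtype=float)
    spros = np.asarray(spros, dtype=float)

    # Keep only the part of the curve below the integration limit.
    keep = fprs <= max_fpr
    x, y = fprs[keep], spros[keep]

    # Add the point at exactly max_fpr by linear interpolation, if needed.
    if x[-1] < max_fpr:
        x = np.append(x, max_fpr)
        y = np.append(y, np.interp(max_fpr, fprs, spros))

    # Normalize so that an ideal curve (sPRO == 1 everywhere) scores 1.0.
    return np.trapz(y, x) / max_fpr
```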
If more than one experiment should be evaluated simultaneously, the script `evaluate_multiple_experiments.py` can be used.
The directories containing the anomaly maps should be specified in a `config.json` file with the following structure:
```
{
    "exp_base_dir": "/path/to/all/experiments/",
    "anomaly_maps_dirs": {
        "experiment_id_1": "eg/model_1/anomaly_maps/",
        "experiment_id_2": "eg/model_2/anomaly_maps/",
        "experiment_id_3": "eg/model_3/anomaly_maps/",
        "...": "..."
    }
}
```
- `exp_base_dir`: Base directory that contains all experimental results for each evaluated method.
- `anomaly_maps_dirs`: Dictionary that contains an identifier for each evaluated experiment and the location of its anomaly maps relative to the `exp_base_dir`.
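As an illustration of how such a config might be consumed (a hypothetical sketch; the internals of `evaluate_multiple_experiments.py` may differ):

```python
# Hypothetical sketch of reading config.json and resolving, for each
# experiment id, the absolute directory of its anomaly maps.
import json
import os


def resolve_experiment_dirs(config_path):
    """Map each experiment id to the absolute path of its anomaly maps."""
    with open(config_path) as f:
        config = json.load(f)

    base = config["exp_base_dir"]
    return {
        exp_id: os.path.join(base, rel_dir)
        for exp_id, rel_dir in config["anomaly_maps_dirs"].items()
    }
```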
The evaluation is run by calling `evaluate_multiple_experiments.py` with the following user arguments:

- `dataset_base_dir`: Base directory that contains the AutoVI dataset.
- `experiment_configs`: Path to the above `config.json` that contains all experiments to be evaluated.
- `output_dir`: Directory to store evaluation results as `.json` files.
A possible example call to this script would be:

```
python evaluate_multiple_experiments.py \
    --dataset_base_dir 'path/to/dataset/' \
    --experiment_configs 'configs.json' \
    --output_dir 'metrics/'
```
After running `evaluate_experiment.py` or `evaluate_multiple_experiments.py`, the script `print_metrics.py` can be used to visualize all computed metrics in a table.
In total, three tables are printed to the standard output. The first two tables display the performance for the structural and logical anomalies, respectively.
The third table shows the mean performance over both anomaly types.
The script requires the following user arguments:

- `metrics_folder`: The base directory that contains the computed metrics for each evaluated method. This directory is usually identical to the output directory specified in `evaluate_experiment.py` or `evaluate_multiple_experiments.py`.
- `metric_type`: Select either `localization` or `classification`. When selecting `localization`, the AUC-sPRO results for the pixelwise localization of anomalies are shown. When selecting `classification`, the image-level AUC-ROC for anomaly classification is shown.
- `integration_limit`: The integration limit up to which the area under the sPRO curve is computed. This parameter is only applicable when `metric_type` is set to `localization`.
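A possible call might look like the following, assuming the flag names mirror the argument names above and using an illustrative integration limit:

```
python print_metrics.py \
    --metrics_folder 'metrics/' \
    --metric_type localization \
    --integration_limit 0.05
```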
The license agreement for our evaluation code is found in the accompanying `LICENSE.txt` file.
The version of this evaluation script is: 3.0