Instructions for preparing the datasets

[TOC]

Download

Most of the datasets, including PACS, OfficeHome, Terra Incognita, and Camelyon17-WILDS, can be downloaded with this script. The other datasets are also publicly available but must be downloaded manually.

Directory structure

Make sure that the directory structure of each dataset is arranged as follows:

Datasets in DomainBed

PACS

PACS
├── art_painting
  ├── dog
  ├── elephant
  ├── ...
├── cartoon
├── photo
└── sketch
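
A layout such as the PACS tree above can be checked before training with a few lines of Python. The following is a minimal sketch, not part of this repository: the helper name `missing_subdirs` and the path `data/PACS` are assumptions for illustration, and only the four domain folder names come from the tree above.

```python
from pathlib import Path


def missing_subdirs(root, expected):
    """Return the names in `expected` that are not directories under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


# Domain folders taken from the PACS tree above.
PACS_DOMAINS = ["art_painting", "cartoon", "photo", "sketch"]

if __name__ == "__main__":
    # "data/PACS" is a placeholder path (an assumption); point it at your copy.
    missing = missing_subdirs("data/PACS", PACS_DOMAINS)
    print("missing domains:", missing or "none")
```

The same helper works for the other trees in this section by swapping in their top-level folder names.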

VLCS

VLCS
├── Caltech101
  ├── bird
  ├── car
  ├── ...
├── LabelMe
├── SUN09
└── VOC2007

OfficeHome

office_home
├── Art
  ├── Alarm_Clock
  ├── Backpack
  ├── ...
├── Clipart
├── Product
└── Real World

Terra Incognita

terra_incognita
├── location_38
  ├── bird
  ├── bobcat
  ├── ...
├── location_43
├── location_46
└── location_100

Camelyon17-WILDS

camelyon17_v1.0
├── patches
└── metadata.csv

DomainNet

domain_net
├── clipart
├── infograph
├── painting
├── quickdraw
├── real
└── sketch

COVID

domain_net
├── source
  ├── normal
  ├── pneumonia
└── target
  ├── normal
  ├── pneumonia
  ├── COVID19

DrugOOD_assay

Run To_image_assay.py with drugood_assay.txt in data to generate DrugOOD_assay:

DrugOOD_assay
├── domain01
  ├── inactive
  ├── active
├── ...
└── domain80

DrugOOD_scaffold

Run To_image_scaffold.py with drugood_assay.txt in data to generate DrugOOD_scaffold:

DrugOOD_scaffold
├── domain01
  ├── inactive
  ├── active
├── ...
└── domain12542

PACS_gaussion

Run add_gaussion.py in data to generate PACS_gaussion:

PACS_gaussion
├── art_painting
  ├── dog
  ├── elephant
  ├── ...
├── cartoon
├── photo
└── sketch

PACS_unseen

Run gradio_seg2imag_offline.py in data with ControlNet to generate PACS_unseen:

PACS_unseen
├── art_painting
  ├── dog
  ├── elephant
  ├── ...
├── cartoon
├── photo
└── sketch

Datasets in CLIP

ImageNet-V2

# download from https://huggingface.co/datasets/vaishaal/ImageNetV2/tree/main
imagenetv2-matched-frequency-format-val
├── 118
└── ...

ImageNet-R

# official website: https://github.com/hendrycks/imagenet-r
# download: https://people.eecs.berkeley.edu/~hendrycks/imagenet-r.tar
imagenet-r
├── n01616318
└── ...

ImageNet Sketch

# download from https://github.com/HaohanWang/ImageNet-Sketch
imagenet-sketch
├── n01498041
└── ...

ImageNet-A

# download from https://github.com/hendrycks/natural-adv-examples
imagenet-a
├── n01498041
└── ...
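
One quick sanity check for the four ImageNet variants above is to count the class folders in each. The public releases of ImageNet-V2 and ImageNet-Sketch cover all 1000 ImageNet classes, while ImageNet-R and ImageNet-A cover 200 classes each. The sketch below assumes all four folders sit under one root (here the placeholder `data`); the function names are hypothetical and not part of this repository.

```python
from pathlib import Path

# Expected class-folder counts for the public releases of each dataset.
EXPECTED_CLASSES = {
    "imagenetv2-matched-frequency-format-val": 1000,
    "imagenet-r": 200,
    "imagenet-sketch": 1000,
    "imagenet-a": 200,
}


def count_class_dirs(dataset_dir):
    """Count immediate subdirectories (one per class) of `dataset_dir`."""
    path = Path(dataset_dir)
    if not path.is_dir():
        return 0
    return sum(1 for child in path.iterdir() if child.is_dir())


def check_all(data_root):
    """Map each dataset name to (found, expected) class-folder counts."""
    return {
        name: (count_class_dirs(Path(data_root) / name), expected)
        for name, expected in EXPECTED_CLASSES.items()
    }


if __name__ == "__main__":
    # "data" is a placeholder root (an assumption); adjust to your layout.
    for name, (found, expected) in check_all("data").items():
        status = "ok" if found == expected else "MISMATCH"
        print(f"{name}: {found}/{expected} class folders [{status}]")
```

Only directories are counted, so stray files such as a README inside a dataset folder do not affect the result.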

ObjectNet


Todo

CelebA

celeba
├── img_align_celeba
└── blond_split
    ├── tr_env1_df.csv
    ├── tr_env2_df.csv
    └── te_env_df.csv

NICO

NICO
├── animal
├── vehicle
└── mixed_split_corrected
    ├── env_train1.csv
    ├── env_train2.csv
    ├── env_val.csv
    └── env_test.csv
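
For datasets whose environments are defined by split CSV files, such as CelebA and NICO above, it can help to confirm that every split file exists and is non-empty. The sketch below uses only the standard library and assumes each CSV starts with a header row. The column layout is not documented here, so only row counts are reported. The helper names and the placeholder path are hypothetical.

```python
import csv
from pathlib import Path


def count_rows(csv_path):
    """Return the number of data rows in a CSV file, excluding the header."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # assumption: the first row is a header
        return sum(1 for _ in reader)


def split_sizes(split_dir, filenames):
    """Map each split file to its row count, or None if the file is missing."""
    split_dir = Path(split_dir)
    return {
        name: count_rows(split_dir / name) if (split_dir / name).is_file() else None
        for name in filenames
    }


# File names taken from the NICO tree above.
NICO_SPLITS = ["env_train1.csv", "env_train2.csv", "env_val.csv", "env_test.csv"]

if __name__ == "__main__":
    # Placeholder path (an assumption); point it at your own copy of NICO.
    print(split_sizes("data/NICO/mixed_split_corrected", NICO_SPLITS))
```

For CelebA, the same call works with the three file names listed under blond_split.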

Reference:

OoD-Bench, DomainBed