DATASET_SETUP.md

Dataset Preparation

Data Structure

FOMO/
└── data/
    ├── OWOD/
    |   ├── JPEGImages/
    |   |   ├── SOWODB/
    |   |   └── MOWODB/
    |   ├── Annotations/
    |   |   ├── SOWODB/
    |   |   └── MOWODB/
    |   └── ImageSets/
    |       ├── SOWODB/
    |       └── MOWODB/
    └── RWD/
        ├── JPEGImages/
        |   ├── Aerial/
        |   ├── Aquatic/
        |   ├── Game/
        |   ├── Medical/
        |   └── Surgical/
        ├── Annotations/
        |   ├── Aerial/
        |   ├── Aquatic/
        |   ├── Game/
        |   ├── Medical/
        |   └── Surgical/
        └── ImageSets/
            ├── Aerial/
            ├── Aquatic/
            ├── Game/
            ├── Medical/
            └── Surgical/

Open World Object Detection Datasets

The splits are present inside the data/OWOD/ImageSets/MOWODB and data/OWOD/ImageSets/SOWODB folders.

  1. Download the COCO images and annotations from the COCO dataset website into the data/ directory.
  2. Unzip the train2017 and val2017 archives. The directory structure should now look like:
FOMO/
└── data/
    └── coco/
        ├── annotations/
        ├── train2017/
        └── val2017/
  3. Move all images from train2017/ and val2017/ to the JPEGImages folder.
  4. Use coco2voc.py to convert the JSON annotations to XML files.
  5. Download the PASCAL VOC 2007 and 2012 images and annotations from the PASCAL VOC dataset website into the data/ directory.
  6. Untar the trainval 2007, trainval 2012, and test 2007 archives.
  7. Move all the images to the JPEGImages folder and the annotations to the Annotations folder.
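
The JSON-to-XML conversion step can be illustrated with a minimal sketch. This is not the repository's coco2voc.py; it is a hypothetical stand-in that assumes the standard COCO annotation layout (`images`, `annotations`, `categories`) and emits only the core VOC fields. The function names `coco_box_to_voc`, `coco_to_voc_xml`, and `convert_file` are illustrative. The key point is that COCO stores boxes as `[x, y, width, height]`, while VOC stores the two corners.

```python
import json
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path


def coco_box_to_voc(bbox):
    """Convert a COCO [x, y, width, height] box to VOC (xmin, ymin, xmax, ymax)."""
    x, y, w, h = bbox
    return int(round(x)), int(round(y)), int(round(x + w)), int(round(y + h))


def coco_to_voc_xml(coco):
    """Return {image file name: VOC-style XML string} for a loaded COCO dict."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    per_image = defaultdict(list)
    for ann in coco["annotations"]:
        per_image[ann["image_id"]].append(ann)

    result = {}
    for img in coco["images"]:
        root = ET.Element("annotation")
        ET.SubElement(root, "filename").text = img["file_name"]
        size = ET.SubElement(root, "size")
        ET.SubElement(size, "width").text = str(img["width"])
        ET.SubElement(size, "height").text = str(img["height"])
        ET.SubElement(size, "depth").text = "3"  # assumption: RGB images
        for ann in per_image[img["id"]]:
            obj = ET.SubElement(root, "object")
            ET.SubElement(obj, "name").text = names[ann["category_id"]]
            ET.SubElement(obj, "difficult").text = "0"
            box = ET.SubElement(obj, "bndbox")
            corners = coco_box_to_voc(ann["bbox"])
            for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), corners):
                ET.SubElement(box, tag).text = str(value)
        result[img["file_name"]] = ET.tostring(root, encoding="unicode")
    return result


def convert_file(json_path, out_dir):
    """Write one XML file per image listed in a COCO JSON annotation file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with open(json_path) as handle:
        coco = json.load(handle)
    for file_name, xml in coco_to_voc_xml(coco).items():
        (out / Path(file_name).with_suffix(".xml").name).write_text(xml)
```

For the actual benchmark, prefer the provided coco2voc.py, since the data loader may expect fields or coordinate conventions that this sketch leaves out.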

NOTE: I created a single folder holding all the JPEG images and annotations for SOWODB, and a symbolic link to it for MOWODB. We follow the VOC format for data loading and evaluation.
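
The symbolic-link arrangement in the note can be sketched as follows. This is one possible way to do it, assuming the SOWODB folder already exists inside JPEGImages/ and Annotations/ and that MOWODB should be a relative link next to it. The helper name `link_benchmark` is hypothetical.

```python
import os
from pathlib import Path


def link_benchmark(parent, source="SOWODB", alias="MOWODB"):
    """Create parent/alias as a relative symlink to parent/source, if absent."""
    link = Path(parent) / alias
    if link.is_symlink() or link.exists():
        return link  # leave an existing link or folder untouched
    # A relative target keeps the link valid if the data/ tree is moved.
    os.symlink(source, link, target_is_directory=True)
    return link


# Example usage (paths taken from the data structure above):
# for sub in ("JPEGImages", "Annotations"):
#     link_benchmark(Path("data/OWOD") / sub)
```

On Windows, creating symlinks may require elevated privileges or developer mode, so copying the folder is a fallback there.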

Real World Object Detection Datasets

Download each dataset, place it in data/ROOT, and rename its directory to match the names used below:

FOMO/
└── data/
    ├── OWOD/
    ├── RWD/
    └── ROOT/
        ├── Aquatic/
        ├── Aerial/
        ├── Game/
        ├── Medical/
        └── Surgical/
  1. Aquatic: https://universe.roboflow.com/roboflow-100/aquarium-qlnqy

Download in COCO format.

  2. Aerial: https://gcheng-nwpu.github.io

Download the dataset from here.

  3. Game: https://universe.roboflow.com/roboflow-100/team-fight-tactics

Download in COCO format.

  4. Medical: https://universe.roboflow.com/roboflow-100/x-ray-rheumatology

Download in COCO format.

  5. Surgical: https://medicis.univ-rennes1.fr/software#neurosurgicaltools_dataset

Download the "NeuroSurgicalToolsDataset/NeuroSurgicalToolsDataset.zip".
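
Before running any setup step, it can help to confirm that all five folders are in place under data/ROOT with the expected names. The check below is a small sketch; the function name `missing_datasets` is hypothetical, and the folder names come from the tree above.

```python
from pathlib import Path

EXPECTED = ("Aquatic", "Aerial", "Game", "Medical", "Surgical")


def missing_datasets(root, expected=EXPECTED):
    """Return the expected dataset folder names that are absent under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


# Example usage:
# missing = missing_datasets("data/ROOT")
# if missing:
#     raise SystemExit(f"Missing dataset folders: {', '.join(missing)}")
```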

Setup

NOTE: I created a bash script, data/data_setup.sh, that sets up all the datasets once they have been downloaded into the correct folders in ROOT.

To convert the annotations of the Roboflow 100 datasets (Aquatic, Game, Medical):

python datasets/roboflow100_dataset_setup.py --dataset Aquatic
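
To run the same conversion for all three Roboflow 100 datasets in one go, a small driver along these lines may be convenient. It assumes the script accepts `Game` and `Medical` for `--dataset` in the same way as `Aquatic`, and that it is launched from the repository root. The helpers `build_commands` and `run_all` are hypothetical.

```python
import subprocess
import sys

SCRIPT = "datasets/roboflow100_dataset_setup.py"
ROBOFLOW_DATASETS = ("Aquatic", "Game", "Medical")


def build_commands(datasets=ROBOFLOW_DATASETS, script=SCRIPT):
    """Build one command line per dataset, using the current interpreter."""
    return [[sys.executable, script, "--dataset", name] for name in datasets]


def run_all(datasets=ROBOFLOW_DATASETS):
    """Run the conversions in order, stopping at the first failure."""
    for command in build_commands(datasets):
        subprocess.run(command, check=True)


# Example usage, from the repository root:
# run_all()
```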

For the Aerial and Surgical datasets:

Simply move the XML files to Annotations/Surgical and the PNG files to JPEGImages/Surgical.

For the Aerial dataset, use the Horizontal Bounding Boxes annotations.
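
The manual move for the Surgical dataset could be scripted roughly as follows. This sketch assumes the extracted archive sits under data/ROOT/Surgical and that file names are unique across its subfolders, since a duplicate name may overwrite an earlier file. The function name `sort_files` is hypothetical.

```python
import shutil
from pathlib import Path


def sort_files(source_dir, rwd_dir, name="Surgical"):
    """Move *.xml to Annotations/<name> and *.png to JPEGImages/<name>."""
    rwd = Path(rwd_dir)
    routes = {
        ".xml": rwd / "Annotations" / name,
        ".png": rwd / "JPEGImages" / name,
    }
    for dest in routes.values():
        dest.mkdir(parents=True, exist_ok=True)

    counts = {suffix: 0 for suffix in routes}
    # sorted() lists every path up front, so moving files does not disturb the walk.
    for path in sorted(Path(source_dir).rglob("*")):
        suffix = path.suffix.lower()
        if path.is_file() and suffix in routes:
            shutil.move(str(path), str(routes[suffix] / path.name))
            counts[suffix] += 1
    return counts


# Example usage:
# sort_files("data/ROOT/Surgical", "data/RWD")
```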