This repo mainly contains a Jupyter notebook that performs some basic Exploratory Data Analysis (EDA) on MS COCO. While EDA is widely used for categorical data, I haven't seen many repos doing EDA for object detection datasets, which is why I created this one.
- eda_coco_style.ipynb
- coco_train2017.html (EDA results on MS COCO 2017 - train)
- voc2012.html (EDA results on VOC 2012)
- Some basic high-level statistics
- Distribution of objects across images
- Class-wise distribution of objects
- Observing average bounding box sizes for each class
- Viewing random images
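
The statistics listed above can be sketched in a few lines of plain Python. This is a minimal illustration and not the notebook's actual code: the helper names `load_coco` and `summarize_coco` are hypothetical, and the only assumption about the input is the standard COCO layout (`images`, `annotations`, and `categories` lists, with `bbox` stored as `[x, y, width, height]`).

```python
import json
from collections import Counter, defaultdict


def load_coco(path):
    """Load a COCO-style annotation file (hypothetical helper)."""
    with open(path) as f:
        return json.load(f)


def summarize_coco(coco):
    """Compute basic EDA statistics from a COCO-style annotation dict."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}

    # Start every image at zero so images without objects are counted too.
    objects_per_image = {img["id"]: 0 for img in coco["images"]}
    class_counts = Counter()
    box_sums = defaultdict(lambda: [0.0, 0.0])  # name -> [sum of widths, sum of heights]

    for ann in coco["annotations"]:
        name = id_to_name[ann["category_id"]]
        _, _, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        objects_per_image[ann["image_id"]] = objects_per_image.get(ann["image_id"], 0) + 1
        class_counts[name] += 1
        box_sums[name][0] += w
        box_sums[name][1] += h

    # Average (width, height) of the bounding boxes of each class.
    avg_box_size = {
        name: (box_sums[name][0] / n, box_sums[name][1] / n)
        for name, n in class_counts.items()
    }

    return {
        "num_images": len(coco["images"]),
        "num_annotations": len(coco["annotations"]),
        "num_categories": len(coco["categories"]),
        "objects_per_image": objects_per_image,
        "class_counts": dict(class_counts),
        "avg_box_size": avg_box_size,
    }
```

Calling `summarize_coco(load_coco(path))` on an annotation file returns a dict whose `objects_per_image`, `class_counts`, and `avg_box_size` entries can be plotted directly as histograms or bar charts.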
I used this tool to convert PASCAL VOC XML-style datasets to MS COCO-style JSON files.
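
For readers unfamiliar with the two formats, the core of such a conversion looks roughly like the sketch below. It is not the implementation of the tool mentioned above: the function name `voc_xml_to_coco` is hypothetical, category ids are assigned in order of first appearance, and VOC corner coordinates are mapped to COCO's `[x, y, width, height]` without any pixel-offset correction, which real converters may handle differently.

```python
import xml.etree.ElementTree as ET


def voc_xml_to_coco(xml_strings):
    """Convert PASCAL VOC XML annotations (one string per image) to a COCO-style dict."""
    images, annotations = [], []
    categories = {}  # class name -> category id, in order of first appearance

    for image_id, xml_text in enumerate(xml_strings, start=1):
        root = ET.fromstring(xml_text)
        size = root.find("size")
        images.append({
            "id": image_id,
            "file_name": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
        })

        for obj in root.findall("object"):
            name = obj.findtext("name")
            category_id = categories.setdefault(name, len(categories) + 1)

            box = obj.find("bndbox")
            xmin, ymin, xmax, ymax = (
                float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
            )
            width, height = xmax - xmin, ymax - ymin

            annotations.append({
                "id": len(annotations) + 1,
                "image_id": image_id,
                "category_id": category_id,
                "bbox": [xmin, ymin, width, height],  # VOC corners -> COCO x, y, w, h
                "area": width * height,
                "iscrowd": 0,
            })

    return {
        "images": images,
        "annotations": annotations,
        "categories": [{"id": i, "name": n} for n, i in categories.items()],
    }
```

The returned dict can be written out with `json.dump` and then loaded like any other COCO-style annotation file.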
This tool is developed and maintained by Vikas Desai (me).