Custom dataset using custom_dataset_script.py #52
Custom detection dataset

As detection dataset formats are more varied, here we take the COCO dataset as an example. For other specific formats, you may use this as a template and create your own.

```
coco  # dataset base
├── images  # images
│   ├── train2017  # train images folder
│   │   ├── 100.jpg  # image
│   │   ├── 101.jpg  # image
│   │   └── 102.jpg  # image
│   ├── val2017  # val images folder
│   │   ├── 211.jpg
│   │   ├── 212.jpg
│   │   └── 213.jpg
...
├── labels  # labels
│   ├── train2017  # train labels folder
│   │   ├── 100.txt  # label + bbox info
│   │   ├── 101.txt  # label + bbox info
│   │   └── 102.txt  # label + bbox info
│   ├── val2017  # val labels folder
│   │   ├── 211.txt
│   │   └── 213.txt
```

Each label `.txt` file holds one object per line, in format `label center_x center_y width height`, with values normalized to `[0, 1]`:

```sh
! cat 100.txt
0 0.643867 0.404833 0.050891 0.117708
31 0.659750 0.451573 0.021875 0.056521
```
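For reference, such a label file parses into the `label` / `bbox` lists used in the json format below with a few lines of plain Python (a minimal sketch; `read_label_file` is an illustrative name, not part of the repo):

```py
import numpy as np

def read_label_file(label_path):
    """Parse a label .txt file: one object per line, `label cx cy w h`, normalized to [0, 1]."""
    labels, bboxes = [], []
    with open(label_path) as ff:
        for line in ff:
            values = line.split()
            if len(values) < 5:
                continue  # skip empty or malformed lines
            labels.append(int(values[0]))
            bboxes.append([float(ii) for ii in values[1:5]])
    return np.array(labels), np.array(bboxes)

labels, bboxes = read_label_file("../coco/labels/train2017/100.txt")
print(labels)  # -> [ 0 31] for the sample file shown above
```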
Another dataset structure uses COCO format annotation json files:

```
coco  # dataset base
├── images  # images
│   ├── train2017  # train images folder
│   │   ├── 100.jpg  # image
│   ├── val2017  # val images folder
│   │   ├── 211.jpg
...
└── annotations  # annotations
    ├── instances_train2017.json  # COCO format train annotations
    └── instances_val2017.json  # COCO format test annotations
```

Create dataset json file using:

```sh
# --bbox_source_format cxcywh means source bbox format `[center_x, center_y, width, height]`
python3 custom_dataset_script.py --train_images ../coco/images/train2017/ --train_labels ../coco/labels/train2017/ \
--test_images ../coco/images/val2017/ --test_labels ../coco/labels/val2017 \
--bbox_source_format cxcywh -s coco
```
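For reference, converting between the two bbox source formats is just a reordering of the four values (a minimal numpy sketch; the function name is illustrative, not part of custom_dataset_script.py):

```py
import numpy as np

def cxcywh_to_yxyx(bboxes):
    """Convert normalized [center_x, center_y, width, height] to [top, left, bottom, right]."""
    bboxes = np.asarray(bboxes, dtype="float32")
    cx, cy, ww, hh = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]
    return np.stack([cy - hh / 2, cx - ww / 2, cy + hh / 2, cx + ww / 2], axis=-1)

# cxcywh_to_yxyx([[0.643867, 0.404833, 0.050891, 0.117708]])  # values from 100.txt above
```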
Or using the default:

```sh
# Default --bbox_source_format is yxyx, meaning source bbox format `[top, left, bottom, right]`
python3 custom_dataset_script.py --train_images ../coco/images/train2017/ --train_labels ../coco/labels/train2017/ \
--test_split 0.1 -s dodo
```

Or providing the annotation json files directly:

```sh
python3 custom_dataset_script.py --train_images ../coco/images/train2017/ --train_labels ../coco/annotations/instances_train2017.json \
--test_images ../coco/images/val2017/ --test_labels ../coco/annotations/instances_val2017.json -s fofo
```

Then this json file can be used as training dataset:

```sh
CUDA_VISIBLE_DEVICES='1' python3 coco_train_script.py --data_name dodo.json
```

Required json format detail

It's a json file containing at least 2 keys:

```
{
"info": {'num_classes': 80, "base_path": "/datasets"}, # optional
"train": [
{"image": "/dataset/coco/images/train2017/100.jpg",
"objects": {
"label": [65, 65, 49],
"bbox":[[0.548703, 0.476851, 0.321469, 0.523592], [0.264453, 0.457306, 0.380063, 0.478647], [0.498773, 0.489612, 0.997547, 0.979224]]
}
},
{"image": "/dataset/coco/images/train2017/101.jpg",
"objects": {
"label": [0, 21],
"bbox":[[0.643867, 0.404833, 0.050891, 0.117708], [0.65975, 0.451573, 0.021875, 0.056521]]
}
},
],
"test": [
{"image": "/dataset/coco/images/val2017/211.jpg",
"objects": {
"label": [0, 27],
"bbox":[[0.580031, 0.355855, 0.2855, 0.682436], [0.408117, 0.646054, 0.379734, 0.172857]]
}
},
],
"indices_2_labels": {0: "cat", 1: "dog"}, # optional
}
```

Check dataset

```py
from keras_cv_attention_models.coco import data
# Setting `anchors_mode="anchor_free"` will just return the original bbox
tt = data.init_dataset('coco.json', batch_size=16, anchors_mode="anchor_free")[0]
indices_2_labels = None  # For labels different from COCO, specify a map dict like {0: "foo", 1: "goo"} for better display
ax = data.show_batch_sample(tt, anchors_mode="anchor_free", indices_2_labels=indices_2_labels)
```

Example Usage
Custom caption dataset

As caption dataset formats are more varied, here we take the flickr30k dataset as an example. For other specific formats, you may use this as a template and create your own.

```
flickr30k  # dataset base
├── flickr30k-images  # images
│   ├── 100.jpg  # image
│   ├── 101.jpg  # image
│   └── 102.jpg  # image
...
└── results_20130124.token  # caption table, or coco caption annotation json file
```

The caption table file contains the image name to caption mapping info. It could be a tsv or json file, or a COCO caption annotation json format file. A tsv format one could be with 2 columns:

```sh
$ head -n 2 flickr30k/results_20130124.token
1000092795.jpg#0 Two young guys with shaggy hair look at their hands while hanging out in the yard .
1000092795.jpg#1 Two young , White males are outside near many bushes .
```

Or a json file containing a list, where each element is a dict with keys `image` and `caption`:

```
[
{"image": "flickr30k/flickr30k-images/3391453209.jpg", "caption": "A woman in a black coat stands on a curb outside a market ."},
{"image": "flickr30k/flickr30k-images/44904567.jpg", "caption": "A man using an electric razor shaves someone 's head ."},
]
```

Create dataset json file by:

```sh
python3 custom_dataset_script.py --train_images flickr30k/flickr30k-images/ \
--train_captions flickr30k/results_20130124.token --test_split 0.1
```

Or provide standalone test images and caption annotation files:

```sh
python3 custom_dataset_script.py --train_images coco_dog_cat/train2017/images/ --test_images coco_dog_cat/val2017/images/ \
--train_captions annotations/captions_train2017.json --test_captions annotations/captions_val2017.json \
-s coco_captions
```

Target saving format can also be tsv by adding `--save_format tsv`:

```sh
python3 custom_dataset_script.py --train_images flickr30k/flickr30k-images/ \
--train_captions flickr30k/results_20130124.token --test_split 0.1 --save_format tsv
# >>>> total_train_samples: 143023, total_test_samples: 15892
# >>>> Saved to: flickr30k.tsv
```

Then this file can be used as training dataset:

```sh
CUDA_VISIBLE_DEVICES='1' python3 train_script.py --data_name flickr30k.tsv --model BeitBasePatch16 --text_model GPT2_Base \
--optimizer adam --disable_positional_related_ops --random_crop_min 1
```

Required json / tsv format detail

It's a json file containing at least 2 keys:

```
{
"info": {"base_path": "/datasets"}, # optional
"train": [
{"image": "flickr30k/flickr30k-images/2224450291.jpg", "caption": "The man is outdoors , holding a camera ."},
{"image": "flickr30k/flickr30k-images/3643175169.jpg", "caption": "A man stands on a ladder propped up against a brick building ."}
],
"test": [
{"image": "flickr30k/flickr30k-images/374538975.jpg", "caption": "A choir performing for an audience ."}
]
}
```

Or a tsv file, where an optional first `base_path` row sets the images base path, and a `TEST TEST` row separates the train and test splits:

```
base_path /datasets
flickr30k/flickr30k-images/5403974296.jpg A woman pulling a slingshot back .
flickr30k/flickr30k-images/1263801010.jpg A person in a red coat looking out at a snowy landscape .
TEST TEST
flickr30k/flickr30k-images/3192005501.jpg woman in the hospital sticking her tongue out
flickr30k/flickr30k-images/8189395281.jpg A woman wearing fishnet stockings is practicing her skating while her coach watches her
```
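Reading this tsv format back is straightforward; a minimal sketch (assuming tab-separated columns, the optional `base_path` header row, and the `TEST TEST` separator shown above; the function name is illustrative):

```py
def load_caption_tsv(tsv_path):
    """Split a caption tsv into train / test (image, caption) lists, honoring base_path."""
    base_path, splits, current = "", {"train": [], "test": []}, "train"
    with open(tsv_path) as ff:
        for line in ff:
            if "\t" not in line:
                continue  # skip blank or malformed rows
            image, caption = line.rstrip("\n").split("\t", 1)
            if image == "base_path":
                base_path = caption  # optional header row with the images base path
            elif image == "TEST" and caption == "TEST":
                current = "test"  # rows after the `TEST TEST` separator are the test split
            else:
                splits[current].append((base_path + "/" + image if base_path else image, caption))
    return splits["train"], splits["test"]

# train, test = load_caption_tsv("flickr30k.tsv")
```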
Check dataset

```py
from keras_cv_attention_models import clip

caption_tokenizer = clip.GPT2Tokenizer('gpt2')
tt = clip.init_dataset('flickr30k.tsv', batch_size=16, caption_tokenizer=caption_tokenizer)[0]
ax = clip.show_batch_sample(tt, caption_tokenizer=caption_tokenizer, rescale_mode='torch')
```
Custom recognition dataset
For a data folder in a format like the following, with one sub-folder per class (the tree below is illustrative; folder and class names are placeholders):
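```
goo  # dataset base, name is a placeholder
├── train  # train images folder
│   ├── cat  # class name folder
│   │   ├── 100.jpg
│   │   └── 101.jpg
│   └── dog
│       ├── 102.jpg
│       └── 103.jpg
└── test  # test images folder, same layout
    ├── cat
    │   └── 211.jpg
    └── dog
        └── 212.jpg
```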
Create dataset json file by:
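A sketch reusing the flags shown in the detection and caption examples above (`goo` is a placeholder name, matching the training command below):

```sh
python3 custom_dataset_script.py --train_images goo/train/ --test_images goo/test/ -s goo
```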
Or use `--test_split` for a dataset not having a standalone `test` folder, as in the sketch below:
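Again with placeholder names:

```sh
# --test_split 0.1 splits 10% out of train as the test set, as in the caption example above
python3 custom_dataset_script.py --train_images goo/train/ --test_split 0.1 -s goo
```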
Then this json file can be used as training dataset:

```sh
CUDA_VISIBLE_DEVICES='1' python3 train_script.py --data_name goo.json
```
Required json format detail

It's a json file containing at least 2 keys `['train', 'test']` or `['train', 'validation']`. Each of `'train'` / `'test'` / `'validation'` is a list of dicts, and each dict has 2 keys `'image'` and `'label'`. If both `'test'` and `'validation'` are provided, the `'validation'` one will be picked. Optional key `info` contains elements `"num_classes"` and `"base_path"`. `"base_path"` is the absolute path of `./`; you may change this value if the dataset is moved to a new path. `"num_classes"` is also optional; the max value from all labels will be used if not provided. `indices_2_labels` is used to map int indices to class names.

Check dataset
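By analogy with the detection and caption examples above, a quick visual check could look like the sketch below; it assumes `keras_cv_attention_models.imagenet.data` exposes matching `init_dataset` / `show_batch_sample` helpers.

```py
from keras_cv_attention_models.imagenet import data  # assumed module, by analogy with coco / clip above

# init_dataset returns a tuple; index [0] picks the train dataset, as in the detection example
tt = data.init_dataset('goo.json', batch_size=16)[0]
ax = data.show_batch_sample(tt)
```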