
Commit 98df627

Add level-up documentation - level 10 data generation
1 parent ef6ab36 commit 98df627

2 files changed

+43
-28
lines changed

docs/source/docs/level-up/intermediate_skills/10_data_generation.md

-28
This file was deleted.
@@ -0,0 +1,43 @@
===========================
Level 10: Data Generation
===========================

Pre-training deep learning models for vision tasks can increase model accuracy.
Training a model on a synthetic dataset is a well-known pre-training approach
because manual annotation is expensive.

Based on `previous research <https://arxiv.org/abs/2103.13023>`_,
Datumaro provides a fractal image dataset (FractalDB) generator that can be used to pre-train vision models.
Learning the visual features of FractalDB is known to improve the performance of Vision Transformer (ViT) models.
Note that the fractal patterns in FractalDB are computed mathematically using an iterated function system (IFS) with random parameters.
We therefore do not need to be concerned about privacy issues.
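
To make the IFS idea concrete, the following is a minimal sketch of how a fractal pattern can be drawn from random parameters using the "chaos game". It is not Datumaro's implementation and does not reproduce the parameter sampling of the FractalDB paper; the function names ``random_ifs`` and ``render_fractal``, the parameter ranges, and the burn-in length are all assumptions chosen for illustration.

```python
import random


def random_ifs(num_maps, rng):
    """Sample `num_maps` affine maps (a, b, c, d, e, f) with random parameters."""
    maps = []
    for _ in range(num_maps):
        # Assumption: drawing the linear part from [-0.4, 0.4] keeps its
        # Frobenius norm at or below 0.8, so every map is a contraction
        # and the iterated points stay bounded.
        a, b, c, d = (rng.uniform(-0.4, 0.4) for _ in range(4))
        e, f = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        maps.append((a, b, c, d, e, f))
    return maps


def render_fractal(maps, width, height, num_points, rng, burn_in=20):
    """Run the chaos game and rasterize the points into a height x width 0/1 grid."""
    x, y = 0.0, 0.0
    points = []
    for i in range(num_points + burn_in):
        # Pick one map at random and apply it to the current point.
        a, b, c, d, e, f = rng.choice(maps)
        x, y = a * x + b * y + e, c * x + d * y + f
        if i >= burn_in:  # skip the first iterations, before the orbit settles
            points.append((x, y))

    # Fit the bounding box of the points to the image.
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span_x = (max(xs) - min_x) or 1.0
    span_y = (max(ys) - min_y) or 1.0

    grid = [[0] * width for _ in range(height)]
    for px, py in points:
        col = min(int((px - min_x) / span_x * width), width - 1)
        row = min(int((py - min_y) / span_y * height), height - 1)
        grid[row][col] = 1
    return grid


rng = random.Random(0)  # fixed seed so the sketch is reproducible
image = render_fractal(random_ifs(4, rng), width=240, height=180, num_points=5000, rng=rng)
```

Because every pixel comes from a handful of random numbers rather than from a photograph, no real-world content enters the image, which is the reason privacy is not a concern.
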

.. tab-set::

    .. tab-item:: CLI

        We can generate synthetic images with the following CLI command:

        .. code-block:: bash

            datum generate -o <path/to/data> --count GEN_IMG_COUNT --shape GEN_IMG_SHAPE

        ``GEN_IMG_COUNT`` is an integer giving the number of images to generate (e.g. ``--count 300``).

        ``GEN_IMG_SHAPE`` is the shape (width height) of the generated images (e.g. ``--shape 240 180``).

    .. tab-item:: Python

        With the Python API, we can generate synthetic images as follows:

        .. code-block:: python

            from datumaro.plugins.synthetic_data import FractalImageGenerator

            # Replace the placeholders below with your own values.
            FractalImageGenerator(output_dir="<path/to/data>", count=GEN_IMG_COUNT, shape=GEN_IMG_SHAPE).generate_dataset()

        ``GEN_IMG_COUNT`` is an integer giving the number of images to generate (e.g. ``count=300``).

        ``GEN_IMG_SHAPE`` is a tuple giving the shape of the generated images as (width, height) (e.g. ``shape=(240, 180)``).

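
After generation, it can be useful to check that the output directory holds the expected number of images. The helper below is a minimal sketch that uses only the Python standard library. ``count_images`` and the suffix list are hypothetical, because the file format written by the generator is not stated in this document.

```python
from pathlib import Path

# Assumption: generated images use one of these common suffixes.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def count_images(output_dir):
    """Count image files under `output_dir`, including subdirectories."""
    root = Path(output_dir)
    return sum(
        1
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


# Example (not executed here): compare against the requested count.
# assert count_images("<path/to/data>") == GEN_IMG_COUNT
```
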
Congratulations! You have finished reading all of the Datumaro level-up documents for intermediate skills.
