|
| 1 | +=========================== |
| 2 | +Level 10: Data Generation |
| 3 | +=========================== |
| 4 | + |
| 5 | + |
| 6 | +Pre-training of deep learning models for vision tasks can increase model accuracy. |
| 7 | +Training a model on a synthetic dataset is a well-known pre-training approach,
| 8 | +since manual annotation is quite expensive.
| 9 | + |
| 10 | +Based on `previous research <https://arxiv.org/abs/2103.13023>`_,
| 11 | +Datumaro provides a fractal image dataset (FractalDB) generator that can be used to pre-train vision models.
| 12 | +Learning visual features of FractalDB is known to increase the performance of Vision Transformer (ViT) models.
| 13 | +Note that the fractal patterns in FractalDB are calculated mathematically using an iterated function system (IFS) with random parameters.
| 14 | +We thus do not need to be concerned about privacy issues.
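To make the IFS idea concrete, the following is a minimal sketch of how a fractal pattern can be drawn from randomly sampled affine maps (the "chaos game"). It is not Datumaro's implementation: the function names ``random_ifs`` and ``render_fractal``, the coefficient ranges, and the burn-in length are all assumptions chosen for illustration.

```python
import random


def random_ifs(num_maps, rng):
    """Sample affine maps (a, b, c, d, e, f) for x' = a*x + b*y + e, y' = c*x + d*y + f.

    Assumption: each linear coefficient is drawn from [-0.45, 0.45], so every
    row sum of absolute values is below 1 and each map is contractive.
    """
    maps = []
    for _ in range(num_maps):
        a, b, c, d = (rng.uniform(-0.45, 0.45) for _ in range(4))
        e, f = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        maps.append((a, b, c, d, e, f))
    return maps


def render_fractal(maps, width, height, num_points, rng, burn_in=20):
    """Run the chaos game and rasterize the visited points into a binary image."""
    x, y = 0.0, 0.0
    points = []
    for i in range(num_points):
        # Apply one randomly chosen affine map per step.
        a, b, c, d, e, f = rng.choice(maps)
        x, y = a * x + b * y + e, c * x + d * y + f
        if i >= burn_in:  # skip the first steps, before the orbit settles
            points.append((x, y))

    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span_x = (max(xs) - min_x) or 1.0
    span_y = (max(ys) - min_y) or 1.0

    # Scale the point cloud to the pixel grid and mark every visited pixel.
    image = [[0] * width for _ in range(height)]
    for px, py in points:
        col = min(width - 1, int((px - min_x) / span_x * width))
        row = min(height - 1, int((py - min_y) / span_y * height))
        image[row][col] = 1
    return image


rng = random.Random(0)  # fixed seed so the sketch is reproducible
maps = random_ifs(4, rng)
image = render_fractal(maps, width=24, height=18, num_points=2000, rng=rng)
```

Because every pixel value comes from random numbers and arithmetic alone, no real-world photographs or personal data are involved at any point.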
| 15 | + |
| 16 | + |
| 17 | +.. tab-set:: |
| 18 | + |
| 19 | + .. tab-item:: CLI |
| 20 | + |
| 21 | + We can generate the synthetic images with the following CLI command:
| 22 | + |
| 23 | + .. code-block:: bash |
| 24 | +
|
| 25 | + datum generate -o <path/to/data> --count GEN_IMG_COUNT --shape GEN_IMG_SHAPE |
| 26 | +
|
| 27 | + ``GEN_IMG_COUNT`` is an integer that indicates the number of images to generate (e.g. ``--count 300``).
| 28 | + ``GEN_IMG_SHAPE`` is the shape (width height) of the generated images (e.g. ``--shape 240 180``).
| 29 | + |
| 30 | + .. tab-item:: Python |
| 31 | + |
| 32 | + With the Python API, we can generate the synthetic images as follows:
| 33 | + |
| 34 | + .. code-block:: python |
| 35 | +
|
| 36 | + from datumaro.plugins.synthetic_data import FractalImageGenerator |
| 37 | +
|
| 38 | + FractalImageGenerator(output_dir="<path/to/data>", count=GEN_IMG_COUNT, shape=GEN_IMG_SHAPE).generate_dataset()
| 39 | +
|
| 40 | + ``GEN_IMG_COUNT`` is an integer that indicates the number of images to generate (e.g. ``count=300``).
| 41 | + ``GEN_IMG_SHAPE`` is a tuple representing the shape of the generated images as (width, height) (e.g. ``shape=(240, 180)``).
| 42 | + |
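When the CLI is driven from a script, it can help to assemble the ``datum generate`` arguments programmatically. The sketch below uses only the flags shown above (``-o``, ``--count``, ``--shape``); the helper name ``build_generate_command`` and the output path ``fractal_data`` are hypothetical and not part of Datumaro.

```python
import shlex


def build_generate_command(output_dir, count, shape):
    """Build the argument list for ``datum generate`` (hypothetical helper)."""
    width, height = shape
    if count <= 0 or width <= 0 or height <= 0:
        raise ValueError("count, width and height must be positive integers")
    return [
        "datum", "generate",
        "-o", str(output_dir),
        "--count", str(count),
        # --shape takes width and height as two separate values
        "--shape", str(width), str(height),
    ]


command = build_generate_command("fractal_data", count=300, shape=(240, 180))
print(shlex.join(command))
# → datum generate -o fractal_data --count 300 --shape 240 180
```

The resulting list can be passed to ``subprocess.run(command, check=True)`` on a machine where Datumaro is installed.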
| 43 | +Congratulations! You have finished reading all the Datumaro level-up documents for the intermediate skills.