[Doc] Update transforms Doc #2088

Merged on Nov 2, 2022 (2 commits)
125 changes: 104 additions & 21 deletions docs/en/advanced_guides/transforms.md
# Data Transforms

In this tutorial, we introduce the design of the transforms pipeline in MMSegmentation.

The structure of this guide is as follows:

- [Data Transforms](#data-transforms)
- [Design of Data pipelines](#design-of-data-pipelines)
- [Customization data transformation](#customization-data-transformation)

## Design of Data pipelines

Following typical conventions, we use `Dataset` and `DataLoader` for data loading with multiple workers. Since the images in semantic segmentation may not be the same size,
we introduce a new `DataContainer` type in MMCV to help collect and distribute
data of different sizes.
See [here](https://github.com/open-mmlab/mmcv/blob/master/mmcv/parallel/data_container.py) for more details.

In the 1.x version of MMSegmentation, all data transformations inherit from `BaseTransform`.
The input and output types of a transformation are both dict. A simple example is as follows:

```python
>>> from mmseg.datasets.transforms import LoadAnnotations
>>> transforms = LoadAnnotations()
>>> img_path = './data/cityscapes/leftImg8bit/train/aachen/aachen_000000_000019_leftImg8bit.png'
>>> gt_path = './data/cityscapes/gtFine/train/aachen/aachen_000015_000019_gtFine_instanceTrainIds.png'
>>> results = dict(
...     img_path=img_path,
...     seg_map_path=gt_path,
...     reduce_zero_label=False,
...     seg_fields=[])
>>> data_dict = transforms(results)
>>> print(data_dict.keys())
dict_keys(['img_path', 'seg_map_path', 'reduce_zero_label', 'seg_fields', 'gt_seg_map'])
```

The data preparation pipeline and the dataset are decomposed. Usually, a dataset
defines how to process the annotations, and a data pipeline defines all the steps to prepare a data dict.
A pipeline consists of a sequence of operations. Each operation takes a dict as input and also outputs a dict for the next transform.
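The dict-in, dict-out chaining described above can be sketched as follows. `SimpleCompose` and the two toy transforms are hypothetical stand-ins for illustration, not MMSegmentation's actual classes:

```python
import numpy as np

class AddImage:
    """Toy stand-in for a loading transform: puts an image into the dict."""
    def __call__(self, results: dict) -> dict:
        results['img'] = np.zeros((4, 4, 3))
        return results

class RecordShape:
    """Toy stand-in for a later transform: reads what the previous step added."""
    def __call__(self, results: dict) -> dict:
        results['img_shape'] = results['img'].shape[:2]
        return results

class SimpleCompose:
    """Minimal pipeline: feed each transform's output dict to the next."""
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, results: dict) -> dict:
        for t in self.transforms:
            results = t(results)
        return results

pipeline = SimpleCompose([AddImage(), RecordShape()])
print(pipeline({'img_path': 'demo.png'})['img_shape'])  # (4, 4)
```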

For each operation, we list the related dict fields that are `added`/`updated`/`removed`.
Before pipelines, the information we can directly obtain from the datasets are `img_path` and `seg_map_path`.

### Data loading

`LoadImageFromFile`: Load an image from file.

- add: `img`, `img_shape`, `ori_shape`

`LoadAnnotations`: Load semantic segmentation maps provided by dataset.

- add: `seg_fields`, `gt_seg_map`

### Pre-processing

`RandomResize`: Random resize image & segmentation map.

- add: `scale`, `scale_factor`, `keep_ratio`
- update: `img`, `img_shape`, `gt_seg_map`

`Resize`: Resize image & segmentation map.

- add: `scale`, `scale_factor`, `keep_ratio`
- update: `img`, `gt_seg_map`, `img_shape`
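The role of `keep_ratio` in producing `scale` and `scale_factor` can be sketched with plain Python. This is a simplified illustration under stated assumptions, not the actual `Resize` implementation (`rescale_size` is a hypothetical helper name):

```python
def rescale_size(old_size, scale, keep_ratio=True):
    """Compute the resize target (w, h) and the resulting scale factor.

    With keep_ratio=True, the smaller of the two axis ratios is used so
    the image fits inside `scale` without distortion; otherwise the
    target size is taken as-is.
    """
    w, h = old_size
    target_w, target_h = scale
    if keep_ratio:
        factor = min(target_w / w, target_h / h)
        new_size = (int(w * factor + 0.5), int(h * factor + 0.5))
    else:
        new_size = (target_w, target_h)
    scale_factor = (new_size[0] / w, new_size[1] / h)
    return new_size, scale_factor

size, factor = rescale_size((1024, 512), (512, 512), keep_ratio=True)
print(size, factor)  # (512, 256) (0.5, 0.5)
```

Both `img` and `gt_seg_map` are resized with the same factor, which is why the update list above touches them together.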

`RandomCrop`: Random crop image & segmentation map.

- update: `img`, `gt_seg_map`, `img_shape`
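The essential point of random cropping is that image and segmentation map are cut with the same window; a NumPy sketch (a simplified stand-in, not the actual `RandomCrop` implementation):

```python
import numpy as np

def random_crop(results: dict, crop_size, rng=np.random):
    """Crop img and gt_seg_map with one randomly chosen window."""
    h, w = results['img'].shape[:2]
    ch, cw = crop_size
    y = rng.randint(0, h - ch + 1)
    x = rng.randint(0, w - cw + 1)
    # Apply the identical slice to image and annotation.
    results['img'] = results['img'][y:y + ch, x:x + cw]
    results['gt_seg_map'] = results['gt_seg_map'][y:y + ch, x:x + cw]
    results['img_shape'] = results['img'].shape[:2]
    return results

results = dict(img=np.zeros((8, 8, 3)), gt_seg_map=np.zeros((8, 8)))
out = random_crop(results, (4, 4))
print(out['img'].shape, out['gt_seg_map'].shape)  # (4, 4, 3) (4, 4)
```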

`RandomFlip`: Flip the image & segmentation map.

- add: `flip`, `flip_direction`
- update: `img`, `gt_seg_map`
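The key behavior is that once the flip is sampled, image and segmentation map are flipped together and the decision is recorded in the dict. A NumPy sketch under that assumption (`maybe_flip` is a hypothetical name, not the actual implementation):

```python
import numpy as np

def maybe_flip(results: dict, prob=0.5, direction='horizontal', rng=np.random):
    """Sample a flip once, record it, and apply it to img and gt_seg_map."""
    flip = rng.rand() < prob
    results['flip'] = flip
    results['flip_direction'] = direction if flip else None
    if flip:
        axis = 1 if direction == 'horizontal' else 0
        results['img'] = np.flip(results['img'], axis=axis)
        results['gt_seg_map'] = np.flip(results['gt_seg_map'], axis=axis)
    return results
```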

`PhotoMetricDistortion`: Apply photometric distortions to the image sequentially;
every transformation is applied with a probability of 0.5.
Random contrast is applied either second or second to last (mode 0 or 1 below, respectively).

```
1. random brightness
2. random contrast (mode 0)
3. convert color from BGR to HSV
4. random saturation
5. random hue
6. convert color from HSV to BGR
7. random contrast (mode 1)
```

- update: `img`
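The control flow above, with each step applied with probability 0.5 and contrast placed either second or second to last, can be sketched as follows. The per-step operations are simplified stand-ins (color conversion, saturation, and hue are omitted), only the ordering logic mirrors the description:

```python
import numpy as np

def photometric_distortion(img, rng=np.random):
    def maybe(fn, x):
        # Each transformation fires with probability 0.5.
        return fn(x) if rng.rand() < 0.5 else x

    def contrast(x):
        return x * rng.uniform(0.5, 1.5)

    img = maybe(lambda x: x + rng.uniform(-32, 32), img)  # 1. random brightness
    mode = rng.randint(2)
    if mode == 0:
        img = maybe(contrast, img)  # 2. random contrast (second)
    # (color conversion, saturation, and hue are omitted in this sketch)
    if mode == 1:
        img = maybe(contrast, img)  # 7. random contrast (second to last)
    return img

out = photometric_distortion(np.ones((2, 2, 3), dtype=np.float32))
print(out.shape)  # (2, 2, 3)
```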

### Formatting

`PackSegInputs`: Pack the input data for semantic segmentation.

- add: `inputs`, `data_sample`
- remove: keys specified by `meta_keys` (merged into the metainfo of `data_sample`), and all other keys
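The packing step can be sketched with plain dicts. In reality the transform builds a `SegDataSample`; here it is replaced by a nested dict, and `pack_seg_inputs` is a hypothetical helper name:

```python
import numpy as np

def pack_seg_inputs(results: dict, meta_keys=('img_path', 'ori_shape', 'img_shape')):
    """Split a results dict into model inputs and a data sample.

    Keys listed in meta_keys are moved into the metainfo; the image
    becomes the model input; the rest of the dict is dropped.
    """
    metainfo = {k: results[k] for k in meta_keys if k in results}
    data_sample = dict(gt_sem_seg=results.get('gt_seg_map'), metainfo=metainfo)
    return dict(inputs=results['img'], data_samples=data_sample)

results = dict(img=np.zeros((4, 4, 3)), gt_seg_map=np.zeros((4, 4)),
               img_path='demo.png', img_shape=(4, 4), ori_shape=(4, 4))
packed = pack_seg_inputs(results)
print(sorted(packed.keys()))  # ['data_samples', 'inputs']
```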

## Customization data transformation

A customized data transformation must inherit from `BaseTransform` and implement the `transform` function.
Here we use a simple flipping transformation as an example:

```python
import mmcv
from mmcv.transforms import BaseTransform, TRANSFORMS

@TRANSFORMS.register_module()
class MyFlip(BaseTransform):
def __init__(self, direction: str):
super().__init__()
self.direction = direction

def transform(self, results: dict) -> dict:
img = results['img']
results['img'] = mmcv.imflip(img, direction=self.direction)
return results
```

Thus, we can instantiate a `MyFlip` object and use it to process the data dict.

```python
import numpy as np

transform = MyFlip(direction='horizontal')
data_dict = {'img': np.random.rand(224, 224, 3)}
data_dict = transform(data_dict)
processed_img = data_dict['img']
```

Alternatively, we can use the `MyFlip` transformation in the data pipeline in our config file.

```python
pipeline = [
...
dict(type='MyFlip', direction='horizontal'),
...
]
```

Note that if you want to use `MyFlip` in a config file, you must ensure the file containing `MyFlip` is imported at runtime.
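One common way to ensure the import happens is the `custom_imports` field supported by MMEngine-style configs; the module path below is a placeholder for wherever you define `MyFlip`:

```python
# In the config file; 'path.to.my_flip' is a placeholder module path.
custom_imports = dict(imports=['path.to.my_flip'], allow_failed_imports=False)
```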