[Refactor] Refine documentation (#1993)

* [WIP] Refine documentation
* get started done
* config refine
* train_test
* refine user guides
* add contribution
* add contribution
* refine visualization
* advanced tutorial
* advanced guides
* tricks
* refine zh doc
* refactor changelog
open-mmlab · Aug 31, 2022 · 309528d
1 parent 8ba8017
commit 309528d
Showing 46 changed files with 1,399 additions and 1,404 deletions.
diff --git a/docs/en/advanced_guides/add_dataset.md b/docs/en/advanced_guides/add_dataset.md
@@ -1,83 +1,4 @@
-# Tutorial 2: Customize Datasets
-
-## Data configuration
-
-`data` in config file is the variable for data configuration, to define the arguments that are used in datasets and dataloaders.
-
-Here is an example of data configuration:
-
-```python
-data = dict(
-    samples_per_gpu=4,
-    workers_per_gpu=4,
-    train=dict(
-        type='ADE20KDataset',
-        data_root='data/ade/ADEChallengeData2016',
-        img_dir='images/training',
-        ann_dir='annotations/training',
-        pipeline=train_pipeline),
-    val=dict(
-        type='ADE20KDataset',
-        data_root='data/ade/ADEChallengeData2016',
-        img_dir='images/validation',
-        ann_dir='annotations/validation',
-        pipeline=test_pipeline),
-    test=dict(
-        type='ADE20KDataset',
-        data_root='data/ade/ADEChallengeData2016',
-        img_dir='images/validation',
-        ann_dir='annotations/validation',
-        pipeline=test_pipeline))
-```
-
-- `train`, `val` and `test`: The [`config`](https://github.com/open-mmlab/mmcv/blob/master/docs/en/understand_mmcv/config.md)s to build dataset instances for model training, validation and testing by
-  using [`build and registry`](https://github.com/open-mmlab/mmcv/blob/master/docs/en/understand_mmcv/registry.md) mechanism.
-
-- `samples_per_gpu`: How many samples per batch and per GPU to load during model training. The training `batch_size` equals `samples_per_gpu` times the number of GPUs, e.g. when using 8 GPUs for distributed data parallel training with `samples_per_gpu=4`, the `batch_size` is `8*4=32`.
-  If you would like to define `batch_size` for testing and validation, please use `test_dataloader` and
-  `val_dataloader` with mmseg >=0.24.1.
-
-- `workers_per_gpu`: How many subprocesses per gpu to use for data loading. `0` means that the data will be loaded in the main process.
-
-**Note:** `samples_per_gpu` only works for model training. During model testing and validation, the default `samples_per_gpu` in mmseg is 1 (batch inference is not supported yet).
-
-**Note:** before v0.24.1, apart from `train`, `val`, `test`, `samples_per_gpu` and `workers_per_gpu`, the other keys in `data` must be
-input keyword arguments for `dataloader` in PyTorch, and the dataloaders used for model training, validation and testing share the same input arguments.
-Since v0.24.1, mmseg supports `train_dataloader`, `test_dataloader` and `val_dataloader` to specify different keyword arguments. The overall argument definition is still supported, but the specific dataloader settings have higher priority.
-
-Here is an example for specific dataloader:
-
-```python
-data = dict(
-    samples_per_gpu=4,
-    workers_per_gpu=4,
-    shuffle=True,
-    train=dict(type='xxx', ...),
-    val=dict(type='xxx', ...),
-    test=dict(type='xxx', ...),
-    # Use different batch size during validation and testing.
-    val_dataloader=dict(samples_per_gpu=1, workers_per_gpu=4, shuffle=False),
-    test_dataloader=dict(samples_per_gpu=1, workers_per_gpu=4, shuffle=False))
-```
-
-Assuming only one GPU is used for model training and testing: because the overall argument definition has lower priority, the batch_size
-for training is `4` and the dataset is shuffled, while the batch_size for testing and validation is `1` and the dataset is not shuffled.
-
-To make the data configuration clearer, we recommend using the specific dataloader settings instead of the overall dataloader settings after v0.24.1, like this:
-
-```python
-data = dict(
-    train=dict(type='xxx', ...),
-    val=dict(type='xxx', ...),
-    test=dict(type='xxx', ...),
-    # Use specific dataloader setting
-    train_dataloader=dict(samples_per_gpu=4, workers_per_gpu=4, shuffle=True),
-    val_dataloader=dict(samples_per_gpu=1, workers_per_gpu=4, shuffle=False),
-    test_dataloader=dict(samples_per_gpu=1, workers_per_gpu=4, shuffle=False))
-```
-
-**Note:** in model training, the default dataloader values in the mmseg scripts are `shuffle=True` and `drop_last=True`;
-in model validation and testing, the default values are `shuffle=False` and `drop_last=False`.
+# Add New Datasets
 
 ## Customize datasets by reorganizing data
 
@@ -150,65 +71,15 @@ dataset_A_train = dict(
 
 ### Concatenate dataset
 
-There are 2 ways to concatenate the dataset.
-
-1. If the datasets you want to concatenate are of the same type but have different annotation files,
-   you can concatenate the dataset configs like the following.
-
-   1. You may concatenate two `ann_dir`.
-
-      ```python
-      dataset_A_train = dict(
-          type='Dataset_A',
-          img_dir = 'img_dir',
-          ann_dir = ['anno_dir_1', 'anno_dir_2'],
-          pipeline=train_pipeline
-      )
-      ```
-
-   2. You may concatenate two `split`.
-
-      ```python
-      dataset_A_train = dict(
-          type='Dataset_A',
-          img_dir = 'img_dir',
-          ann_dir = 'anno_dir',
-          split = ['split_1.txt', 'split_2.txt'],
-          pipeline=train_pipeline
-      )
-      ```
-
-   3. You may concatenate two `ann_dir` and `split` simultaneously.
+If the datasets you want to concatenate are of different types, you can concatenate the dataset configs as follows.
 
-      ```python
-      dataset_A_train = dict(
-          type='Dataset_A',
-          img_dir = 'img_dir',
-          ann_dir = ['anno_dir_1', 'anno_dir_2'],
-          split = ['split_1.txt', 'split_2.txt'],
-          pipeline=train_pipeline
-      )
-      ```
-
-      In this case, `anno_dir_1` and `anno_dir_2` correspond to `split_1.txt` and `split_2.txt`, respectively.
-
-2. In case the dataset you want to concatenate is different, you can concatenate the dataset configs like the following.
-
-   ```python
-   dataset_A_train = dict()
-   dataset_B_train = dict()
-
-   data = dict(
-       imgs_per_gpu=2,
-       workers_per_gpu=2,
-       train = [
-           dataset_A_train,
-           dataset_B_train
-       ],
-       val = dataset_A_val,
-       test = dataset_A_test
-       )
-   ```
+```python
+dataset_A_train = dict()
+dataset_B_train = dict()
+concatenate_dataset = dict(
+    type='ConcatDataset',
+    datasets=[dataset_A_train, dataset_B_train])
+```
 
 A more complex example, which repeats `Dataset_A` and `Dataset_B` N and M times, respectively, and then concatenates the repeated datasets, is as follows.
 
@@ -239,19 +110,16 @@ dataset_B_train = dict(
         pipeline=train_pipeline
     )
 )
-data = dict(
-    imgs_per_gpu=2,
-    workers_per_gpu=2,
-    train = [
-        dataset_A_train,
-        dataset_B_train
-    ],
-    val = dataset_A_val,
-    test = dataset_A_test
-)
+train_dataloader = dict(
+    dataset=dict(type='ConcatDataset', datasets=[dataset_A_train, dataset_B_train]))
+
+val_dataloader = dict(dataset=dataset_A_val)
+test_dataloader = dict(dataset=dataset_A_test)
 
 ```
 
+You can refer to the base dataset [tutorial](TODO) from mmengine for more details.
+
 ### Multi-image Mix Dataset
 
 We use `MultiImageMixDataset` as a wrapper to mix images from multiple datasets.
@@ -265,9 +133,7 @@ train_pipeline = [
     dict(type='RandomMosaic', prob=1),
     dict(type='Resize', img_scale=(1024, 512), keep_ratio=True),
     dict(type='RandomFlip', prob=0.5),
-    dict(type='Normalize', **img_norm_cfg),
-    dict(type='DefaultFormatBundle'),
-    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
+    dict(type='PackSegInputs')
 ]
 
 train_dataset = dict(

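Review note on `add_dataset.md`: the new text describes repeating `Dataset_A` and `Dataset_B` and then concatenating them, but it does not say what the combined dataset looks like. The sketch below illustrates the intended semantics with toy stand-ins. `ToyRepeat` and `ToyConcat` are hypothetical names written for this note; they are not mmengine's wrapper classes and only assume that a dataset supports `len()` and integer indexing.

```python
# Toy stand-ins for illustration only; these are NOT mmengine's classes.
class ToyRepeat:
    """Repeats an underlying dataset `times` times."""

    def __init__(self, dataset, times):
        self.dataset = dataset
        self.times = times

    def __len__(self):
        return len(self.dataset) * self.times

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Wrap around the underlying dataset.
        return self.dataset[idx % len(self.dataset)]


class ToyConcat:
    """Chains several datasets end to end."""

    def __init__(self, datasets):
        self.datasets = list(datasets)

    def __len__(self):
        return sum(len(d) for d in self.datasets)

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Walk the datasets until the index falls inside one of them.
        for d in self.datasets:
            if idx < len(d):
                return d[idx]
            idx -= len(d)


dataset_a = ['a0', 'a1', 'a2']  # stands in for Dataset_A
dataset_b = ['b0', 'b1']        # stands in for Dataset_B

# Repeat A twice (N=2) and B three times (M=3), then concatenate.
mixed = ToyConcat([ToyRepeat(dataset_a, times=2),
                   ToyRepeat(dataset_b, times=3)])
samples = [mixed[i] for i in range(len(mixed))]
```

Under these assumptions the combined length is `N * len(A) + M * len(B)`, which is why repeating a small dataset is a simple way to rebalance it against a larger one.
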
diff --git a/docs/en/advanced_guides/add_modules.md b/docs/en/advanced_guides/add_modules.md
@@ -1,9 +1,9 @@
-# Tutorial 4: Customize Models
+# Add New Modules
 
 ## Customize optimizer
 
 Assume you want to add an optimizer named `MyOptimizer`, which has arguments `a`, `b`, and `c`.
-You need to first implement the new optimizer in a file, e.g., in `mmseg/core/optimizer/my_optimizer.py`:
+You need to first implement the new optimizer in a file, e.g., in `mmseg/engine/optimizers/my_optimizer.py`:
 
 ```python
 from mmcv.runner import OPTIMIZERS
@@ -17,7 +17,7 @@ class MyOptimizer(Optimizer):
 
 ```
 
-Then add this module in `mmseg/core/optimizer/__init__.py` thus the registry will
+Then add this module in `mmseg/engine/optimizers/__init__.py` thus the registry will
 find the new module and add it:
 
 ```python
@@ -51,14 +51,12 @@ The users can directly set arguments following the [API doc](https://pytorch.org
 Some models may have parameter-specific settings for optimization, e.g. weight decay for BatchNorm layers.
 Users can do such fine-grained parameter tuning by customizing the optimizer constructor.
 
-```
-from mmcv.utils import build_from_cfg
-
-from mmcv.runner import OPTIMIZER_BUILDERS
+```python
+from mmseg.registry import OPTIM_WRAPPER_CONSTRUCTORS
 from .cocktail_optimizer import CocktailOptimizer
 
 
-@OPTIMIZER_BUILDERS.register_module
+@OPTIM_WRAPPER_CONSTRUCTORS.register_module()
 class CocktailOptimizerConstructor(object):
 
     def __init__(self, optim_wrapper_cfg, paramwise_cfg=None):
@@ -85,10 +83,10 @@ Here we show how to develop new components with an example of MobileNet.
 ```python
 import torch.nn as nn
 
-from ..registry import BACKBONES
+from mmseg.registry import MODELS
 
 
-@BACKBONES.register_module
+@MODELS.register_module()
 class MobileNet(nn.Module):
 
     def __init__(self, arg1, arg2):
@@ -121,7 +119,7 @@ model = dict(
 
 ### Add new heads
 
-In MMSegmentation, we provide a base [BaseDecodeHead](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/decode_heads/decode_head.py) for all segmentation heads.
+In MMSegmentation, we provide a base [BaseDecodeHead](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/mmseg/models/decode_heads/decode_head.py) for all segmentation heads.
 All newly implemented decode heads should be derived from it.
 Here we show how to develop a new head with the example of [PSPNet](https://arxiv.org/abs/1612.01105) as the following.
 
@@ -130,7 +128,9 @@ PSPNet implements a decode head for segmentation decode.
 To implement a decode head, basically we need to implement three functions of the new module as the following.
 
 ```python
-@HEADS.register_module()
+from mmseg.registry import MODELS
+
+@MODELS.register_module()
 class PSPHead(BaseDecodeHead):
 
     def __init__(self, pool_scales=(1, 2, 3, 6), **kwargs):
@@ -187,7 +187,7 @@ The decorator `weighted_loss` enable the loss to be weighted for each element.
 import torch
 import torch.nn as nn
 
-from ..builder import LOSSES
+from mmseg.registry import MODELS
 from .utils import weighted_loss
 
 @weighted_loss

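Review note on `add_modules.md`: the hunks above replace the separate `BACKBONES`, `HEADS` and `LOSSES` registries with a single `MODELS` registry. For readers unfamiliar with the pattern, here is a minimal sketch of how a registry decorator and a `type`-keyed config can work together. `ToyRegistry`, `TOY_MODELS` and `ToyBackbone` are hypothetical names for this note, not the mmengine implementation.

```python
class ToyRegistry:
    """Hypothetical minimal registry; NOT mmengine's implementation."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self, name=None):
        # Called WITH parentheses, so it returns the actual decorator.
        def _register(cls):
            key = name or cls.__name__
            if key in self._modules:
                raise KeyError(f'{key} is already registered in {self.name}')
            self._modules[key] = cls
            return cls
        return _register

    def build(self, cfg):
        # `type` selects the class; the remaining keys become kwargs.
        args = dict(cfg)
        cls = self._modules[args.pop('type')]
        return cls(**args)


TOY_MODELS = ToyRegistry('toy_models')


@TOY_MODELS.register_module()
class ToyBackbone:
    def __init__(self, arg1, arg2):
        self.arg1 = arg1
        self.arg2 = arg2


# A config dict like the ones in the docs: `type` plus constructor arguments.
backbone = TOY_MODELS.build(dict(type='ToyBackbone', arg1=1, arg2=2))
```

In this sketch `register_module` is a decorator factory, so it has to be written with parentheses; used bare, the class itself would be passed in as `name`.
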
diff --git a/docs/en/advanced_guides/add_transform.md b/docs/en/advanced_guides/add_transform.md
@@ -6,7 +6,7 @@
    from mmseg.datasets import TRANSFORMS
    @TRANSFORMS.register_module()
    class MyTransform:
-       def __call__(self, results):
+       def transform(self, results):
            results['dummy'] = True
            return results
    ```

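Review note on `add_transform.md`: renaming `__call__` to `transform` only works if something still makes the object callable inside a pipeline. The sketch below assumes a base class whose `__call__` forwards to `transform`. `ToyBaseTransform` and `run_pipeline` are hypothetical helpers for this note; the snippet in the hunk does not show a base class, so treat the inheritance as an assumption.

```python
class ToyBaseTransform:
    """Hypothetical base class: __call__ delegates to transform()."""

    def __call__(self, results):
        return self.transform(results)

    def transform(self, results):
        raise NotImplementedError


class MyTransform(ToyBaseTransform):
    # Subclasses override transform() instead of __call__().
    def transform(self, results):
        results['dummy'] = True
        return results


def run_pipeline(transforms, results):
    """Apply each transform in order; stop early if one returns None."""
    for t in transforms:
        results = t(results)
        if results is None:
            break
    return results


out = run_pipeline([MyTransform()], {'img_path': 'demo.png'})
```

Keeping `__call__` in the base class leaves a single place for shared behaviour around every transform, while subclasses only implement the data manipulation itself.
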
diff --git a/docs/en/advanced_guides/customize_runtime.md b/docs/en/advanced_guides/customize_runtime.md
@@ -1,4 +1,4 @@
-# Tutorial 6: Customize Runtime Settings
+# Customize Runtime Settings
 
 ## Customize optimization settings
 

diff --git a/docs/en/advanced_guides/training_tricks.md b/docs/en/advanced_guides/training_tricks.md
@@ -1,4 +1,4 @@
-# Tutorial 5: Training Tricks
+# Training Tricks
 
 MMSegmentation supports the following training tricks out of the box.
 
@@ -24,7 +24,7 @@ We implement pixel sampler [here](https://github.com/open-mmlab/mmsegmentation/t
 Here is an example config of training PSPNet with OHEM enabled.
 
 ```python
-_base_ = './pspnet_r50-d8_512x1024_40k_cityscapes.py'
+_base_ = './pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py'
 model=dict(
     decode_head=dict(
         sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=100000)) )
@@ -38,7 +38,7 @@ For dataset that is not balanced in classes distribution, you may change the los
 Here is an example for cityscapes dataset.
 
 ```python
-_base_ = './pspnet_r50-d8_512x1024_40k_cityscapes.py'
+_base_ = './pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py'
 model=dict(
     decode_head=dict(
         loss_decode=dict(
@@ -76,7 +76,7 @@ In default setting, `avg_non_ignore=False` which means each pixel counts for los
 For loss calculation, we support ignoring a certain label index through `avg_non_ignore` and `ignore_index`. With these set, the average loss is calculated only over non-ignored labels, which may achieve better performance; see the [reference](https://github.com/open-mmlab/mmsegmentation/pull/1409). Here is an example config for training `unet` on the `Cityscapes` dataset: the loss calculation ignores label 0 (background), and the loss average is computed only over non-ignored labels:
 
 ```python
-_base_ = './fcn_unet_s5-d16_4x4_512x1024_160k_cityscapes.py'
+_base_ = './unet-s5-d16_fcn_4xb4-160k_cityscapes-512x1024.py'
 model = dict(
     decode_head=dict(
         ignore_index=0,