From ac213d3bf1481474e4fcf5ec091f97ddef396bb4 Mon Sep 17 00:00:00 2001 From: MengzhangLI Date: Tue, 18 Oct 2022 16:32:04 +0800 Subject: [PATCH 01/11] [Doc] Update FAQ doc about binary segmentation and ReduceZeroLabel --- docs/en/faq.md | 80 +++++++++++++++++++++++++++++++++++++++++++++++ docs/zh_cn/faq.md | 80 +++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 160 insertions(+) diff --git a/docs/en/faq.md b/docs/en/faq.md index 5151f1fd36..09c69b915f 100644 --- a/docs/en/faq.md +++ b/docs/en/faq.md @@ -66,3 +66,83 @@ In the test script, we provide `show-dir` argument to control whether output the ```shell python tools/test.py {config} {checkpoint} --show-dir {/path/to/save/image} --opacity 1 ``` + +## Why is the IoU always 0, NaN or very low in binary segmentation task + +Sometimes when training customized dataset, the IoU of certain class is 0, NaN or very low like below: + +``` ++--------------+-------+-------+ +| Class | IoU | Acc | ++--------------+-------+-------+ +| label_a | 80.19 | 100.0 | +| label_b | nan | nan | ++--------------+-------+-------+ +2022-10-18 10:56:56,032 - mmseg - INFO - Summary: +2022-10-18 10:56:56,032 - mmseg - INFO - ++-------+-------+-------+ +| aAcc | mIoU | mAcc | ++-------+-------+-------+ +| 100.0 | 80.19 | 100.0 | ++-------+-------+-------+ +``` + +or + +``` ++------------+------+-------+ +| Class | IoU | Acc | ++------------+------+-------+ +| label_a | 0.0 | 0.0 | +| label_b | 1.77 | 100.0 | ++------------+------+-------+ +2022-10-18 00:57:12,082 - mmseg - INFO - Summary: +2022-10-18 00:57:12,083 - mmseg - INFO - ++------+------+------+ +| aAcc | mIoU | mAcc | ++------+------+------+ +| 1.77 | 0.88 | 50.0 | ++------+------+------+ +``` + +- Solution One: You can follow our config file of dataset [`DRIVE`](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#drive) for reference, whose [dataset class](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/datasets/drive.py) is 
like below: + +```python +class DRIVEDataset(CustomDataset): + CLASSES = ('background', 'vessel') + + PALETTE = [[120, 120, 120], [6, 230, 230]] + + def __init__(self, **kwargs): + super(DRIVEDataset, self).__init__( + img_suffix='.png', + seg_map_suffix='_manual1.png', + reduce_zero_label=False, + **kwargs) + assert self.file_client.exists(self.img_dir) +``` + +And in corresponding config files of [dataset](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/datasets/drive.py) and [model](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/fcn_unet_s5-d16.py#L23-L48): + +```python +xxx_head=dict( + num_classes=2, + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)) +``` + +- Solution Two: In [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016), we fix the binary segmentation task when `num_classes=1`. You can follow this [#2201](https://github.com/open-mmlab/mmsegmentation/issues/2201) by setting `num_classes=1` and `use_sigmoid=True` in `CrossEntropyLoss`. + +## What does `reduce_zero_label` work for? + +When [loading annotation](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/datasets/pipelines/loading.py#L91) in MMSegmentation, `reduce_zero_label (bool)` is provided to determine whether reduce all label value by 1: + +```python +if self.reduce_zero_label: + # avoid using underflow conversion + gt_semantic_seg[gt_semantic_seg == 0] = 255 + gt_semantic_seg = gt_semantic_seg - 1 + gt_semantic_seg[gt_semantic_seg == 254] = 255 +``` + +`reduce_zero_label` is usually used for datasets where 0 is background label, if `reduce_zero_label=True`, the pixels whose corresponding label is 0 would not be involved in loss calculation. 
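Reviewer note on the `reduce_zero_label` snippet above: a minimal sketch of what the three steps do to individual label values. It is a plain-Python restatement, not the MMSegmentation loader, and it assumes labels are 8-bit values with 255 as the ignore index. The function name `reduce_zero_label_values` is hypothetical.

```python
IGNORE_INDEX = 255  # assumption: 255 is the label value ignored by the loss


def reduce_zero_label_values(labels):
    """Apply the three steps from the FAQ snippet to a flat list of labels."""
    reduced = []
    for value in labels:
        if value == 0:
            # Step 1: move label 0 to the ignore index before subtracting,
            # so that 0 - 1 never has to wrap around in an unsigned type.
            value = IGNORE_INDEX
        # Step 2: shift every label down by one.
        value = value - 1
        if value == IGNORE_INDEX - 1:
            # Step 3: anything that was 255 is now 254; restore it to 255.
            value = IGNORE_INDEX
        reduced.append(value)
    return reduced


print(reduce_zero_label_values([0, 1, 2, 255]))
```

The former label 0 ends up as 255 (ignored), labels 1 and 2 become 0 and 1, and an existing 255 stays 255. With only two labels in a binary dataset, this would leave a single trainable class, which is why the option is normally left off there.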
diff --git a/docs/zh_cn/faq.md b/docs/zh_cn/faq.md index 9a55b8e3c8..e71fdb2f38 100644 --- a/docs/zh_cn/faq.md +++ b/docs/zh_cn/faq.md @@ -66,3 +66,83 @@ ```shell python tools/test.py {config} {checkpoint} --show-dir {/path/to/save/image} --opacity 1 ``` + +## 为什么在二值分割任务里 IoU 总是 0, NaN 或者非常低? + +有时候在训练自定义的数据集时, 如下所示, 某个类别的 IoU 总是 0, NaN 或者很低: + +``` ++--------------+-------+-------+ +| Class | IoU | Acc | ++--------------+-------+-------+ +| label_a | 80.19 | 100.0 | +| label_b | nan | nan | ++--------------+-------+-------+ +2022-10-18 10:56:56,032 - mmseg - INFO - Summary: +2022-10-18 10:56:56,032 - mmseg - INFO - ++-------+-------+-------+ +| aAcc | mIoU | mAcc | ++-------+-------+-------+ +| 100.0 | 80.19 | 100.0 | ++-------+-------+-------+ +``` + +或者 + +``` ++------------+------+-------+ +| Class | IoU | Acc | ++------------+------+-------+ +| label_a | 0.0 | 0.0 | +| label_b | 1.77 | 100.0 | ++------------+------+-------+ +2022-10-18 00:57:12,082 - mmseg - INFO - Summary: +2022-10-18 00:57:12,083 - mmseg - INFO - ++------+------+------+ +| aAcc | mIoU | mAcc | ++------+------+------+ +| 1.77 | 0.88 | 50.0 | ++------+------+------+ +``` + +- 解决方案 (一): 您可以参考数据集 [`DRIVE`](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#drive) 的配置文件, 它的 [数据集类](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/datasets/drive.py) 如下所示: + +```python +class DRIVEDataset(CustomDataset): + CLASSES = ('background', 'vessel') + + PALETTE = [[120, 120, 120], [6, 230, 230]] + + def __init__(self, **kwargs): + super(DRIVEDataset, self).__init__( + img_suffix='.png', + seg_map_suffix='_manual1.png', + reduce_zero_label=False, + **kwargs) + assert self.file_client.exists(self.img_dir) +``` + +并且在 [数据集](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/datasets/drive.py) 和 [模型](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/fcn_unet_s5-d16.py#L23-L48) 对应的配置文件里设置: + +```python +xxx_head=dict( 
+ num_classes=2, + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)) +``` + +- 解决方案 (二): 在 [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016), 我们修复了当 `num_classes=1` 时的二值分割的问题. 您可以参考这个 issue [#2201](https://github.com/open-mmlab/mmsegmentation/issues/2201), 设置 `num_classes=1` 和 `CrossEntropyLoss` 里的 `use_sigmoid=True`. + +## `reduce_zero_label` 的作用 + +在 MMSegmentation 里面, 当 [加载注释](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/datasets/pipelines/loading.py#L91) 时, `reduce_zero_label (bool)` 被用来决定是否将所有 label 减去 1: + +```python +if self.reduce_zero_label: + # avoid using underflow conversion + gt_semantic_seg[gt_semantic_seg == 0] = 255 + gt_semantic_seg = gt_semantic_seg - 1 + gt_semantic_seg[gt_semantic_seg == 254] = 255 +``` + +`reduce_zero_label` 常常被用来处理 label 0 是背景的数据集, 如果 `reduce_zero_label=True`, label 0 对应的像素将不会参与损失函数的计算. From 0255eb969cb4bb5f9d062c4151478c8084d5d884 Mon Sep 17 00:00:00 2001 From: MengzhangLI Date: Tue, 18 Oct 2022 17:40:36 +0800 Subject: [PATCH 02/11] update --- docs/en/faq.md | 33 ++++++++++++++++++++++++ docs/zh_cn/faq.md | 33 ++++++++++++++++++++++++ mmseg/models/decode_heads/decode_head.py | 2 +- 3 files changed, 67 insertions(+), 1 deletion(-) diff --git a/docs/en/faq.md b/docs/en/faq.md index 09c69b915f..b526c80d26 100644 --- a/docs/en/faq.md +++ b/docs/en/faq.md @@ -133,6 +133,39 @@ xxx_head=dict( - Solution Two: In [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016), we fix the binary segmentation task when `num_classes=1`. You can follow this [#2201](https://github.com/open-mmlab/mmsegmentation/issues/2201) by setting `num_classes=1` and `use_sigmoid=True` in `CrossEntropyLoss`. 
+In summary, we encourage beginners to use `num_classes=2`, `out_channels=1` and `use_sigmoid=True` in `CrossEntropyLoss`, below is a modification example of [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py): + +```python +decode_head=dict( + type='PSPHead', + in_channels=64, + in_index=4, + num_classes=2, + out_channels=1, + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)), +auxiliary_head=dict( + type='FCNHead', + in_channels=128, + in_index=3, + num_classes=2, + out_channels=1, + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=True, loss_weight=0.4)), +``` + +In [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016) we also provide a parameter `threshold` for binary segmentation in the case of `out_channels=1`. It would be used to calculate segmentation prediction in [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py): + +```python +if self.out_channels == 1: + seg_pred = (seg_logit > + self.decode_head.threshold).to(seg_logit).squeeze(1) +else: + seg_pred = seg_logit.argmax(dim=1) +``` + +By setting different value of `threshold`, users can calculate ROC(Receiver Operating Characteristic) Curve and AUC(Area Under Curve). + ## What does `reduce_zero_label` work for? When [loading annotation](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/datasets/pipelines/loading.py#L91) in MMSegmentation, `reduce_zero_label (bool)` is provided to determine whether reduce all label value by 1: diff --git a/docs/zh_cn/faq.md b/docs/zh_cn/faq.md index e71fdb2f38..20a1cd5218 100644 --- a/docs/zh_cn/faq.md +++ b/docs/zh_cn/faq.md @@ -133,6 +133,39 @@ xxx_head=dict( - 解决方案 (二): 在 [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016), 我们修复了当 `num_classes=1` 时的二值分割的问题. 
您可以参考这个 issue [#2201](https://github.com/open-mmlab/mmsegmentation/issues/2201), 设置 `num_classes=1` 和 `CrossEntropyLoss` 里的 `use_sigmoid=True`. +综上所述, 我们鼓励初学者设置 `num_classes=2`, `out_channels=1` 和 `CrossEntropyLoss` 里的 `use_sigmoid=True`, 下面是 [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py) 的一个修改样例: + +```python +decode_head=dict( + type='PSPHead', + in_channels=64, + in_index=4, + num_classes=2, + out_channels=1, + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)), +auxiliary_head=dict( + type='FCNHead', + in_channels=128, + in_index=3, + num_classes=2, + out_channels=1, + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=True, loss_weight=0.4)), +``` + +对于二值分割任务, 在 [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016) 里面我们还提供了一个 `threshold` 参数来处理 `out_channels=1` 的情况. 它会在 [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py) 里面用来得到预测的结果: + +```python +if self.out_channels == 1: + seg_pred = (seg_logit > + self.decode_head.threshold).to(seg_logit).squeeze(1) +else: + seg_pred = seg_logit.argmax(dim=1) +``` + +通过设置 `threshold` 不同的值, 用户可以由此计算出 ROC 曲线和 AUC 的值. + ## `reduce_zero_label` 的作用 在 MMSegmentation 里面, 当 [加载注释](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/datasets/pipelines/loading.py#L91) 时, `reduce_zero_label (bool)` 被用来决定是否将所有 label 减去 1: diff --git a/mmseg/models/decode_heads/decode_head.py b/mmseg/models/decode_heads/decode_head.py index f6b05dd3eb..c893f76e75 100644 --- a/mmseg/models/decode_heads/decode_head.py +++ b/mmseg/models/decode_heads/decode_head.py @@ -21,7 +21,7 @@ class BaseDecodeHead(BaseModule, metaclass=ABCMeta): num_classes (int): Number of classes. out_channels (int): Output channels of conv_seg. threshold (float): Threshold for binary segmentation in the case of - `num_classes==1`. Default: None. + `out_channels==1`. 
Default: None. dropout_ratio (float): Ratio of dropout layer. Default: 0.1. conv_cfg (dict|None): Config of conv layers. Default: None. norm_cfg (dict|None): Config of norm layers. Default: None. From 207f3c400decc754bcf577c4c08d7b94a5eca278 Mon Sep 17 00:00:00 2001 From: MengzhangLI Date: Tue, 18 Oct 2022 23:55:39 +0800 Subject: [PATCH 03/11] modify --- docs/en/faq.md | 117 +++++++++++++++++++-------------------------- docs/zh_cn/faq.md | 118 +++++++++++++++++++--------------------------- 2 files changed, 98 insertions(+), 137 deletions(-) diff --git a/docs/en/faq.md b/docs/en/faq.md index b526c80d26..ecabb3316d 100644 --- a/docs/en/faq.md +++ b/docs/en/faq.md @@ -67,73 +67,65 @@ In the test script, we provide `show-dir` argument to control whether output the python tools/test.py {config} {checkpoint} --show-dir {/path/to/save/image} --opacity 1 ``` -## Why is the IoU always 0, NaN or very low in binary segmentation task +## How to handle binary segmentation task -Sometimes when training customized dataset, the IoU of certain class is 0, NaN or very low like below: +MMSegmentation uses `num_classes` and `out_channels` to control output of last layer `self.conv_seg` (More details could be found [here](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/decode_heads/decode_head.py).): -``` -+--------------+-------+-------+ -| Class | IoU | Acc | -+--------------+-------+-------+ -| label_a | 80.19 | 100.0 | -| label_b | nan | nan | -+--------------+-------+-------+ -2022-10-18 10:56:56,032 - mmseg - INFO - Summary: -2022-10-18 10:56:56,032 - mmseg - INFO - -+-------+-------+-------+ -| aAcc | mIoU | mAcc | -+-------+-------+-------+ -| 100.0 | 80.19 | 100.0 | -+-------+-------+-------+ -``` - -or - -``` -+------------+------+-------+ -| Class | IoU | Acc | -+------------+------+-------+ -| label_a | 0.0 | 0.0 | -| label_b | 1.77 | 100.0 | -+------------+------+-------+ -2022-10-18 00:57:12,082 - mmseg - INFO - Summary: -2022-10-18 00:57:12,083 - 
mmseg - INFO - -+------+------+------+ -| aAcc | mIoU | mAcc | -+------+------+------+ -| 1.77 | 0.88 | 50.0 | -+------+------+------+ +```python +def __init__(self, + ..., + ): + ... + if out_channels is None: + if num_classes == 2: + warnings.warn('For binary segmentation, we suggest using' + '`out_channels = 1` to define the output' + 'channels of segmentor, and use `threshold`' + 'to convert seg_logist into a prediction' + 'applying a threshold') + out_channels = num_classes + + if out_channels != num_classes and out_channels != 1: + raise ValueError( + 'out_channels should be equal to num_classes,' + 'except binary segmentation set out_channels == 1 and' + f'num_classes == 2, but got out_channels={out_channels}' + f'and num_classes={num_classes}') + + if out_channels == 1 and threshold is None: + threshold = 0.3 + warnings.warn('threshold is not defined for binary, and defaults' + 'to 0.3') + self.num_classes = num_classes + self.out_channels = out_channels + self.threshold = threshold + ... + self.conv_seg = nn.Conv2d(channels, self.out_channels, kernel_size=1) ``` -- Solution One: You can follow our config file of dataset [`DRIVE`](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#drive) for reference, whose [dataset class](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/datasets/drive.py) is like below: +There are two types of calculating binary segmentation methods. First, when `out_channels=2`, using `F.softmax()` and `argmax()` to get prediction and then calculating by Cross Entropy Loss. Second, in the case of `out_channels=1`, in [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016) we provide a parameter `threshold(default to 0.3)` for binary segmentation. 
Using `F.sigmoid()` and `threshold` to get prediction and then calculating by Binacry Cross Entropy Loss: ```python -class DRIVEDataset(CustomDataset): - CLASSES = ('background', 'vessel') - - PALETTE = [[120, 120, 120], [6, 230, 230]] - - def __init__(self, **kwargs): - super(DRIVEDataset, self).__init__( - img_suffix='.png', - seg_map_suffix='_manual1.png', - reduce_zero_label=False, - **kwargs) - assert self.file_client.exists(self.img_dir) -``` +... +if self.out_channels == 1: + seg_logit = F.sigmoid(seg_logit) +else: + seg_logit = F.softmax(seg_logit, dim=1) -And in corresponding config files of [dataset](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/datasets/drive.py) and [model](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/fcn_unet_s5-d16.py#L23-L48): +... -```python -xxx_head=dict( - num_classes=2, - loss_decode=dict( - type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)) +if self.out_channels == 1: + seg_pred = (seg_logit > + self.decode_head.threshold).to(seg_logit).squeeze(1) +else: + seg_pred = seg_logit.argmax(dim=1) ``` -- Solution Two: In [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016), we fix the binary segmentation task when `num_classes=1`. You can follow this [#2201](https://github.com/open-mmlab/mmsegmentation/issues/2201) by setting `num_classes=1` and `use_sigmoid=True` in `CrossEntropyLoss`. +More details about calculating segmentation prediction could be found in [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py): + +In summary, we encourage beginners to take solution (1) `num_classes=2`, `out_channels=2` and `use_sigmoid=False` in `CrossEntropyLoss` or (2) `num_classes=2`, `out_channels=1` and `use_sigmoid=True` in `CrossEntropyLoss`. 
-In summary, we encourage beginners to use `num_classes=2`, `out_channels=1` and `use_sigmoid=True` in `CrossEntropyLoss`, below is a modification example of [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py): +When taking solution (2), below is a modification example of [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py): ```python decode_head=dict( @@ -154,18 +146,6 @@ auxiliary_head=dict( type='CrossEntropyLoss', use_sigmoid=True, loss_weight=0.4)), ``` -In [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016) we also provide a parameter `threshold` for binary segmentation in the case of `out_channels=1`. It would be used to calculate segmentation prediction in [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py): - -```python -if self.out_channels == 1: - seg_pred = (seg_logit > - self.decode_head.threshold).to(seg_logit).squeeze(1) -else: - seg_pred = seg_logit.argmax(dim=1) -``` - -By setting different value of `threshold`, users can calculate ROC(Receiver Operating Characteristic) Curve and AUC(Area Under Curve). - ## What does `reduce_zero_label` work for? When [loading annotation](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/datasets/pipelines/loading.py#L91) in MMSegmentation, `reduce_zero_label (bool)` is provided to determine whether reduce all label value by 1: @@ -179,3 +159,4 @@ if self.reduce_zero_label: ``` `reduce_zero_label` is usually used for datasets where 0 is background label, if `reduce_zero_label=True`, the pixels whose corresponding label is 0 would not be involved in loss calculation. +Noted that in binary segmentation task it is unnecessary to use `reduce_zero_label=True`, take solutions we mentioned above please. 
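Reviewer note on the two inference paths this patch documents for binary segmentation (`F.sigmoid()` plus `threshold` when `out_channels=1`, `F.softmax()` plus `argmax()` otherwise): a per-pixel sketch in plain Python. It is illustrative only and does not use PyTorch or MMSegmentation. The helper names are hypothetical, and the 0.3 default is taken from the `decode_head.py` excerpt quoted in the diff.

```python
import math


def sigmoid(x):
    """Logistic function for a single logit."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_one_channel(logit, threshold=0.3):
    """out_channels == 1: sigmoid, then compare against the threshold."""
    return 1 if sigmoid(logit) > threshold else 0


def predict_two_channels(logits):
    """out_channels == 2: softmax, then argmax over the class dimension."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


# One pixel with a single-channel logit of -0.5 (probability about 0.38).
print(predict_one_channel(-0.5))                 # foreground at threshold 0.3
print(predict_one_channel(-0.5, threshold=0.5))  # background at threshold 0.5
# One pixel with two-channel logits [background, foreground].
print(predict_two_channels([0.2, -0.1]))
```

With a single output channel the chosen `threshold` changes the prediction for borderline pixels, whereas the two-channel path has no such parameter and simply picks the larger logit.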
diff --git a/docs/zh_cn/faq.md b/docs/zh_cn/faq.md index 20a1cd5218..a26f9e27d8 100644 --- a/docs/zh_cn/faq.md +++ b/docs/zh_cn/faq.md @@ -67,73 +67,65 @@ python tools/test.py {config} {checkpoint} --show-dir {/path/to/save/image} --opacity 1 ``` -## 为什么在二值分割任务里 IoU 总是 0, NaN 或者非常低? +## 如何处理二值分割任务? -有时候在训练自定义的数据集时, 如下所示, 某个类别的 IoU 总是 0, NaN 或者很低: +MMSegmentation 使用 `num_classes` 和 `out_channels` 来控制模型最后一层 `self.conv_seg` 的输出. (更多细节可以参考 [这里](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/decode_heads/decode_head.py).): -``` -+--------------+-------+-------+ -| Class | IoU | Acc | -+--------------+-------+-------+ -| label_a | 80.19 | 100.0 | -| label_b | nan | nan | -+--------------+-------+-------+ -2022-10-18 10:56:56,032 - mmseg - INFO - Summary: -2022-10-18 10:56:56,032 - mmseg - INFO - -+-------+-------+-------+ -| aAcc | mIoU | mAcc | -+-------+-------+-------+ -| 100.0 | 80.19 | 100.0 | -+-------+-------+-------+ -``` - -或者 - -``` -+------------+------+-------+ -| Class | IoU | Acc | -+------------+------+-------+ -| label_a | 0.0 | 0.0 | -| label_b | 1.77 | 100.0 | -+------------+------+-------+ -2022-10-18 00:57:12,082 - mmseg - INFO - Summary: -2022-10-18 00:57:12,083 - mmseg - INFO - -+------+------+------+ -| aAcc | mIoU | mAcc | -+------+------+------+ -| 1.77 | 0.88 | 50.0 | -+------+------+------+ +```python +def __init__(self, + ..., + ): + ... 
+ if out_channels is None: + if num_classes == 2: + warnings.warn('For binary segmentation, we suggest using' + '`out_channels = 1` to define the output' + 'channels of segmentor, and use `threshold`' + 'to convert seg_logist into a prediction' + 'applying a threshold') + out_channels = num_classes + + if out_channels != num_classes and out_channels != 1: + raise ValueError( + 'out_channels should be equal to num_classes,' + 'except binary segmentation set out_channels == 1 and' + f'num_classes == 2, but got out_channels={out_channels}' + f'and num_classes={num_classes}') + + if out_channels == 1 and threshold is None: + threshold = 0.3 + warnings.warn('threshold is not defined for binary, and defaults' + 'to 0.3') + self.num_classes = num_classes + self.out_channels = out_channels + self.threshold = threshold + ... + self.conv_seg = nn.Conv2d(channels, self.out_channels, kernel_size=1) ``` -- 解决方案 (一): 您可以参考数据集 [`DRIVE`](https://github.com/open-mmlab/mmsegmentation/blob/master/docs/en/dataset_prepare.md#drive) 的配置文件, 它的 [数据集类](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/datasets/drive.py) 如下所示: +有两种计算二值分割任务的方法. 第一种是当 `out_channels=2` 时, 使用 `F.softmax()` 然后通过 `argmax()` 得到预测结果, 再以 Cross Entropy Loss 作为损失函数. 第二种是当 `out_channels=1` 时, 使用在 [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016) 提供的参数 `threshold (默认为 0.3)`. 通过 `F.sigmoid()` 和 `threshold` 得到预测结果, 再~~~~以 Binary Cross Entropy Loss 作为损失函数. ```python -class DRIVEDataset(CustomDataset): - CLASSES = ('background', 'vessel') - - PALETTE = [[120, 120, 120], [6, 230, 230]] - - def __init__(self, **kwargs): - super(DRIVEDataset, self).__init__( - img_suffix='.png', - seg_map_suffix='_manual1.png', - reduce_zero_label=False, - **kwargs) - assert self.file_client.exists(self.img_dir) -``` +... 
+if self.out_channels == 1: + seg_logit = F.sigmoid(seg_logit) +else: + seg_logit = F.softmax(seg_logit, dim=1) -并且在 [数据集](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/datasets/drive.py) 和 [模型](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/fcn_unet_s5-d16.py#L23-L48) 对应的配置文件里设置: +... -```python -xxx_head=dict( - num_classes=2, - loss_decode=dict( - type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)) +if self.out_channels == 1: + seg_pred = (seg_logit > + self.decode_head.threshold).to(seg_logit).squeeze(1) +else: + seg_pred = seg_logit.argmax(dim=1) ``` -- 解决方案 (二): 在 [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016), 我们修复了当 `num_classes=1` 时的二值分割的问题. 您可以参考这个 issue [#2201](https://github.com/open-mmlab/mmsegmentation/issues/2201), 设置 `num_classes=1` 和 `CrossEntropyLoss` 里的 `use_sigmoid=True`. +更多关于计算语义分割预测的细节可以参考 [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py): + +综上所述, 我们建议使用者采取两种解决方案 (1) `num_classes=2`, `out_channels=2` 并在 `CrossEntropyLoss` 里面设置 `use_sigmoid=False` 或者 (2) `num_classes=2`, `out_channels=1` 并在 `CrossEntropyLoss` 里面设置 `use_sigmoid=True`. -综上所述, 我们鼓励初学者设置 `num_classes=2`, `out_channels=1` 和 `CrossEntropyLoss` 里的 `use_sigmoid=True`, 下面是 [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py) 的一个修改样例: +当采用解决方案 (2) 时, 下面是对样例 [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py) 的对应的修改: ```python decode_head=dict( @@ -154,18 +146,6 @@ auxiliary_head=dict( type='CrossEntropyLoss', use_sigmoid=True, loss_weight=0.4)), ``` -对于二值分割任务, 在 [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016) 里面我们还提供了一个 `threshold` 参数来处理 `out_channels=1` 的情况. 
它会在 [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py) 里面用来得到预测的结果: - -```python -if self.out_channels == 1: - seg_pred = (seg_logit > - self.decode_head.threshold).to(seg_logit).squeeze(1) -else: - seg_pred = seg_logit.argmax(dim=1) -``` - -通过设置 `threshold` 不同的值, 用户可以由此计算出 ROC 曲线和 AUC 的值. - ## `reduce_zero_label` 的作用 在 MMSegmentation 里面, 当 [加载注释](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/datasets/pipelines/loading.py#L91) 时, `reduce_zero_label (bool)` 被用来决定是否将所有 label 减去 1: @@ -178,4 +158,4 @@ if self.reduce_zero_label: gt_semantic_seg[gt_semantic_seg == 254] = 255 ``` -`reduce_zero_label` 常常被用来处理 label 0 是背景的数据集, 如果 `reduce_zero_label=True`, label 0 对应的像素将不会参与损失函数的计算. +`reduce_zero_label` 常常被用来处理 label 0 是背景的数据集, 如果 `reduce_zero_label=True`, label 0 对应的像素将不会参与损失函数的计算. 需要说明的是在二值分割任务中没有必要设置 `reduce_zero_label=True`, 请采用上面我们提到的解决方案. From 8a72af73aad43999bae4a27d4776f2d18729148c Mon Sep 17 00:00:00 2001 From: MengzhangLI Date: Thu, 20 Oct 2022 13:22:51 +0800 Subject: [PATCH 04/11] fix typo and add modification --- docs/en/faq.md | 14 +++++++++++--- docs/zh_cn/faq.md | 14 +++++++++++--- 2 files changed, 22 insertions(+), 6 deletions(-) diff --git a/docs/en/faq.md b/docs/en/faq.md index ecabb3316d..146de53ddb 100644 --- a/docs/en/faq.md +++ b/docs/en/faq.md @@ -103,7 +103,7 @@ def __init__(self, self.conv_seg = nn.Conv2d(channels, self.out_channels, kernel_size=1) ``` -There are two types of calculating binary segmentation methods. First, when `out_channels=2`, using `F.softmax()` and `argmax()` to get prediction and then calculating by Cross Entropy Loss. Second, in the case of `out_channels=1`, in [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016) we provide a parameter `threshold(default to 0.3)` for binary segmentation. 
Using `F.sigmoid()` and `threshold` to get prediction and then calculating by Binacry Cross Entropy Loss: +There are two types of calculating binary segmentation methods: ```python ... @@ -121,9 +121,17 @@ else: seg_pred = seg_logit.argmax(dim=1) ``` +- When `out_channels=2`, using Cross Entropy Loss in training, using `F.softmax()` and `argmax()` to get prediction of each pixel in inference. + +- When `out_channels=1`, we provide a parameter `threshold(default to 0.3)` in [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016), using Binary Cross Entropy Loss in training, using `F.sigmoid()` and `threshold` to get prediction of each pixel in inference. + More details about calculating segmentation prediction could be found in [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py): -In summary, we encourage beginners to take solution (1) `num_classes=2`, `out_channels=2` and `use_sigmoid=False` in `CrossEntropyLoss` or (2) `num_classes=2`, `out_channels=1` and `use_sigmoid=True` in `CrossEntropyLoss`. +In summary, to implement binary segmentation methods users should modify below parameters in config files: + +- (1) `num_classes=2`, `out_channels=2` and `use_sigmoid=False` in `CrossEntropyLoss`. + +- (2) `num_classes=2`, `out_channels=1` and `use_sigmoid=True` in `CrossEntropyLoss`. When taking solution (2), below is a modification example of [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py): @@ -159,4 +167,4 @@ if self.reduce_zero_label: ``` `reduce_zero_label` is usually used for datasets where 0 is background label, if `reduce_zero_label=True`, the pixels whose corresponding label is 0 would not be involved in loss calculation. -Noted that in binary segmentation task it is unnecessary to use `reduce_zero_label=True`, take solutions we mentioned above please. 
+Noted that in binary segmentation task it is unnecessary to use `reduce_zero_label=True`, please take solutions we mentioned above. diff --git a/docs/zh_cn/faq.md b/docs/zh_cn/faq.md index a26f9e27d8..3363ed9dff 100644 --- a/docs/zh_cn/faq.md +++ b/docs/zh_cn/faq.md @@ -103,7 +103,11 @@ def __init__(self, self.conv_seg = nn.Conv2d(channels, self.out_channels, kernel_size=1) ``` -有两种计算二值分割任务的方法. 第一种是当 `out_channels=2` 时, 使用 `F.softmax()` 然后通过 `argmax()` 得到预测结果, 再以 Cross Entropy Loss 作为损失函数. 第二种是当 `out_channels=1` 时, 使用在 [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016) 提供的参数 `threshold (默认为 0.3)`. 通过 `F.sigmoid()` 和 `threshold` 得到预测结果, 再~~~~以 Binary Cross Entropy Loss 作为损失函数. +有两种计算二值分割任务的方法: + +- 当 `out_channels=2` 时, 在训练时以 Cross Entropy Loss 作为损失函数, 在推理时使用 `F.softmax()` 归一化 logits 值, 然后通过 `argmax()` 得到每个像素的预测结果. + +- 当 `out_channels=1` 时, 我们在 [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016) 里提供了阈值参数 `threshold (默认为 0.3)`, 在训练时以 Binary Cross Entropy Loss 作为损失函数, 在推理时使用 `F.sigmoid()` 和在 `threshold` 得到预测结果. ```python ... @@ -123,9 +127,13 @@ else: 更多关于计算语义分割预测的细节可以参考 [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py): -综上所述, 我们建议使用者采取两种解决方案 (1) `num_classes=2`, `out_channels=2` 并在 `CrossEntropyLoss` 里面设置 `use_sigmoid=False` 或者 (2) `num_classes=2`, `out_channels=1` 并在 `CrossEntropyLoss` 里面设置 `use_sigmoid=True`. +对于实现上述两种计算二值分割的方法, 需要分别在配置文件里修改: + +- (1) `num_classes=2`, `out_channels=2` 并在 `CrossEntropyLoss` 里面设置 `use_sigmoid=False` + +- (2) `num_classes=2`, `out_channels=1` 并在 `CrossEntropyLoss` 里面设置 `use_sigmoid=True`. 
-当采用解决方案 (2) 时, 下面是对样例 [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py) 的对应的修改: +如果采用解决方案 (2), 下面是对样例 [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py) 做出的对应修改: ```python decode_head=dict( From f63d67561b5be3cda3c4f948e448b77506dfb643 Mon Sep 17 00:00:00 2001 From: MengzhangLI Date: Fri, 21 Oct 2022 18:06:29 +0800 Subject: [PATCH 05/11] fix typo --- docs/en/faq.md | 2 +- docs/zh_cn/faq.md | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/en/faq.md b/docs/en/faq.md index 146de53ddb..8ff2ecef04 100644 --- a/docs/en/faq.md +++ b/docs/en/faq.md @@ -127,7 +127,7 @@ else: More details about calculating segmentation prediction could be found in [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py): -In summary, to implement binary segmentation methods users should modify below parameters in config files: +In summary, to implement binary segmentation methods users should modify below parameters in the `decode_head` and `auxiliary_head` configs: - (1) `num_classes=2`, `out_channels=2` and `use_sigmoid=False` in `CrossEntropyLoss`. diff --git a/docs/zh_cn/faq.md b/docs/zh_cn/faq.md index 3363ed9dff..24515a659a 100644 --- a/docs/zh_cn/faq.md +++ b/docs/zh_cn/faq.md @@ -107,7 +107,7 @@ def __init__(self, - 当 `out_channels=2` 时, 在训练时以 Cross Entropy Loss 作为损失函数, 在推理时使用 `F.softmax()` 归一化 logits 值, 然后通过 `argmax()` 得到每个像素的预测结果. -- 当 `out_channels=1` 时, 我们在 [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016) 里提供了阈值参数 `threshold (默认为 0.3)`, 在训练时以 Binary Cross Entropy Loss 作为损失函数, 在推理时使用 `F.sigmoid()` 和在 `threshold` 得到预测结果. +- 当 `out_channels=1` 时, 我们在 [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016) 里提供了阈值参数 `threshold (默认为 0.3)`, 在训练时以 Binary Cross Entropy Loss 作为损失函数, 在推理时使用 `F.sigmoid()` 和 `threshold` 得到预测结果. 
```python ... @@ -127,7 +127,7 @@ else: 更多关于计算语义分割预测的细节可以参考 [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py): -对于实现上述两种计算二值分割的方法, 需要分别在配置文件里修改: +对于实现上述两种计算二值分割的方法, 需要在 `decode_head` 和 `auxiliary_head` 的配置里修改: - (1) `num_classes=2`, `out_channels=2` 并在 `CrossEntropyLoss` 里面设置 `use_sigmoid=False` From 49261e207e670acdfe2c4487b4348f0d0af91d2e Mon Sep 17 00:00:00 2001 From: MengzhangLI Date: Mon, 24 Oct 2022 12:45:08 +0800 Subject: [PATCH 06/11] fix comments --- docs/en/faq.md | 83 ++++++++++++++------------------------------- docs/zh_cn/faq.md | 86 +++++++++++++++-------------------------------- 2 files changed, 54 insertions(+), 115 deletions(-) diff --git a/docs/en/faq.md b/docs/en/faq.md index 8ff2ecef04..142ba976e9 100644 --- a/docs/en/faq.md +++ b/docs/en/faq.md @@ -71,69 +71,39 @@ python tools/test.py {config} {checkpoint} --show-dir {/path/to/save/image} --op MMSegmentation uses `num_classes` and `out_channels` to control output of last layer `self.conv_seg` (More details could be found [here](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/decode_heads/decode_head.py).): -```python -def __init__(self, - ..., - ): - ... 
-    if out_channels is None:
-        if num_classes == 2:
-            warnings.warn('For binary segmentation, we suggest using'
-                          '`out_channels = 1` to define the output'
-                          'channels of segmentor, and use `threshold`'
-                          'to convert seg_logist into a prediction'
-                          'applying a threshold')
-        out_channels = num_classes
-
-    if out_channels != num_classes and out_channels != 1:
-        raise ValueError(
-            'out_channels should be equal to num_classes,'
-            'except binary segmentation set out_channels == 1 and'
-            f'num_classes == 2, but got out_channels={out_channels}'
-            f'and num_classes={num_classes}')
-
-    if out_channels == 1 and threshold is None:
-        threshold = 0.3
-        warnings.warn('threshold is not defined for binary, and defaults'
-                      'to 0.3')
-    self.num_classes = num_classes
-    self.out_channels = out_channels
-    self.threshold = threshold
-    ...
-    self.conv_seg = nn.Conv2d(channels, self.out_channels, kernel_size=1)
-```
-
-There are two types of calculating binary segmentation methods:
-
-```python
-...
-if self.out_channels == 1:
-    seg_logit = F.sigmoid(seg_logit)
-else:
-    seg_logit = F.softmax(seg_logit, dim=1)
-
-...
-
-if self.out_channels == 1:
-    seg_pred = (seg_logit >
-                self.decode_head.threshold).to(seg_logit).squeeze(1)
-else:
-    seg_pred = seg_logit.argmax(dim=1)
-```
+- Set `out_channels=2`, using Cross Entropy Loss in training, using `F.softmax()` and `argmax()` to get prediction of each pixel in inference.
 
-- When `out_channels=2`, using Cross Entropy Loss in training, using `F.softmax()` and `argmax()` to get prediction of each pixel in inference.
+- Set `out_channels=1`, using Binary Cross Entropy Loss in training, using `F.sigmoid()` and `threshold` to get prediction of each pixel in inference. `threshold` is set 0.3 as default.
 
-- When `out_channels=1`, we provide a parameter `threshold(default to 0.3)` in [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016), using Binary Cross Entropy Loss in training, using `F.sigmoid()` and `threshold` to get prediction of each pixel in inference.
+More details about implementation could be found in [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py):
 
-More details about calculating segmentation prediction could be found in [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py):
+In summary, to implement binary segmentation methods users should modify below parameters in the `decode_head` and `auxiliary_head` configs. Here is a modification example of [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py):
 
-In summary, to implement binary segmentation methods users should modify below parameters in the `decode_head` and `auxiliary_head` configs:
+`num_classes` should be the same as number of types of labels, in binary segmentation task, dataset only has two types of labels: foreground and background, so `num_classes=2`. `out_channels` controls the output channel of last layer of model, it usually equals to `num_classes`.
+But in binary segmentation task, there are two solutions:
 
 - (1) `num_classes=2`, `out_channels=2` and `use_sigmoid=False` in `CrossEntropyLoss`.
 
-- (2) `num_classes=2`, `out_channels=1` and `use_sigmoid=True` in `CrossEntropyLoss`.
+```python
+decode_head=dict(
+    type='PSPHead',
+    in_channels=64,
+    in_index=4,
+    num_classes=2,
+    out_channels=2,
+    loss_decode=dict(
+        type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
+auxiliary_head=dict(
+    type='FCNHead',
+    in_channels=128,
+    in_index=3,
+    num_classes=2,
+    out_channels=2,
+    loss_decode=dict(
+        type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
+```
 
-When taking solution (2), below is a modification example of [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py):
+- (2) `num_classes=2`, `out_channels=1` and `use_sigmoid=True` in `CrossEntropyLoss`.
 
 ```python
 decode_head=dict(
@@ -166,5 +136,4 @@ if self.reduce_zero_label:
     gt_semantic_seg[gt_semantic_seg == 254] = 255
 ```
 
-`reduce_zero_label` is usually used for datasets where 0 is background label, if `reduce_zero_label=True`, the pixels whose corresponding label is 0 would not be involved in loss calculation.
-Noted that in binary segmentation task it is unnecessary to use `reduce_zero_label=True`, please take solutions we mentioned above.
+Noted that in please check out label numbers of dataset when using `reduce_zero_label`. If dataset only has two types of labels (i.e., label 0 and 1), it needs to close `reduce_zero_label`, i.e., set `reduce_zero_label=True`.
diff --git a/docs/zh_cn/faq.md b/docs/zh_cn/faq.md
index 24515a659a..b3ec9945bf 100644
--- a/docs/zh_cn/faq.md
+++ b/docs/zh_cn/faq.md
@@ -69,71 +69,40 @@ python tools/test.py {config} {checkpoint} --show-dir {/path/to/save/image} --op
 
 ## 如何处理二值分割任务?
 
-MMSegmentation 使用 `num_classes` 和 `out_channels` 来控制模型最后一层 `self.conv_seg` 的输出. (更多细节可以参考 [这里](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/decode_heads/decode_head.py).):
+MMSegmentation 使用 `num_classes` 和 `out_channels` 来控制模型最后一层 `self.conv_seg` 的输出. 更多细节可以参考 [这里](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/decode_heads/decode_head.py).
 
-```python
-def __init__(self,
-             ...,
-             ):
-    ...
-    if out_channels is None:
-        if num_classes == 2:
-            warnings.warn('For binary segmentation, we suggest using'
-                          '`out_channels = 1` to define the output'
-                          'channels of segmentor, and use `threshold`'
-                          'to convert seg_logist into a prediction'
-                          'applying a threshold')
-        out_channels = num_classes
-
-    if out_channels != num_classes and out_channels != 1:
-        raise ValueError(
-            'out_channels should be equal to num_classes,'
-            'except binary segmentation set out_channels == 1 and'
-            f'num_classes == 2, but got out_channels={out_channels}'
-            f'and num_classes={num_classes}')
-
-    if out_channels == 1 and threshold is None:
-        threshold = 0.3
-        warnings.warn('threshold is not defined for binary, and defaults'
-                      'to 0.3')
-    self.num_classes = num_classes
-    self.out_channels = out_channels
-    self.threshold = threshold
-    ...
-    self.conv_seg = nn.Conv2d(channels, self.out_channels, kernel_size=1)
-```
-
-有两种计算二值分割任务的方法:
+`num_classes` 应该和数据集本身类别个数一致,当是二值分割时,数据集只有前景和背景两类,所以 `num_classes` 为 2。`out_channels` 控制模型最后一层的输出的通道数,通常和 `num_classes` 相等,但当二值分割时候,可以有两种处理方法,分别是:
 
-- 当 `out_channels=2` 时, 在训练时以 Cross Entropy Loss 作为损失函数, 在推理时使用 `F.softmax()` 归一化 logits 值, 然后通过 `argmax()` 得到每个像素的预测结果.
+- 设置 `out_channels=2`, 在训练时以 Cross Entropy Loss 作为损失函数, 在推理时使用 `F.softmax()` 归一化 logits 值, 然后通过 `argmax()` 得到每个像素的预测结果.
 
-- 当 `out_channels=1` 时, 我们在 [#2016](https://github.com/open-mmlab/mmsegmentation/pull/2016) 里提供了阈值参数 `threshold (默认为 0.3)`, 在训练时以 Binary Cross Entropy Loss 作为损失函数, 在推理时使用 `F.sigmoid()` 和 `threshold` 得到预测结果.
-
-```python
-...
-if self.out_channels == 1:
-    seg_logit = F.sigmoid(seg_logit)
-else:
-    seg_logit = F.softmax(seg_logit, dim=1)
-
-...
-
-if self.out_channels == 1:
-    seg_pred = (seg_logit >
-                self.decode_head.threshold).to(seg_logit).squeeze(1)
-else:
-    seg_pred = seg_logit.argmax(dim=1)
-```
+- 设置 `out_channels=1`, 在训练时以 Binary Cross Entropy Loss 作为损失函数, 在推理时使用 `F.sigmoid()` 和 `threshold` 得到预测结果, `threshold` 默认为 0.3.
 
-更多关于计算语义分割预测的细节可以参考 [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py):
+更多关于实现细节可以参考 [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py):
 
-对于实现上述两种计算二值分割的方法, 需要在 `decode_head` 和 `auxiliary_head` 的配置里修改:
+对于实现上述两种计算二值分割的方法, 需要在 `decode_head` 和 `auxiliary_head` 的配置里修改. 下面是对样例 [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py) 做出的对应修改.
 
 - (1) `num_classes=2`, `out_channels=2` 并在 `CrossEntropyLoss` 里面设置 `use_sigmoid=False`
 
-- (2) `num_classes=2`, `out_channels=1` 并在 `CrossEntropyLoss` 里面设置 `use_sigmoid=True`.
+```python
+decode_head=dict(
+    type='PSPHead',
+    in_channels=64,
+    in_index=4,
+    num_classes=2,
+    out_channels=2,
+    loss_decode=dict(
+        type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
+auxiliary_head=dict(
+    type='FCNHead',
+    in_channels=128,
+    in_index=3,
+    num_classes=2,
+    out_channels=2,
+    loss_decode=dict(
+        type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
+```
 
-如果采用解决方案 (2), 下面是对样例 [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py) 做出的对应修改:
+- (2) `num_classes=2`, `out_channels=1` 并在 `CrossEntropyLoss` 里面设置 `use_sigmoid=True`.
 
 ```python
 decode_head=dict(
@@ -156,7 +125,8 @@ auxiliary_head=dict(
 
 ## `reduce_zero_label` 的作用
 
-在 MMSegmentation 里面, 当 [加载注释](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/datasets/pipelines/loading.py#L91) 时, `reduce_zero_label (bool)` 被用来决定是否将所有 label 减去 1:
+数据集中 `reduce_zero_label`参数类型为布尔类型, 默认为 False, 它的功能是为了忽略数据集 label 0. 具体做法是将 label 0 改为 255, 其余 label 相应编号减1, 同时 decode head 里将 255 设为 ignore index, 即不参与 loss 计算.
+以下是 `reduce_zero_label` 具体实现逻辑:
 
 ```python
 if self.reduce_zero_label:
@@ -166,4 +136,4 @@ if self.reduce_zero_label:
     gt_semantic_seg[gt_semantic_seg == 254] = 255
 ```
 
-`reduce_zero_label` 常常被用来处理 label 0 是背景的数据集, 如果 `reduce_zero_label=True`, label 0 对应的像素将不会参与损失函数的计算. 需要说明的是在二值分割任务中没有必要设置 `reduce_zero_label=True`, 请采用上面我们提到的解决方案.
+需要注意的是, 使用 `reduce_zero_label` 请确认数据集原始类别个数, 如果只有两类, 需要关闭 `reduce_zero_label` 即设置 `reduce_zero_label=False`.

From 1854d192e23e63da3b2b5e6ee358f80f94688c55 Mon Sep 17 00:00:00 2001
From: MengzhangLI
Date: Mon, 24 Oct 2022 13:33:40 +0800
Subject: [PATCH 07/11] fix order

---
 docs/en/faq.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/en/faq.md b/docs/en/faq.md
index 142ba976e9..c6814a953e 100644
--- a/docs/en/faq.md
+++ b/docs/en/faq.md
@@ -69,7 +69,10 @@ python tools/test.py {config} {checkpoint} --show-dir {/path/to/save/image} --op
 
 ## How to handle binary segmentation task
 
-MMSegmentation uses `num_classes` and `out_channels` to control output of last layer `self.conv_seg` (More details could be found [here](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/decode_heads/decode_head.py).):
+MMSegmentation uses `num_classes` and `out_channels` to control output of last layer `self.conv_seg`. More details could be found [here](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/decode_heads/decode_head.py).
+
+`num_classes` should be the same as number of types of labels, in binary segmentation task, dataset only has two types of labels: foreground and background, so `num_classes=2`. `out_channels` controls the output channel of last layer of model, it usually equals to `num_classes`.
+But in binary segmentation task, there are two solutions:
 
 - Set `out_channels=2`, using Cross Entropy Loss in training, using `F.softmax()` and `argmax()` to get prediction of each pixel in inference.
 
@@ -79,9 +82,6 @@ More details about implementation could be found in [encoder_decoder.py](https:/
 
 In summary, to implement binary segmentation methods users should modify below parameters in the `decode_head` and `auxiliary_head` configs. Here is a modification example of [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py):
 
-`num_classes` should be the same as number of types of labels, in binary segmentation task, dataset only has two types of labels: foreground and background, so `num_classes=2`. `out_channels` controls the output channel of last layer of model, it usually equals to `num_classes`.
-But in binary segmentation task, there are two solutions:
-
 - (1) `num_classes=2`, `out_channels=2` and `use_sigmoid=False` in `CrossEntropyLoss`.
 
 ```python

From 35f578da51ecf54927fc4d3a7bffe74818d9b178 Mon Sep 17 00:00:00 2001
From: MengzhangLI
Date: Mon, 24 Oct 2022 14:01:40 +0800
Subject: [PATCH 08/11] fix

---
 docs/en/faq.md    | 2 --
 docs/zh_cn/faq.md | 6 ++----
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/docs/en/faq.md b/docs/en/faq.md
index c6814a953e..404ba1005c 100644
--- a/docs/en/faq.md
+++ b/docs/en/faq.md
@@ -78,8 +78,6 @@ But in binary segmentation task, there are two solutions:
 
 - Set `out_channels=1`, using Binary Cross Entropy Loss in training, using `F.sigmoid()` and `threshold` to get prediction of each pixel in inference. `threshold` is set 0.3 as default.
 
-More details about implementation could be found in [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py):
-
 In summary, to implement binary segmentation methods users should modify below parameters in the `decode_head` and `auxiliary_head` configs. Here is a modification example of [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py):
 
 - (1) `num_classes=2`, `out_channels=2` and `use_sigmoid=False` in `CrossEntropyLoss`.
diff --git a/docs/zh_cn/faq.md b/docs/zh_cn/faq.md
index b3ec9945bf..c4dda95cb5 100644
--- a/docs/zh_cn/faq.md
+++ b/docs/zh_cn/faq.md
@@ -71,17 +71,15 @@ python tools/test.py {config} {checkpoint} --show-dir {/path/to/save/image} --op
 
 MMSegmentation 使用 `num_classes` 和 `out_channels` 来控制模型最后一层 `self.conv_seg` 的输出. 更多细节可以参考 [这里](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/decode_heads/decode_head.py).
 
-`num_classes` 应该和数据集本身类别个数一致,当是二值分割时,数据集只有前景和背景两类,所以 `num_classes` 为 2。`out_channels` 控制模型最后一层的输出的通道数,通常和 `num_classes` 相等,但当二值分割时候,可以有两种处理方法,分别是:
+`num_classes` 应该和数据集本身类别个数一致,当是二值分割时,数据集只有前景和背景两类, 所以 `num_classes` 为 2. `out_channels` 控制模型最后一层的输出的通道数,通常和 `num_classes` 相等, 但当二值分割时候, 可以有两种处理方法, 分别是:
 
 - 设置 `out_channels=2`, 在训练时以 Cross Entropy Loss 作为损失函数, 在推理时使用 `F.softmax()` 归一化 logits 值, 然后通过 `argmax()` 得到每个像素的预测结果.
 
 - 设置 `out_channels=1`, 在训练时以 Binary Cross Entropy Loss 作为损失函数, 在推理时使用 `F.sigmoid()` 和 `threshold` 得到预测结果, `threshold` 默认为 0.3.
 
-更多关于实现细节可以参考 [encoder_decoder.py](https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/encoder_decoder.py):
-
 对于实现上述两种计算二值分割的方法, 需要在 `decode_head` 和 `auxiliary_head` 的配置里修改. 下面是对样例 [pspnet_unet_s5-d16.py](https://github.com/open-mmlab/mmsegmentation/blob/master/configs/_base_/models/pspnet_unet_s5-d16.py) 做出的对应修改.
 
-- (1) `num_classes=2`, `out_channels=2` 并在 `CrossEntropyLoss` 里面设置 `use_sigmoid=False`
+- (1) `num_classes=2`, `out_channels=2` 并在 `CrossEntropyLoss` 里面设置 `use_sigmoid=False`.
 
 ```python
 decode_head=dict(

From 294b93b7e7ff658e442796a0d3ff6d734f021d6d Mon Sep 17 00:00:00 2001
From: MengzhangLI
Date: Mon, 24 Oct 2022 14:03:42 +0800
Subject: [PATCH 09/11] fix

---
 docs/zh_cn/faq.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/zh_cn/faq.md b/docs/zh_cn/faq.md
index c4dda95cb5..bf8e4514ed 100644
--- a/docs/zh_cn/faq.md
+++ b/docs/zh_cn/faq.md
@@ -123,7 +123,7 @@ auxiliary_head=dict(
 
 ## `reduce_zero_label` 的作用
 
-数据集中 `reduce_zero_label`参数类型为布尔类型, 默认为 False, 它的功能是为了忽略数据集 label 0. 具体做法是将 label 0 改为 255, 其余 label 相应编号减1, 同时 decode head 里将 255 设为 ignore index, 即不参与 loss 计算.
+数据集中 `reduce_zero_label` 参数类型为布尔类型, 默认为 False, 它的功能是为了忽略数据集 label 0. 具体做法是将 label 0 改为 255, 其余 label 相应编号减 1, 同时 decode head 里将 255 设为 ignore index, 即不参与 loss 计算.
 以下是 `reduce_zero_label` 具体实现逻辑:
 
 ```python

From 56bbfcc7163f8d41c01c56b6eb08098798f1dc71 Mon Sep 17 00:00:00 2001
From: Miao Zheng <76149310+MeowZheng@users.noreply.github.com>
Date: Fri, 28 Oct 2022 21:26:34 +0800
Subject: [PATCH 10/11] Update docs/en/faq.md

---
 docs/en/faq.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/en/faq.md b/docs/en/faq.md
index 404ba1005c..2ca2fd8293 100644
--- a/docs/en/faq.md
+++ b/docs/en/faq.md
@@ -134,4 +134,4 @@ if self.reduce_zero_label:
     gt_semantic_seg[gt_semantic_seg == 254] = 255
 ```
 
-Noted that in please check out label numbers of dataset when using `reduce_zero_label`. If dataset only has two types of labels (i.e., label 0 and 1), it needs to close `reduce_zero_label`, i.e., set `reduce_zero_label=True`.
+**Noted:** Please pay attention to label numbers of dataset when using `reduce_zero_label`. If dataset only has two types of labels (i.e., label 0 and 1), it needs to close `reduce_zero_label`, i.e., set `reduce_zero_label=False`.
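
Reviewer note on the `reduce_zero_label` wording fixed in PATCH 10/11: the remapping quoted in the hunk context can be replayed on plain Python lists to see why a two-label dataset should keep `reduce_zero_label=False`. This is only a sketch that mirrors the three lines shown in the FAQ; the function name `reduce_zero_label` and the constant `IGNORE_INDEX` are illustrative and are not MMSegmentation APIs.

```python
IGNORE_INDEX = 255  # assumed ignore value, taken from the FAQ text (label 0 -> 255)


def reduce_zero_label(labels):
    """Replay the FAQ's remapping on a flat list of integer labels.

    Label 0 becomes 255, every label is then shifted down by 1, and any
    value that lands on 254 (an original 0 or 255) is restored to 255.
    """
    remapped = []
    for value in labels:
        if value == 0:
            value = IGNORE_INDEX
        value -= 1
        if value == 254:
            value = IGNORE_INDEX
        remapped.append(value)
    return remapped


# Two-label mask: background (0) is ignored and only class 0 is left to learn.
binary_mask = reduce_zero_label([0, 1, 1, 0])

# Multi-class mask: labels 1..3 shift to 0..2, label 0 and 255 are both ignored.
multi_mask = reduce_zero_label([0, 1, 2, 3, 255])
```

With only labels 0 and 1 in the dataset, every background pixel ends up as 255 and every foreground pixel as 0, so a single trainable class remains, which is consistent with the corrected advice to set `reduce_zero_label=False` in that case.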
From 84625fe9dd0b9b4e0d156b2bbb7a27907f664f9e Mon Sep 17 00:00:00 2001
From: Miao Zheng <76149310+MeowZheng@users.noreply.github.com>
Date: Fri, 28 Oct 2022 21:27:12 +0800
Subject: [PATCH 11/11] Update docs/zh_cn/faq.md

---
 docs/zh_cn/faq.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/zh_cn/faq.md b/docs/zh_cn/faq.md
index bf8e4514ed..3cab08a18a 100644
--- a/docs/zh_cn/faq.md
+++ b/docs/zh_cn/faq.md
@@ -134,4 +134,4 @@ if self.reduce_zero_label:
     gt_semantic_seg[gt_semantic_seg == 254] = 255
 ```
 
-需要注意的是, 使用 `reduce_zero_label` 请确认数据集原始类别个数, 如果只有两类, 需要关闭 `reduce_zero_label` 即设置 `reduce_zero_label=False`.
+**注意:** 使用 `reduce_zero_label` 请确认数据集原始类别个数, 如果只有两类, 需要关闭 `reduce_zero_label` 即设置 `reduce_zero_label=False`.
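
Reviewer note on the two binary segmentation options documented in PATCH 06/11: the per-pixel decision each option makes at inference can be sketched in plain Python. This is an illustration under assumptions, not the code in `encoder_decoder.py`; the function names are hypothetical, a single pixel stands in for the whole logit tensor, and the 0.3 default is the one stated in the FAQ text.

```python
import math


def predict_two_channel(logits):
    """Option (1), out_channels=2: softmax over the two logits, then argmax."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return probs.index(max(probs))


def predict_one_channel(logit, threshold=0.3):
    """Option (2), out_channels=1: sigmoid on one logit, then threshold."""
    if logit >= 0:
        prob = 1.0 / (1.0 + math.exp(-logit))
    else:
        e = math.exp(logit)  # avoids overflow for very negative logits
        prob = e / (1.0 + e)
    return 1 if prob > threshold else 0


# Same pixel, two heads: class index from argmax vs. foreground flag from threshold.
two_channel_pred = predict_two_channel([-0.5, 0.5])
one_channel_pred = predict_one_channel(-0.5)
stricter_pred = predict_one_channel(-0.5, threshold=0.5)
```

A logit of -0.5 gives a sigmoid probability of roughly 0.38, so the pixel counts as foreground at the 0.3 default but as background at a 0.5 threshold; this is the practical difference to keep in mind when choosing option (2).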