I got same as reported in Lovasz loss #1036 #1681
Comments
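Hi @pengzhao-life
Could you provide your model config? If the model is not used for binary classification, the error occurs.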
Hi, thank you so much for the response! My settings are as below:
* This is a binary semantic segmentation task: each pixel is classified as either an object pixel or a background pixel.
* Swin-B is the backbone, and UPerNet is used for segmentation.
* The config file is loaded and modified as follows:
from mmcv import Config
cfg = Config.fromfile('../configs/swin/upernet_swin_base_patch4_window12_512x512_160k_ade20k_pretrain_384x384_22K.py')
from mmseg.apis import set_random_seed
# Since we use only one GPU, BN is used instead of SyncBN
cfg.norm_cfg = dict(type='BN', requires_grad=True)
#cfg.model.backbone.norm_cfg = cfg.norm_cfg
cfg.model.decode_head.norm_cfg = cfg.norm_cfg
cfg.model.auxiliary_head.norm_cfg = cfg.norm_cfg
# modify num classes of the model in decode/auxiliary head
cfg.model.decode_head.num_classes = 2
cfg.model.auxiliary_head.num_classes = 2
# Modify dataset type and path
cfg.dataset_type = 'StanfordBackgroundDataset'
cfg.data_root = data_root
cfg.data.samples_per_gpu = 8
cfg.data.workers_per_gpu = 8
crop_size = (384, 384)  # (256, 256)
image_scale = (512, 512)  # (320, 240)
cfg.img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
cfg.crop_size = crop_size
cfg.train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=image_scale, ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=cfg.crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **cfg.img_norm_cfg),
    dict(type='Pad', size=cfg.crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
cfg.test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=image_scale,
        # img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75],
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **cfg.img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
cfg.data.train.type = cfg.dataset_type
cfg.data.train.data_root = cfg.data_root
cfg.data.train.img_dir = img_dir
cfg.data.train.ann_dir = ann_dir
cfg.data.train.pipeline = cfg.train_pipeline
cfg.data.train.split = 'splits/train.txt'
cfg.data.val.type = cfg.dataset_type
cfg.data.val.data_root = cfg.data_root
cfg.data.val.img_dir = img_dir
cfg.data.val.ann_dir = ann_dir
cfg.data.val.pipeline = cfg.test_pipeline
cfg.data.val.split = 'splits/val.txt'
cfg.data.test.type = cfg.dataset_type
cfg.data.test.data_root = cfg.data_root
cfg.data.test.img_dir = img_dir
cfg.data.test.ann_dir = ann_dir
cfg.data.test.pipeline = cfg.test_pipeline
cfg.data.test.split = 'splits/val.txt'
# Load the pre-trained UPerNet Swin-B checkpoint (trained on ADE20K) as the
# starting point for fine-tuning
cfg.load_from = '../../../../checkpoints/upernet_swin_base_patch4_window12_512x512_160k_ade20k_pretrain_384x384_22K_20210531_125459-429057bf.pth'
# Set up working dir to save files and logs.
cfg.work_dir = '/tmp/peng-tmp'
cfg.runner.max_iters = 10000
cfg.log_config.interval = 10
cfg.evaluation.interval = 200
cfg.checkpoint_config.interval = 200
# Set seed to facilitate reproducing the result
cfg.seed = 0
set_random_seed(0, deterministic=False)
cfg.gpu_ids = range(1)
# Let's have a look at the final config used for training
print(f'Config:\n{cfg.pretty_text}')
* In 'upernet_swin.py', the loss function is modified as below:
norm_cfg = dict(type='SyncBN', requires_grad=True)
backbone_norm_cfg = dict(type='LN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='SwinTransformer',
        pretrain_img_size=224,
        embed_dims=96,
        patch_size=4,
        window_size=7,
        mlp_ratio=4,
        depths=[2, 2, 6, 2],
        num_heads=[3, 6, 12, 24],
        strides=(4, 2, 2, 2),
        out_indices=(0, 1, 2, 3),
        qkv_bias=True,
        qk_scale=None,
        patch_norm=True,
        drop_rate=0.,
        attn_drop_rate=0.,
        drop_path_rate=0.3,
        use_abs_pos_embed=False,
        act_cfg=dict(type='GELU'),
        norm_cfg=backbone_norm_cfg),
    decode_head=dict(
        type='UPerHead',
        in_channels=[96, 192, 384, 768],
        in_index=[0, 1, 2, 3],
        pool_scales=(1, 2, 3, 6),
        channels=512,
        dropout_ratio=0.1,
        num_classes=19,
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='LovaszLoss',
            loss_type='binary',
            reduction='none',
            loss_weight=1.0,
            loss_name='loss_lovasz')),
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=384,
        in_index=2,
        channels=256,
        num_convs=1,
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=19,
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='LovaszLoss',
            loss_type='binary',
            reduction='none',
            loss_weight=0.4,
            loss_name='loss_lovasz')),
    # model training and testing settings
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))
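The config above pairs `loss_type='binary'` with heads whose `num_classes` is overridden to 2 in the notebook. A quick consistency check could flag that combination before training starts. The sketch below is a hypothetical helper (`find_binary_lovasz_mismatches` is not part of mmsegmentation); it walks a plain-dict model config and assumes that the binary Lovász loss expects exactly one output channel and that each head is a single dict rather than a list.

```python
def find_binary_lovasz_mismatches(model_cfg):
    """Return names of heads that use binary LovaszLoss with num_classes != 1.

    Hypothetical helper, not an mmsegmentation API. Assumes the binary
    Lovasz loss expects single-channel logits (num_classes=1).
    """
    mismatches = []
    for head_name in ('decode_head', 'auxiliary_head'):
        head = model_cfg.get(head_name)
        if head is None:
            continue
        loss = head.get('loss_decode', {})
        is_binary_lovasz = (loss.get('type') == 'LovaszLoss'
                            and loss.get('loss_type') == 'binary')
        if is_binary_lovasz and head.get('num_classes') != 1:
            mismatches.append(head_name)
    return mismatches


# Trimmed-down version of the config above, with num_classes overridden to 2
# as done in the notebook.
model_cfg = dict(
    decode_head=dict(
        type='UPerHead',
        num_classes=2,
        loss_decode=dict(type='LovaszLoss', loss_type='binary')),
    auxiliary_head=dict(
        type='FCNHead',
        num_classes=2,
        loss_decode=dict(type='LovaszLoss', loss_type='binary')))

print(find_binary_lovasz_mismatches(model_cfg))
```

With both heads set to two classes, the helper reports both of them; setting `num_classes=1` on a head removes it from the list.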
* Train and evaluation:
import os.path as osp

import mmcv
from mmseg.apis import train_segmentor
from mmseg.datasets import build_dataset
from mmseg.models import build_segmentor

# Build the dataset
datasets = [build_dataset(cfg.data.train)]
# Build the segmentor
model = build_segmentor(cfg.model)
# Add an attribute for visualization convenience
model.CLASSES = datasets[0].CLASSES
# Create work_dir
mmcv.mkdir_or_exist(osp.abspath(cfg.work_dir))
train_segmentor(model, datasets, cfg, distributed=False, validate=True,
                meta=dict())
Thanks,
Peng
On Monday, June 20, 2022 11:28 PM, 谢昕辰 wrote:
Hi @pengzhao-life
Could you provide your model config? If the model is not used for binary classification, the error occurs.
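The remark that the error occurs when the model is not set up for binary classification can be illustrated with element counts alone. The sketch below assumes the binary Lovász branch flattens logits and labels to 1-D and then indexes the logits with a boolean mask derived from the labels, which is what the IndexError message suggests. `flatten_binary` is a hypothetical stand-in for that step, not the mmsegmentation function.

```python
def flatten_binary(logit_shape, label_shape):
    """Mimic flattening logits and labels and masking logits by a label mask.

    logit_shape is (N, C, H, W) and label_shape is (N, H, W). Only element
    counts are tracked, which is enough to reproduce the mismatch.
    """
    n, c, h, w = logit_shape
    num_logits = n * c * h * w
    num_labels = label_shape[0] * label_shape[1] * label_shape[2]
    if num_logits != num_labels:
        # A boolean mask must have as many elements as the tensor it indexes.
        raise IndexError(
            f'The shape of the mask [{num_labels}] at index 0 does not match '
            f'the shape of the indexed tensor [{num_logits}] at index 0')
    return num_logits


# One output channel: logits and labels have the same number of elements.
print(flatten_binary((8, 1, 384, 384), (8, 384, 384)))

# Two output channels: the flattened logits are twice as long as the mask.
try:
    flatten_binary((8, 2, 384, 384), (8, 384, 384))
except IndexError as err:
    print('IndexError:', err)
```

Under that assumption, any `num_classes` other than 1 makes the flattened logits longer than the label mask, which matches the reported symptom.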
As for binary segmentation, num_classes should be set to 1 rather than 2. You can try to modify the config from configs/_base_/models/fcn_unet_s5-d16.py like:
# model settings
norm_cfg = dict(type='SyncBN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='UNet',
        in_channels=3,
        base_channels=64,
        num_stages=5,
        strides=(1, 1, 1, 1, 1),
        enc_num_convs=(2, 2, 2, 2, 2),
        dec_num_convs=(2, 2, 2, 2),
        downsamples=(True, True, True, True),
        enc_dilations=(1, 1, 1, 1, 1),
        dec_dilations=(1, 1, 1, 1),
        with_cp=False,
        conv_cfg=None,
        norm_cfg=norm_cfg,
        act_cfg=dict(type='ReLU'),
        upsample_cfg=dict(type='InterpConv'),
        norm_eval=False),
    decode_head=dict(
        type='FCNHead',
        in_channels=64,
        in_index=4,
        channels=64,
        num_convs=1,
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=1,
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='LovaszLoss', loss_type='binary', reduction='none', loss_weight=1.0)),
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=128,
        in_index=3,
        channels=64,
        num_convs=1,
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=1,
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(
            type='LovaszLoss', loss_type='binary', reduction='none', loss_weight=0.4)),
    # model training and testing settings
    train_cfg=dict(),
    test_cfg=dict(mode='slide', crop_size=256, stride=170))
Hi,
Thanks for the response! I set num_classes=1 in configs/_base_/models/fcn_unet_s5-d16.py. I also had to reduce 'CLASSES' to a single element, as below; otherwise it throws an error because the number of classes does not match. It runs, but the segmentation result is always the whole image. My annotated image is attached: white pixels are the region of interest, and black is the background. Can you tell what went wrong? Thanks!
Peng
import os.path as osp

from mmseg.datasets.builder import DATASETS
from mmseg.datasets.custom import CustomDataset

@DATASETS.register_module()
class StanfordBackgroundDataset(CustomDataset):
    CLASSES = ('b',)  # some class name; the trailing comma makes this a tuple
    PALETTE = [[255, 0, 0]]  # some color

    def __init__(self, split, **kwargs):
        super().__init__(img_suffix='.png', seg_map_suffix='.png',
                         split=split, **kwargs)
        assert osp.exists(self.img_dir) and self.split is not None
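Two things may be worth checking for the "whole image" result; both are assumptions about the setup rather than confirmed causes. First, if prediction takes an argmax over the channel dimension, a single-channel output always yields index 0, so every pixel receives the one class; thresholding the sigmoid of the logit is the usual alternative. Second, if the masks store white as 255, those pixels may coincide with the ignore value (the training pipeline above pads labels with `seg_pad_val=255`), so remapping them to 1 may be needed. The helper names below are hypothetical and the code is plain Python, not mmsegmentation code.

```python
import math


def predict_argmax(pixel_logits):
    """Return the index of the largest logit for one pixel."""
    return max(range(len(pixel_logits)), key=lambda i: pixel_logits[i])


def predict_threshold(pixel_logit, threshold=0.5):
    """Return 1 if sigmoid(logit) exceeds the threshold, else 0."""
    prob = 1.0 / (1.0 + math.exp(-pixel_logit))
    return 1 if prob > threshold else 0


def remap_mask(mask, foreground_value=255):
    """Map a mask stored as 0/foreground_value to 0/1 class indices."""
    return [[1 if v == foreground_value else 0 for v in row] for row in mask]


single_channel_logits = [-3.2, 0.4, 2.5]

# With one channel there is only one index to pick, whatever the logit is.
print([predict_argmax([x]) for x in single_channel_logits])

# Thresholding the sigmoid separates negative from positive logits.
print([predict_threshold(x) for x in single_channel_logits])

# A 0/255 annotation becomes a 0/1 label map.
print(remap_mask([[0, 255], [255, 0]]))
```

If either assumption holds for the setup in this thread, the fix would sit in the prediction step or in the annotation files rather than in the loss config.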
On Tuesday, June 21, 2022 1:37 AM, MengzhangLI wrote:
As for binary segmentation, num_classes should be set to 1 rather than 2. You can try to modify the config from configs/_base_/models/fcn_unet_s5-d16.py as shown in the comment above.
Sorry for late reply. Could you please try to use |
Hi, I ran into the same problem as in the issue linked below: when using the Lovasz loss with loss_type='binary', the following error occurs:
IndexError: The shape of the mask [....] at index 0 does not match the shape of the indexed tensor [....] at index 0.
I noticed that #1036 has been closed. I wonder how it was handled. Thanks!
#1036
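For context on the loss in question, here is a minimal pure-Python sketch of the binary Lovász hinge on flat lists, following the published algorithm: sort the hinge errors in decreasing order and weight them by the gradient of the Lovász extension of the Jaccard loss. It is an illustration only, not the mmsegmentation implementation, and the function names are chosen for this example. Note that it takes one logit per label, which is why a single output channel is expected.

```python
def lovasz_grad(gt_sorted):
    """Gradient of the Lovasz extension w.r.t. errors sorted in decreasing order."""
    total_fg = sum(gt_sorted)
    grads = []
    cum_fg = 0  # foreground labels seen so far
    cum_bg = 0  # background labels seen so far
    prev_jaccard = 0.0
    for gt in gt_sorted:
        cum_fg += gt
        cum_bg += 1 - gt
        intersection = total_fg - cum_fg
        union = total_fg + cum_bg
        jaccard = 1.0 - intersection / union
        grads.append(jaccard - prev_jaccard)
        prev_jaccard = jaccard
    return grads


def lovasz_hinge_flat(logits, labels):
    """Binary Lovasz hinge for flat lists of logits and 0/1 labels."""
    if len(logits) != len(labels):
        raise ValueError('logits and labels must have the same length')
    if not labels:
        return 0.0
    signs = [2 * y - 1 for y in labels]  # map {0, 1} to {-1, +1}
    errors = [1.0 - x * s for x, s in zip(logits, signs)]
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    errors_sorted = [errors[i] for i in order]
    gt_sorted = [labels[i] for i in order]
    grads = lovasz_grad(gt_sorted)
    # Only positive hinge errors contribute to the loss.
    return sum(max(e, 0.0) * g for e, g in zip(errors_sorted, grads))


# Confident, correct predictions give zero loss.
print(lovasz_hinge_flat([5.0, -5.0, 5.0], [1, 0, 1]))

# Predictions with the wrong sign are penalized.
print(lovasz_hinge_flat([-1.0, 1.0], [1, 0]))
```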