[Hackathon 5th No.68] Lightweight semantic segmentation network PIDNet #3548

Merged
merged 14 commits into from
Nov 8, 2023

Conversation

flytocc
Contributor

@flytocc flytocc commented Oct 29, 2023

PR types

New features

PR changes

Task PaddlePaddle/Paddle#57262
RPC PaddlePaddle/community#722

Description

  1. Added the PIDNet network
  2. Added a new edge-generation method, the AddEdgeLabel transform
  3. Fixed a bug in CrossEntropyLoss
  4. Added weight support to OhemCrossEntropyLoss

Used AI Studio for training. Thanks a lot!

@paddle-bot
paddle-bot bot commented Oct 29, 2023

Thanks for your contribution!

Collaborator

@shiyutang shiyutang left a comment

Thanks for the contribution. 1. There are a few detail issues; 2. a README needs to be added under the configs directory; 3. please confirm the training parameters, transforms and other settings, and once they are confirmed we will run the training on our side. (4. Before training, please provide the backbone's pretrained weights.)

configs/pidnet/pidnet_s_cityscapes_1024x1024_120k.yml (outdated, resolved)
configs/pidnet/pidnet_s_cityscapes_1024x1024_120k.yml (outdated, resolved)
paddleseg/models/backbones/pidnet.py (outdated, resolved)
paddleseg/models/backbones/pidnet.py (outdated, resolved)
paddleseg/models/losses/cross_entropy_loss.py (resolved)
paddleseg/transforms/transforms.py (outdated, resolved)
@flytocc
Contributor Author

flytocc commented Oct 30, 2023

Here are the imagenet_pretrained weights and the 2xb6-120k_1024x1024-cityscapes weights:

Baidu Netdisk

@shiyutang
Collaborator

> Here are the imagenet_pretrained weights and the 2xb6-120k_1024x1024-cityscapes weights:
>
> Baidu Netdisk

image

@flytocc
Contributor Author

flytocc commented Nov 1, 2023

>> Here are the imagenet_pretrained weights and the 2xb6-120k_1024x1024-cityscapes weights:
>> Baidu Netdisk
>
> image

It should be there now.

Collaborator

@shiyutang shiyutang left a comment

There is a problem with the model weights; please resubmit them to me. I'll send you the link.

Comment on lines +1261 to +1269
kernel = np.ones((self.edge_size, self.edge_size), np.uint8)
edge = np.pad(
    edge[self.y_k_size:-self.y_k_size, self.x_k_size:-self.x_k_size],
    ((self.y_k_size, self.y_k_size), (self.x_k_size, self.x_k_size)),
    mode='constant')
edge = (cv2.dilate(edge, kernel, iterations=1) > 50) * 1.0

data['gt_fields'].append('edge')
data['edge'] = edge
Collaborator

This is faster than mask_to_binary_edge because it skips the one-hot step.
But does it still produce the edges of every class, given that the labels are 0-18?

Contributor Author

@flytocc flytocc Nov 1, 2023

mask_to_binary_edge first builds a one-hot encoding and then derives edges from it, but in the end it merges the edges of all classes into one map, so building the one-hot encoding seems unnecessary.
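
To illustrate the point about one-hot encoding, here is a minimal NumPy sketch. It is not the AddEdgeLabel transform from this PR nor mask_to_binary_edge; the function name `binary_edge_from_labels` is hypothetical. It only shows that a class-agnostic edge map can be read directly from the class-id map, since the union of all per-class boundaries is the set of pixels whose neighbour carries a different id.

```python
import numpy as np


def binary_edge_from_labels(label: np.ndarray) -> np.ndarray:
    """Mark pixels that touch a horizontal or vertical neighbour with a different class id.

    Illustrative only: no one-hot encoding is built, the class-id map is compared
    with shifted copies of itself.
    """
    edge = np.zeros(label.shape, dtype=bool)
    diff_h = label[:, 1:] != label[:, :-1]  # left/right neighbours differ
    diff_v = label[1:, :] != label[:-1, :]  # top/bottom neighbours differ
    edge[:, 1:] |= diff_h
    edge[:, :-1] |= diff_h
    edge[1:, :] |= diff_v
    edge[:-1, :] |= diff_v
    return edge.astype(np.float32)


label = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [2, 2, 2, 2]], dtype=np.uint8)
edge = binary_edge_from_labels(label)
print(edge)
```

The edge width here is fixed at one pixel on each side of a boundary; the dilation step in the reviewed snippet is what widens it in the actual transform.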

paddleseg/models/losses/cross_entropy_loss.py (resolved)
@flytocc
Contributor Author

flytocc commented Nov 1, 2023

> There is a problem with the model weights; please resubmit them to me. I'll send you the link.

What is the problem?

@flytocc flytocc requested a review from shiyutang November 1, 2023 09:45
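@shiyutang
Collaborator

>> There is a problem with the model weights; please resubmit them to me. I'll send you the link.
>
> What is the problem?

Never mind, let me verify first.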

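@flytocc
Contributor Author

flytocc commented Nov 1, 2023

I ran one training on 2 V100s and got 78.34% mIoU; the official weights reach 78.74% mIoU, a gap of about 0.4%.

The current implementation differs from the official one in the following ways:

  1. Labels: PaddleSeg uses *_labelTrainIds.png, PIDNet uses *_labelIds.png; the two differ slightly
  2. Augmentation padding_value: PaddleSeg pads with 127.5, PIDNet pads with 0
  3. Augmentation img resize: in PaddleSeg the img is float32, in PIDNet it is uint8. This difference is small and should not affect training
  4. BatchNorm: PaddleSeg uses SyncBatchNorm, PIDNet uses plain BatchNorm2d

I also tried the RandomDistort augmentation, with essentially no improvement (78.35).

If you want to run the training, you can set padding_value to 0: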

train_dataset:
  transforms:
    - type: AddEdgeLabel
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.1
      scale_step_size: 0.1
    - type: RandomPaddingCrop
      crop_size: [1024, 1024]
      im_padding_value: 0   # <---------- here
    - type: RandomHorizontalFlip
    - type: Normalize
      mean: *mean
      std: *std
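
As a rough illustration of what the image padding value changes, the sketch below pads a small image and its label map with NumPy. It is not PaddleSeg's RandomPaddingCrop: the helper `pad_to_size`, the bottom/right placement and the label padding value of 255 are assumptions made for the example.

```python
import numpy as np


def pad_to_size(img, label, size, im_padding_value=127.5, label_padding_value=255):
    """Pad an HxWxC image and an HxW label map at the bottom/right up to size=(h, w).

    Hypothetical helper for illustration; the placement of the padding is an assumption.
    """
    pad_h = max(size[0] - img.shape[0], 0)
    pad_w = max(size[1] - img.shape[1], 0)
    img = np.pad(img, ((0, pad_h), (0, pad_w), (0, 0)),
                 mode="constant", constant_values=im_padding_value)
    label = np.pad(label, ((0, pad_h), (0, pad_w)),
                   mode="constant", constant_values=label_padding_value)
    return img, label


img = np.full((2, 2, 3), 200.0, dtype=np.float32)
label = np.zeros((2, 2), dtype=np.uint8)

# Same input, two image padding values: 0 (PIDNet) and 127.5 (PaddleSeg).
img_zero, label_zero = pad_to_size(img, label, (3, 4), im_padding_value=0)
img_gray, label_gray = pad_to_size(img, label, (3, 4), im_padding_value=127.5)
print(img_zero[2, 3], img_gray[2, 3], label_zero[2, 3])
```

Only the padded image pixels differ between the two runs; the padded label pixels are the same in both.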

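@shiyutang
Collaborator

shiyutang commented Nov 2, 2023

OK, I'll train with padding=0. Could you look into the effect of the labels on your side? As far as I can tell the labels should not matter: labelIds is the full 33-class data, but the PIDNet source code is a 19-class model. Please check how the model and data classes are matched if no conversion is done. If they can be matched, you could also train with labelIds to explore the effect. https://www.cnblogs.com/dotman/p/cityscapes_dataset_tips.html
image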

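@flytocc
Contributor Author

flytocc commented Nov 2, 2023

> OK, I'll train with padding=0. Could you look into the effect of the labels on your side? As far as I can tell the labels should not matter: labelIds is the full 33-class data, but the PIDNet source code is a 19-class model. Please check how the model and data classes are matched if no conversion is done.

The 33-class *_labelIds.png needs to be converted to 19 classes.

https://github.com/mcordts/cityscapesScripts/blob/a7ac7b4062d1a80ed5e22d2ea2179c886801c77d/cityscapesscripts/helpers/labels.py#L62-L99

For reference, following PIDNet's implementation: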

import numpy as np


def convert_label(label: np.ndarray, ignore_label: int = 255) -> np.ndarray:
    # Map Cityscapes labelIds to the 19 trainIds; all other ids become ignore_label.
    label_mapping = {-1: ignore_label, 0: ignore_label,
                     1: ignore_label, 2: ignore_label,
                     3: ignore_label, 4: ignore_label,
                     5: ignore_label, 6: ignore_label,
                     7: 0, 8: 1, 9: ignore_label,
                     10: ignore_label, 11: 2, 12: 3,
                     13: 4, 14: ignore_label, 15: ignore_label,
                     16: ignore_label, 17: 5, 18: ignore_label,
                     19: 6, 20: 7, 21: 8, 22: 9, 23: 10, 24: 11,
                     25: 12, 26: 13, 27: 14, 28: 15,
                     29: ignore_label, 30: ignore_label,
                     31: 16, 32: 17, 33: 18}

    temp = label.copy()
    # k is the original labelId, v is the target trainId.
    for k, v in label_mapping.items():
        temp[label == k] = v
    return temp

*_labelIds.png ships with the dataset, and I'm not sure exactly how it is generated. After conversion to 19 classes it differs slightly from *_labelTrainIds.png.
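
The same labelId-to-trainId conversion can also be sketched as a single lookup-table indexing step. This is an alternative illustration, not code from PIDNet or PaddleSeg: `convert_label_lut` and `ID_TO_TRAIN_ID` are hypothetical names, the pairs are copied from the mapping quoted above, and the label map is assumed to be a uint8 array.

```python
import numpy as np

IGNORE_LABEL = 255

# labelId -> trainId pairs taken from the mapping quoted above;
# every id not listed here falls back to IGNORE_LABEL.
ID_TO_TRAIN_ID = {7: 0, 8: 1, 11: 2, 12: 3, 13: 4, 17: 5, 19: 6, 20: 7,
                  21: 8, 22: 9, 23: 10, 24: 11, 25: 12, 26: 13, 27: 14,
                  28: 15, 31: 16, 32: 17, 33: 18}


def convert_label_lut(label: np.ndarray, ignore_label: int = IGNORE_LABEL) -> np.ndarray:
    """Convert a uint8 labelIds map to trainIds with one table lookup."""
    lut = np.full(256, ignore_label, dtype=np.uint8)
    for label_id, train_id in ID_TO_TRAIN_ID.items():
        lut[label_id] = train_id
    return lut[label]  # fancy indexing applies the table to every pixel


label_ids = np.array([[0, 7, 8],
                      [26, 33, 29]], dtype=np.uint8)
train_ids = convert_label_lut(label_ids)
print(train_ids)
```

Because every pixel is translated in one pass, values written early can never be remapped a second time, which a loop that edits the array in place has to guard against.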

> If they can be matched, you could also train with labelIds to explore the effect. https://www.cnblogs.com/dotman/p/cityscapes_dataset_tips.html image

I'm currently training with the converted *_labelIds.png.

@flytocc
Contributor Author

flytocc commented Nov 2, 2023

The model link is below. Also, could you send me your training log?

https://pan.baidu.com/s/15HZ46FAJCNes8sxUDj36lQ?pwd=gvg9

@shiyutang
Collaborator

shiyutang commented Nov 3, 2023

After switching to padding=0 and training, the accuracy still falls somewhat short. I'll change the padding and train again.
image

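@flytocc
Contributor Author

flytocc commented Nov 3, 2023

> After switching to padding=0 and training, the accuracy still falls somewhat short. I'll retrain with padding != 0. image

77.74% is indeed a bit low. With padding=127.5 I ran three times in total and got 78.32, 78.34 and 78.35.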

@shiyutang
Collaborator

shiyutang commented Nov 6, 2023

I retrained with the default pad value, and the accuracy gap narrowed.
image

Collaborator

@shiyutang shiyutang left a comment

configs/pidnet/README.md (outdated, resolved)
configs/pidnet/pidnet_large_cityscapes_1024x1024_120k.yml (outdated, resolved)
@flytocc
Contributor Author

flytocc commented Nov 6, 2023

There is a problem with TIPC.

With paddle 2.5.1, export.py raises an error:

Traceback (most recent call last):
  File "/home/flytocc/workspaces/PaddleSeg/tools/export.py", line 121, in <module>
    main(args)
  File "/home/flytocc/workspaces/PaddleSeg/tools/export.py", line 89, in main
    paddle.jit.save(model, os.path.join(args.save_dir, save_name))
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/api.py", line 752, in wrapper
    func(layer, path, input_spec, **configs)
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/fluid/dygraph/base.py", line 75, in __impl__
    return func(*args, **kwargs)
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/api.py", line 1043, in save
    static_func.concrete_program_specify_input_spec(
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/program_translator.py", line 709, in concrete_program_specify_input_spec
    concrete_program, _ = self.get_concrete_program(
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/program_translator.py", line 589, in get_concrete_program
    concrete_program, partial_program_layer = self._program_cache[
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/program_translator.py", line 1249, in __getitem__
    self._caches[item_id] = self._build_once(item)
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/program_translator.py", line 1193, in _build_once
    concrete_program = ConcreteProgram.from_func_spec(
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/fluid/dygraph/base.py", line 75, in __impl__
    return func(*args, **kwargs)
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/program_translator.py", line 1070, in from_func_spec
    error_data.raise_new_exception()
  File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/error.py", line 452, in raise_new_exception
    raise new_exception from None
ValueError: In transformed code:

    File "/home/flytocc/workspaces/PaddleSeg/paddleseg/deploy/export.py", line 26, in forward
        outs = self.model(x)
    File "/home/flytocc/workspaces/PaddleSeg/paddleseg/models/pidnet.py", line 69, in forward
        feat = self.backbone(x)
    File "/home/flytocc/workspaces/PaddleSeg/paddleseg/models/backbones/pidnet.py", line 681, in forward
        # stage 0-2
        x = self.stem(x)
        ~~~~~~~~~~~~~~~~ <--- HERE

        # stage 3

    File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/convert_call_func.py", line 222, in convert_call
        if is_builtin(func) or is_unsupported(func):
    File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/convert_call_func.py", line 122, in is_unsupported
        func_in_dict = func == v

    ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (6,) + inhomogeneous part.
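
The final message in this traceback is a NumPy error, not a Paddle one: since NumPy 1.24, building an array from sequences of unequal length raises this ValueError unless dtype=object is given. The standalone sketch below only reproduces that NumPy behaviour; which objects end up being compared inside the export path is not established here, and `try_build` is a hypothetical helper.

```python
import numpy as np

# Two rows of different length: there is no regular 2-D shape for them.
ragged = [np.zeros(2), np.zeros(3)]


def try_build(rows):
    """Return True if NumPy refuses to build a regular array from rows."""
    try:
        np.array(rows)
    except ValueError:
        return True
    return False


# True on NumPy >= 1.24; older releases only emit a deprecation warning.
refused = try_build(ragged)

# Asking for an object array explicitly is accepted on old and new releases.
as_object = np.array(ragged, dtype=object)
print(refused, as_object.shape)
```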

With paddle 2.5.2, export.py does not raise an error, but infer.py does:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/usr/lib/python3.8/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/usr/lib/python3.8/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
Traceback (most recent call last):
  File "deploy/python/infer.py", line 396, in <module>
    main(args)
  File "deploy/python/infer.py", line 384, in main
    predictor.run(imgs_list)
  File "deploy/python/infer.py", line 219, in run
    self.predictor.run()
ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 32, 128, 256] and the shape of Y = [1, 32, 128, 128]. Received [256] in X is not equal to [128] in Y at i:3.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ../paddle/phi/kernels/funcs/common_shape.h:86)
  [operator < elementwise_mul > error]
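
The hint in this error spells out the broadcasting condition: on each axis the two sizes must be equal, or one of them must be at most 1. A plain-Python sketch of that check follows, assuming shapes are aligned from the trailing axis; `broadcast_mismatch` is a hypothetical helper, not a Paddle API.

```python
from itertools import zip_longest


def broadcast_mismatch(x_shape, y_shape):
    """Return (axis, x_dim, y_dim) for the trailing-most axis that cannot be
    broadcast, or None if the shapes are compatible.

    Shapes are aligned from the last axis; missing leading axes count as 1.
    """
    ndim = max(len(x_shape), len(y_shape))
    pairs = zip_longest(reversed(x_shape), reversed(y_shape), fillvalue=1)
    for offset, (x_dim, y_dim) in enumerate(pairs):
        if not (x_dim == y_dim or x_dim <= 1 or y_dim <= 1):
            return (ndim - 1 - offset, x_dim, y_dim)
    return None


# Shapes from the elementwise_mul error above: the last axis is 256 vs 128.
print(broadcast_mismatch((1, 32, 128, 256), (1, 32, 128, 128)))
# A size-1 axis broadcasts, so these shapes are compatible.
print(broadcast_mismatch((1, 32, 128, 256), (1, 32, 128, 1)))
```

For the shapes in the error, the check flags axis 3, matching the "at i:3" in the message.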

@flytocc flytocc requested a review from shiyutang November 6, 2023 10:06
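@flytocc
Contributor Author

flytocc commented Nov 6, 2023

> If they can be matched, you could also train with labelIds to explore the effect.

Training has finished: mIoU is 78.40%, about 0.05% above the labelTrainIds result, so the labels make essentially no difference.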

Collaborator

@shiyutang shiyutang left a comment

Could you retry export and inference on paddle develop? It looks like a shape mismatch, which is probably a problem in the dynamic-to-static conversion.

Also, the README needs to be completed.

Comment on lines 14 to 15
|PIDNet|PIDNet_Medium|1024x1024|120000|80.22%|82.05%|[model](https://paddleseg.bj.bcebos.com/dygraph/pidnet/pidnet_medium_2xb6-120k_1024x1024-cityscapes.pdparams)|
|PIDNet|PIDNet-Large |1024x1024|120000|80.89%|82.37%|[model](https://paddleseg.bj.bcebos.com/dygraph/pidnet/pidnet_large_2xb6-120k_1024x1024-cityscapes.pdparams)|
Collaborator

Both of these are converted models rather than trained ones; it would be best to distinguish them.

Contributor Author

Neither the m nor the l model has been trained. Shall I leave them blank for now?

Collaborator

Sure, they can be added once they are trained.

configs/pidnet/README.md (outdated, resolved)
@shiyutang
Collaborator

> With paddle 2.5.1, export.py did not raise an error

Are both of these 2.5.1?

@flytocc
Contributor Author

flytocc commented Nov 7, 2023

>> With paddle 2.5.1, export.py did not raise an error
>
> Are both of these 2.5.1?

My mistake: export.py fails on 2.5.1 and infer.py fails on 2.5.2. I'll try develop later.

@flytocc flytocc requested a review from shiyutang November 7, 2023 12:54
Collaborator

@shiyutang shiyutang left a comment

LGTM

@shiyutang shiyutang merged commit 12ca7e3 into PaddlePaddle:develop Nov 8, 2023
@flytocc flytocc deleted the PIDNet branch November 8, 2023 10:19
@shiyutang shiyutang mentioned this pull request Nov 9, 2023
@pobi123

pobi123 commented Apr 25, 2024

When training with paddleseg I chose pp_liteseg_stdc1_camvid_960x720_10k. Training failed with this error:
ValueError: (InvalidArgument) Broadcast dimension mismatch.
Operands could not be broadcast together with the shape of X = [20, 4147200] and the shape of Y = [12441600].
Received [4147200] in X is not equal to [12441600] in Y at i:1.
What is the cause?

@flytocc
Contributor Author

flytocc commented Apr 26, 2024

@pobi123 This is the PIDNet PR; for any pp_liteseg problems I suggest opening an issue directly.

> When training with paddleseg I chose pp_liteseg_stdc1_camvid_960x720_10k. Training failed with this error: ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [20, 4147200] and the shape of Y = [12441600]. Received [4147200] in X is not equal to [12441600] in Y at i:1. What is the cause?

@Yana990

Yana990 commented Jun 16, 2024

500/500 [==============================] - 54s 107ms/step - batch_cost: 0.1071 - reader cost: 0.0861
Traceback (most recent call last):
File "train.py", line 230, in <module>
main(args)
File "train.py", line 206, in main
train(
File "/home/PaddleSeg-release-2.5/paddleseg/core/train.py", line 285, in train
mean_iou, acc, _, _, _ = evaluate(
File "/home/PaddleSeg-release-2.5/paddleseg/core/val.py", line 223, in evaluate
acc, class_precision, class_recall = metrics.class_measurement(
File "/home/PaddleSeg-release-2.5/paddleseg/utils/metrics.py", line 226, in class_measurement
return mean_acc, np.array(class_precision), np.array(class_recall)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (19,) + inhomogeneous part.

How can the error above be resolved?

Labels
contributor Contribution from developers
4 participants