[Hackathon 5th No.68] Lightweight semantic segmentation network PIDNet #3548
Thanks for your contribution!
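Thanks for the contribution. 1. There are some issues in the details; 2. a README needs to be added under the configs directory; 3. please also confirm the training parameters, transforms, and other settings, and once they are confirmed we will run the training on our side. (4. Before training, please provide the pretrained backbone weights.)
Here are the imagenet_pretrained weights and the 2xb6-120k_1024x1024-cityscapes weights
They should be there now.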
There is a problem with the model weights. Please resubmit them to me; I will send you the link.
kernel = np.ones((self.edge_size, self.edge_size), np.uint8)
edge = np.pad(
    edge[self.y_k_size:-self.y_k_size, self.x_k_size:-self.x_k_size],
    ((self.y_k_size, self.y_k_size), (self.x_k_size, self.x_k_size)),
    mode='constant')
edge = (cv2.dilate(edge, kernel, iterations=1) > 50) * 1.0

data['gt_fields'].append('edge')
data['edge'] = edge
This is faster than mask_to_binary_edge because it skips the one-hot step.
But does the edge obtained this way contain the edge of every class, given that the labels are 0-18?
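mask_to_binary_edge first builds the one-hot encoding and then derives the edges from it, but in the end it merges the edges of all classes together, so building the one-hot encoding seems unnecessary.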
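To illustrate the point, here is a minimal NumPy sketch (not the code in this PR; `label_to_binary_edge` is a hypothetical name) that marks class boundaries directly on the label map without a one-hot step. Since the per-class edges are merged at the end anyway, comparing neighbouring label values should yield the same kind of class-agnostic edge:

```python
import numpy as np


def label_to_binary_edge(label: np.ndarray) -> np.ndarray:
    """Hypothetical helper: 1.0 where a pixel touches a different class, else 0.0."""
    edge = np.zeros(label.shape, dtype=bool)
    # Horizontal neighbours with different class ids
    diff_h = label[:, 1:] != label[:, :-1]
    # Vertical neighbours with different class ids
    diff_v = label[1:, :] != label[:-1, :]
    # Mark both pixels on each side of a boundary
    edge[:, 1:] |= diff_h
    edge[:, :-1] |= diff_h
    edge[1:, :] |= diff_v
    edge[:-1, :] |= diff_v
    return edge.astype(np.float32)


label = np.array([[0, 0, 1],
                  [0, 0, 1],
                  [2, 2, 2]], dtype=np.uint8)
edge = label_to_binary_edge(label)
```

Only the top-left pixel has no neighbour of another class here, so it is the only non-edge pixel in the result.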
What is the problem?
Never mind, I will verify it first.
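I ran one training with 2 V100 GPUs. The current implementation differs from the official one as follows:
I also tried the RandomDistort augmentation, with essentially no improvement ( If you are going to train, you can set padding_value to
OK, I will train it on paddle=0. Could you look into the effect of the labels on your side? As I see it, the labels should not be the cause: labelID is the full 33-class data, but the PID source code is a 19-class model. Please check how the model and the data classes get matched if there is no conversion; if they can be matched, you could also train with labelID to explore its effect. https://www.cnblogs.com/dotman/p/cityscapes_dataset_tips.html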
The 33-class labels need to be converted, following the PIDNet implementation:
import numpy as np

def convert_label(label: np.ndarray, ignore_label: int = 255) -> np.ndarray:
    # Cityscapes labelId -> 19-class trainId; unused classes map to ignore_label
    label_mapping = {-1: ignore_label, 0: ignore_label,
                     1: ignore_label, 2: ignore_label,
                     3: ignore_label, 4: ignore_label,
                     5: ignore_label, 6: ignore_label,
                     7: 0, 8: 1, 9: ignore_label,
                     10: ignore_label, 11: 2, 12: 3,
                     13: 4, 14: ignore_label, 15: ignore_label,
                     16: ignore_label, 17: 5, 18: ignore_label,
                     19: 6, 20: 7, 21: 8, 22: 9, 23: 10, 24: 11,
                     25: 12, 26: 13, 27: 14, 28: 15,
                     29: ignore_label, 30: ignore_label,
                     31: 16, 32: 17, 33: 18}
    temp = label.copy()
    # k is the original labelId, v is the target trainId
    for k, v in label_mapping.items():
        temp[label == k] = v
    return temp
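The same conversion can also be sketched with a lookup table, which avoids looping over the image once per class. This is only an illustration, assuming the label map holds integer labelIds in the range 0..255 (for example a uint8 PNG); `VALID_IDS`, `build_lut`, and `convert_label_lut` are hypothetical names, and the -1 entry of the mapping above is not representable in this form:

```python
import numpy as np

IGNORE_LABEL = 255

# labelId -> trainId pairs taken from the mapping above; every other id is ignored
VALID_IDS = {7: 0, 8: 1, 11: 2, 12: 3, 13: 4, 17: 5, 19: 6, 20: 7, 21: 8,
             22: 9, 23: 10, 24: 11, 25: 12, 26: 13, 27: 14, 28: 15,
             31: 16, 32: 17, 33: 18}


def build_lut(ignore_label: int = IGNORE_LABEL) -> np.ndarray:
    """Build a 256-entry table that maps each labelId to its trainId."""
    lut = np.full(256, ignore_label, dtype=np.uint8)
    for label_id, train_id in VALID_IDS.items():
        lut[label_id] = train_id
    return lut


def convert_label_lut(label: np.ndarray) -> np.ndarray:
    """Convert a labelId map to a trainId map with a single indexing operation."""
    return build_lut()[label]


converted = convert_label_lut(np.array([[7, 0], [33, 26]], dtype=np.uint8))
```

In this example road (7) becomes 0, the unlabeled id 0 becomes 255, and ids 33 and 26 become 18 and 13.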
I am now using the converted
The accuracy basically meets expectations. It can be merged once the following changes are done:
- Add the model weights and logs for the small model (
vdl: https://paddlepaddle.org.cn/paddle/visualdl/service/app?id=57dda9c34cd06a4b2996118df03583c9
model: https://paddleseg.bj.bcebos.com/dygraph/pidnet/pidnet_small_cityscapes_1024x1024_120k/model.pdparams
log: https://paddleseg.bj.bcebos.com/dygraph/pidnet/pidnet_small_cityscapes_1024x1024_120k/pidnet_small.log
); if feasible, add those for medium and large as well.
- Also add tipc: https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.8/test_tipc/docs/test_train_inference_python.md#22-%E5%8A%9F%E8%83%BD%E6%B5%8B%E8%AF%95 The lite_train_lite_infer mode is enough; it is used for the CI tests.
There is a problem with tipc. Under paddle 2.5.1, export.py raises an error:
Traceback (most recent call last):
File "/home/flytocc/workspaces/PaddleSeg/tools/export.py", line 121, in <module>
main(args)
File "/home/flytocc/workspaces/PaddleSeg/tools/export.py", line 89, in main
paddle.jit.save(model, os.path.join(args.save_dir, save_name))
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
return wrapped_func(*args, **kwargs)
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/api.py", line 752, in wrapper
func(layer, path, input_spec, **configs)
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
return wrapped_func(*args, **kwargs)
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/fluid/dygraph/base.py", line 75, in __impl__
return func(*args, **kwargs)
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/api.py", line 1043, in save
static_func.concrete_program_specify_input_spec(
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/program_translator.py", line 709, in concrete_program_specify_input_spec
concrete_program, _ = self.get_concrete_program(
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/program_translator.py", line 589, in get_concrete_program
concrete_program, partial_program_layer = self._program_cache[
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/program_translator.py", line 1249, in __getitem__
self._caches[item_id] = self._build_once(item)
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/program_translator.py", line 1193, in _build_once
concrete_program = ConcreteProgram.from_func_spec(
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
return wrapped_func(*args, **kwargs)
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/fluid/dygraph/base.py", line 75, in __impl__
return func(*args, **kwargs)
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/program_translator.py", line 1070, in from_func_spec
error_data.raise_new_exception()
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/error.py", line 452, in raise_new_exception
raise new_exception from None
ValueError: In transformed code:
File "/home/flytocc/workspaces/PaddleSeg/paddleseg/deploy/export.py", line 26, in forward
outs = self.model(x)
File "/home/flytocc/workspaces/PaddleSeg/paddleseg/models/pidnet.py", line 69, in forward
feat = self.backbone(x)
File "/home/flytocc/workspaces/PaddleSeg/paddleseg/models/backbones/pidnet.py", line 681, in forward
# stage 0-2
x = self.stem(x)
~~~~~~~~~~~~~~~~ <--- HERE
# stage 3
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/convert_call_func.py", line 222, in convert_call
if is_builtin(func) or is_unsupported(func):
File "/home/flytocc/anaconda3/envs/pd25/lib/python3.10/site-packages/paddle/jit/dy2static/convert_call_func.py", line 122, in is_unsupported
func_in_dict = func == v
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (6,) + inhomogeneous part.
Under paddle 2.5.2, export.py does not raise an error, but infer.py does:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
File "/usr/lib/python3.8/multiprocessing/synchronize.py", line 110, in __setstate__
self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
Traceback (most recent call last):
File "deploy/python/infer.py", line 396, in <module>
main(args)
File "deploy/python/infer.py", line 384, in main
predictor.run(imgs_list)
File "deploy/python/infer.py", line 219, in run
self.predictor.run()
ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 32, 128, 256] and the shape of Y = [1, 32, 128, 128]. Received [256] in X is not equal to [128] in Y at i:3.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ../paddle/phi/kernels/funcs/common_shape.h:86)
[operator < elementwise_mul > error]
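For reference, the check that fails in the hint above can be sketched in plain Python. `can_broadcast` is a hypothetical helper, not a Paddle API; it only mirrors the rule quoted in the error message (two dimensions must be equal, or one of them must be at most 1):

```python
def can_broadcast(x_shape, y_shape):
    """Return True if two shapes are compatible for an elementwise op."""
    # Compare from the trailing dimension; missing leading dims are treated as 1.
    for x_dim, y_dim in zip(reversed(x_shape), reversed(y_shape)):
        if x_dim != y_dim and x_dim > 1 and y_dim > 1:
            return False
    return True


# Shapes reported by infer.py: the last dimension is 256 vs 128
mismatch = can_broadcast([1, 32, 128, 256], [1, 32, 128, 128])
# A size-1 trailing dimension would have been accepted
compatible = can_broadcast([1, 32, 128, 256], [1, 32, 128, 1])
```

The shapes from the traceback differ only in the width dimension, which suggests one branch ended up with a different spatial size in the exported graph.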
My training has finished, mIoU
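Could you retry the export and inference on paddle develop? It looks like a shape mismatch, probably a problem in the dynamic-to-static conversion.
Also, the README needs to be completed.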
configs/pidnet/README.md
|PIDNet|PIDNet_Medium|1024x1024|120000|80.22%|82.05%|[model](https://paddleseg.bj.bcebos.com/dygraph/pidnet/pidnet_medium_2xb6-120k_1024x1024-cityscapes.pdparams)|
|PIDNet|PIDNet-Large |1024x1024|120000|80.89%|82.37%|[model](https://paddleseg.bj.bcebos.com/dygraph/pidnet/pidnet_large_2xb6-120k_1024x1024-cityscapes.pdparams)|
Both of these are converted models rather than trained ones; it would be best to mark the difference.
Neither the m nor the l model has been trained. Shall I leave them blank for now?
Yes, you can train them and add them afterwards.
Are both of these on 2.5.1?
My mistake: export.py fails under 2.5.1 and infer.py fails under 2.5.2. I will try develop later.
LGTM
While training with paddleseg I chose pp_liteseg_stdc1_camvid_960x720_10k. Training raised an error:
@pobi123 This is the PIDNet PR. For problems with pp_liteseg, please open an issue directly.
500/500 [==============================] - 54s 107ms/step - batch_cost: 0.1071 - reader cost: 0.0861
How can I resolve the error above?
PR types
New features
PR changes
Task PaddlePaddle/Paddle#57262
RPC PaddlePaddle/community#722
Description
- PIDNet network
- AddEdgeLabel transform
- a bug in CrossEntropyLoss
- OhemCrossEntropyLoss: added support for weight
used AI Studio for training. Thanks a lot!
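As a rough illustration of what class weights mean in an OHEM-style loss, here is a NumPy sketch. It is not the PaddleSeg implementation: the function name `ohem_weighted_ce`, the parameters `thresh` and `min_kept`, and the normalization by the sum of the kept weights are all assumptions made for this example.

```python
import numpy as np


def ohem_weighted_ce(logits, labels, class_weight,
                     thresh=0.7, min_kept=1, ignore_index=255):
    """Hypothetical sketch: weighted cross entropy averaged over hard pixels only.

    logits: (N, C) scores, labels: (N,) class ids, class_weight: (C,) weights.
    """
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.int64)
    class_weight = np.asarray(class_weight, dtype=np.float64)

    # Drop ignored pixels before anything else
    valid = labels != ignore_index
    logits, labels = logits[valid], labels[valid]
    if labels.size == 0:
        return 0.0

    # Numerically stable log-softmax
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    true_log_prob = log_prob[np.arange(labels.size), labels]
    true_prob = np.exp(true_log_prob)

    # Hard pixels: true-class probability below thresh, but keep at least min_kept
    k = max(1, min(min_kept, labels.size))
    kth_prob = np.sort(true_prob)[k - 1]
    hard = true_prob <= max(thresh, kth_prob)

    # Per-pixel weight is looked up from the ground-truth class (assumed normalization)
    w = class_weight[labels]
    return float((-w * true_log_prob)[hard].sum() / w[hard].sum())


# The confident first pixel is skipped; only the uncertain second pixel contributes
loss = ohem_weighted_ce([[10.0, 0.0], [0.0, 0.0]], [0, 0], [1.0, 1.0])
```

With uniform weights the result reduces to plain OHEM cross entropy; non-uniform weights shift the average toward the heavier classes among the hard pixels.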