reuse ConvNormActivation in some vision models #40431

Merged
merged 8 commits into from
Apr 22, 2022

Conversation

SigureMo
Member

@SigureMo SigureMo commented Mar 10, 2022

PR types

Others

PR changes

Others

Describe

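Background: #38653 (comment)

Many modules in paddle.vision.models each implement their own ConvBNLayer structure, so this shared structure can be extracted into a single Layer. #38653 already extracted it as paddle.vision.ops.ConvNormActivation; this PR reuses that Layer in the remaining 5 models.

The networks to refactor are:

  • inceptionv3.ConvBNLayer (1 set of weights)
  • mobilenetv1.ConvBNLayer (1 set of weights)
  • mobilenetv2.ConvBNLayer (no weight update needed)
  • resnext.ConvBNLayer (6 sets of weights)
  • shufflenetv2.ConvBNLayer (7 sets of weights)

The ConvBNLayer in mobilenetv2 is implemented the same way as ConvNormActivation (nn.Sequential), so its weights need no update. The weights of all the other models must be updated.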
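
One plausible reason the weights need updating is that swapping a layer with named sublayers for an nn.Sequential-style layer changes the parameter keys in the saved state dict. The sketch below shows such a key remapping on a plain dictionary. The sublayer names `_conv` and `_batch_norm`, the index names `0` and `1`, and the helper `remap_keys` are assumptions for illustration; they are not taken from this PR.

```python
# Hypothetical sublayer names: the old ConvBNLayer is assumed to expose
# `_conv` and `_batch_norm`; a Sequential-style layer indexes them as 0 and 1.
OLD_TO_NEW = {"_conv": "0", "_batch_norm": "1"}


def remap_keys(state_dict, mapping=OLD_TO_NEW):
    """Return a copy of state_dict with old sublayer names replaced."""
    remapped = {}
    for key, value in state_dict.items():
        # Rename each dotted path component that appears in the mapping.
        parts = [mapping.get(part, part) for part in key.split(".")]
        remapped[".".join(parts)] = value
    return remapped


# Placeholder strings stand in for real parameter tensors.
old_weights = {
    "conv1._conv.weight": "w",
    "conv1._batch_norm.weight": "gamma",
    "conv1._batch_norm.bias": "beta",
    "fc.weight": "fc_w",
}
new_weights = remap_keys(old_weights)
print(sorted(new_weights))
```

A model whose old layer already used the same Sequential layout would pass through such a remapping unchanged, which matches the mobilenetv2 case described above.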

resnext will be changed in #40588; see the comments below for the reason.

After the update, all models were re-tested for performance and none regressed. Test details: https://aistudio.baidu.com/studio/project/partial/verify/3593768/f4038fdf8eb14cc698ca8dcccbcd363c

@paddle-bot-old

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@SigureMo SigureMo marked this pull request as ready for review March 14, 2022 03:26
@SigureMo SigureMo force-pushed the reuse-convnorm-layer branch 2 times, most recently from 235ce62 to ddac292 Compare March 15, 2022 09:00
@SigureMo
Member Author

@LielinJiang

I have replaced the ConvBNLayer in the other vision models with ConvNormActivation. Could you review it when you have time?

@SigureMo
Member Author

@LielinJiang

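Also, I found that ResNeXt can be implemented directly by reusing the ResNet network structure (both torchvision and keras do this). Only small changes to the ResNet structure are needed, which avoids implementing the network a second time in resnext.py.

I noticed this while working on WideResNet, and other developers raised it at the later Hackathon closed-door meeting, but the code had already been merged, so nothing was changed. Sorry that I did not think it through when I first implemented ResNeXt. Would it be possible to reimplement ResNeXt by directly reusing ResNet?

Such a change would also remove the ResNeXt Layer API and keep only factory-function APIs such as resnext_xxxx, and the API parameters would change slightly. Since this is largely unrelated to this PR, I think a separate PR is better. (Because it is very simple, I have already tried implementing it in #40588.)

If this is acceptable, I will revert the changes to resnext.py in this PR to avoid changing resnext.py and its weights twice. Otherwise I will just close that PR.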

@LielinJiang
Contributor

As long as the interface stays unchanged, a better implementation of resnext is encouraged.

@SigureMo
Member Author

Hmm, it probably cannot keep the interface completely unchanged. The current implementation in #40588 looks roughly like this.

Original API:

# resnet.py
ResNet(block, depth=50, width=64, num_classes=1000, with_pool=True)
resnet50(pretrained=False, **kwargs)
resnet101(pretrained=False, **kwargs)
resnet152(pretrained=False, **kwargs)
wide_resnet50_2(pretrained=False, **kwargs)
wide_resnet101_2(pretrained=False, **kwargs)

# resnext.py
ResNeXt(depth=50, cardinality=32, num_classes=1000, with_pool=True)
resnext50_32x4d(pretrained=False, **kwargs)
resnext101_32x4d(pretrained=False, **kwargs)
resnext152_32x4d(pretrained=False, **kwargs)
resnext50_64x4d(pretrained=False, **kwargs)
resnext101_64x4d(pretrained=False, **kwargs)
resnext152_64x4d(pretrained=False, **kwargs)

Modified API:

# resnet.py
ResNet(block, depth=50, groups=1, width_per_group=64, num_classes=1000, with_pool=True)
resnet50(pretrained=False, **kwargs)
resnet101(pretrained=False, **kwargs)
resnet152(pretrained=False, **kwargs)
wide_resnet50_2(pretrained=False, **kwargs)
wide_resnet101_2(pretrained=False, **kwargs)
resnext50_32x4d(pretrained=False, **kwargs)
resnext101_32x4d(pretrained=False, **kwargs)
resnext152_32x4d(pretrained=False, **kwargs)
resnext50_64x4d(pretrained=False, **kwargs)
resnext101_64x4d(pretrained=False, **kwargs)
resnext152_64x4d(pretrained=False, **kwargs)

Overall diff:

- ResNeXt(depth=50, cardinality=32, num_classes=1000, with_pool=True)
- ResNet(block, depth=50, width=64, num_classes=1000, with_pool=True)
+ ResNet(block, depth=50, groups=1, width_per_group=64, num_classes=1000, with_pool=True)

@LielinJiang Is this acceptable?
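
To illustrate how a single ResNet signature with groups and width_per_group could cover ResNet, WideResNet, and ResNeXt, here is a small sketch of the bottleneck width rule used by torchvision-style implementations. The helper name `bottleneck_width` is hypothetical. That #40588 uses this exact formula, and the groups and width_per_group values assigned to each factory function, are assumptions based on the usual naming convention rather than facts stated in this thread.

```python
def bottleneck_width(planes, groups=1, width_per_group=64):
    """Channel count of the grouped 3x3 conv inside a bottleneck block.

    Follows the torchvision-style rule; assumed here, not confirmed
    by the PR discussion.
    """
    return int(planes * (width_per_group / 64.0)) * groups


# Assumed settings per factory function, following the NxMd naming
# convention (N groups, M channels per group).
configs = {
    "resnet50": dict(groups=1, width_per_group=64),
    "wide_resnet50_2": dict(groups=1, width_per_group=128),
    "resnext50_32x4d": dict(groups=32, width_per_group=4),
    "resnext50_64x4d": dict(groups=64, width_per_group=4),
}

# Width of the first stage (planes=64) for each variant.
widths = {name: bottleneck_width(64, **cfg) for name, cfg in configs.items()}
for name, width in widths.items():
    print(name, width)
```

With the defaults groups=1 and width_per_group=64 the rule reduces to the plain ResNet width, which is why the extra parameters need not affect existing resnet callers.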

@LielinJiang
Contributor

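It looks like they can be unified. Keep ResNeXt(depth=50, cardinality=32, num_classes=1000, with_pool=True) for now, add a default parameter group to ResNet, leave the width parameter unchanged, and explain its exact meaning in the documentation. Would that work?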

@SigureMo
Member Author

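It looks like they can be unified. Keep ResNeXt(depth=50, cardinality=32, num_classes=1000, with_pool=True) for now, add a default parameter group to ResNet, leave the width parameter unchanged, and explain its exact meaning in the documentation. Would that work?

Sure, I can try that. I had assumed that, because ResNeXt has not shipped in the latest release (2.2.2), this API needed no compatibility considerations.

I have already reverted the resnext-related changes in this PR. Could you review this PR when you have time?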

@LielinJiang
Contributor

OK. Since resnext has not been released, compatibility need not be considered, so feel free to change it.

@SigureMo
Member Author

OK. Since resnext has not been released, compatibility need not be considered, so feel free to change it.

Got it, understood!

@SigureMo
Member Author

@LielinJiang Could you review this PR? It has no API changes and only reuses ConvNormActivation, though the weights need to be updated.

LielinJiang
LielinJiang previously approved these changes Mar 23, 2022
@paddle-bot-old

paddle-bot-old bot commented Apr 6, 2022

Sorry to inform you that 18b24a5's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.

@SigureMo
Member Author

@LielinJiang Sorry for not replying earlier; I was tied up for a while. Could you please upload the weights?

Model name Weights
inception_v3 https://bj.bcebos.com/v1/ai-studio-online/37bdfacc03a3478d807287da1433c27c3b4cb5094aca4ff78e1738fa5ddd45c0
mobilenetv1_1.0 https://bj.bcebos.com/v1/ai-studio-online/e4780dbc69e44e88af956736840ecf31a74a91dcbff64ede8e311fffcd99b64f
shufflenet_v2_x0_25 https://bj.bcebos.com/v1/ai-studio-online/648cc351973b4233a016d158621a8e600568c6de97a14b469b2902676d939e6c
shufflenet_v2_x0_33 https://bj.bcebos.com/v1/ai-studio-online/df83b6202b784a72b8486936333da529fca22335189e4b9e88b3e0bc9b7597b8
shufflenet_v2_x0_5 https://bj.bcebos.com/v1/ai-studio-online/ac3e4a06a3714939bed412307bafcc19193beff575dc48548b47b34d9ccbb3d8
shufflenet_v2_x1_0 https://bj.bcebos.com/v1/ai-studio-online/a20dfea65c614e8baeb6530dc1c84c3004cf2a09a96348938d0a14da9412bc05
shufflenet_v2_x1_5 https://bj.bcebos.com/v1/ai-studio-online/d9f3eaf999d345c7bba7683bc6cc69d61881c3c37e45479bbf4dcf9ab132025d
shufflenet_v2_x2_0 https://bj.bcebos.com/v1/ai-studio-online/653d1228c8ca4274984d341111d0254fecdbf0954cfe4f85bc4bddd9545912e4
shufflenet_v2_swish https://bj.bcebos.com/v1/ai-studio-online/cb7a029e746a423c9219d034c4dd834438945503157b4b41aa26f7cda62e1fc9

PR #40588 also has some weights that need to be uploaded:

Model name Weights
resnext50_32x4d https://bj.bcebos.com/v1/ai-studio-online/d416003b0c2442ba84b7f1d979f758eeb9e806c3bb30409598e74caa2d091fd2
resnext50_64x4d https://bj.bcebos.com/v1/ai-studio-online/ab91153a5f424e2da776b4567db8aba38bb2a962e4ce4476a08a7759fc40f4bd
resnext101_32x4d https://bj.bcebos.com/v1/ai-studio-online/3a05705f6b5a4cc88c372312eedf6f91510440dc1d0740d0ac08235dae17a187
resnext101_64x4d https://bj.bcebos.com/v1/ai-studio-online/4a68929db53c48039df7888b65bf5f161a4683676cce40aa92c0b7777422c16f
resnext152_32x4d https://bj.bcebos.com/v1/ai-studio-online/708c5de2dcc14180b8792bd2d756c357929c695f97fc4837baf925ca147f1289
resnext152_64x4d https://bj.bcebos.com/v1/ai-studio-online/a393f016a6fa425b9b666c01fb64c54015b49d683cf84fc4af16b37786bd4d54

@SigureMo
Member Author

@LielinJiang Do you have time to upload the weights? 😂

Contributor

@XiaoguangHu01 XiaoguangHu01 left a comment

LGTM

@LielinJiang LielinJiang merged commit f6219dd into PaddlePaddle:develop Apr 22, 2022
@SigureMo SigureMo deleted the reuse-convnorm-layer branch April 23, 2022 07:23
SigureMo added a commit to cattidea/Paddle that referenced this pull request Apr 25, 2022
* reuse ConvNormActivation in some vision models
XiaoguangHu01 pushed a commit that referenced this pull request Apr 26, 2022
* reuse ConvNormActivation in some vision models (#40431)

* reuse ConvNormActivation in some vision models

* reimplement ResNeXt based on ResNet (#40588)

* refactor resnext