
ImageNet performance #1

Open
ppwwyyxx opened this issue May 10, 2018 · 6 comments
@ppwwyyxx

This repo uses fb.resnet.torch for its ImageNet experiments. However, the top-1 error of ResNet-50 reported by fb.resnet.torch is 24.01 (https://github.com/facebook/fb.resnet.torch), while the ResNet-50 baseline reported in the DBN paper is 24.87. Why is that?

@huangleiBuaa
Collaborator

@ppwwyyxx, I guess the difference probably comes from different cuDNN versions.
I ran the ResNet-18/ResNet-34 experiments on a machine with cudnn-5.0, while I ran the ResNet-50/ResNet-101 experiments on another machine (I don't remember its cuDNN version, and I currently have no access to that machine). ResNet-18 and ResNet-34 give results similar to the experiments reported at https://github.com/facebook/fb.resnet.torch.
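
(For reference, the cuDNN build that the Torch bindings actually loaded can be printed directly; a minimal check, assuming the cudnn.torch bindings are installed:)

-- run inside th: prints the cuDNN version the bindings loaded
require 'cudnn'
print(cudnn.version)   -- major*1000 + minor*100 + patch, e.g. 5005 for cuDNN 5.0.5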

@ppwwyyxx
Author

ppwwyyxx commented May 12, 2018

Perhaps.

Another question: in this repo there are resnet_BN vs resnet_DBN_scale_L1 and preresnet_BN vs preresnet_DBN_scale_L1. Which pair was used for the experiments in the paper? I didn't see the paper mention the use of preresnet, but the README here mentions it.

@huangleiBuaa
Collaborator

@ppwwyyxx
It's resnet_BN vs resnet_DBN_scale_L1, as described in the paper. Thanks for pointing that out; I will revise the README.

Actually, we also ran preresnet-18 and preresnet-34 (with the same configuration as the res-18 and res-34 experiments described in this repo). The respective results are: 30.44 vs 29.97 (preresnet-18 vs preresnet-DBN-scale-L1-18) and 26.76 vs 26.44 (preresnet-34 vs preresnet-DBN-scale-L1-34). We also ran an extra 10 epochs with the learning rate divided by 10 (100 epochs in total, which is the common setup in recent papers), and got: 29.79 vs 29.31 (preresnet-18 vs preresnet-DBN-scale-L1-18) and 26.01 vs 25.76 (preresnet-34 vs preresnet-DBN-scale-L1-34).

@ppwwyyxx
Author

Thanks. The diff shows that resnet_DBN_scale_L1 has one more convolution layer, which means the comparison between it and resnet_BN is not fair:

$ diff resnet_BN.lua resnet_DBN_scale_L1.lua
13a14
> require 'cudbn'
121a123,126
>       model:add(Convolution(64,64,3,3,1,1,1,1))
>       model:add(nn.Spatial_DBN_opt(64,opt.m_perGroup, opt.eps,_,true))
>       model:add(ReLU(true))
>  
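
(One way to confirm this structural difference is to load both model definitions and count their convolution modules. A rough sketch, not from the repo: it assumes fb.resnet.torch's convention that each file under models/ returns a createModel(opt) function, that Convolution aliases cudnn.SpatialConvolution, and that the opt fields below are the ones the constructors read.)

-- Sketch under the assumptions above; run from the repo root inside th.
require 'nn'
require 'cunn'
require 'cudnn'

local opt = { depth = 50, dataset = 'imagenet',   -- standard fb.resnet.torch fields
              m_perGroup = 16, eps = 1e-5 }       -- DBN fields guessed from the diff

local function countConvs(name)
   local createModel = require('models/' .. name) -- each model file returns a constructor
   local model = createModel(opt)
   local convs = model:findModules('cudnn.SpatialConvolution')
   print(string.format('%s: %d convolution layers', name, #convs))
end

countConvs('resnet_BN')
countConvs('resnet_DBN_scale_L1')                 -- expected: one more than resnet_BN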

@huangleiBuaa
Collaborator

There is no big difference, I guess. You can check the models in preresnet.lua and preresnet-DBN-scale.lua (they have the same number of convolutions). I also ran experiments with the original preresnet (without this extra conv, for 18 and 34 layers); the original preresnet has slightly better performance (30.38 vs 30.44 for 18 layers, 26.66 vs 26.76 for 34 layers). So I guess this extra convolution is not a big deal for the original residual network. If you are interested, you can validate it. I can also run the experiments; however, I only have one machine with 8 GPUs available (shared with other lab members), so it may take a long time to get the results.

@JaeDukSeo

Thanks for this.
