I tried to convert the pretrained DenseNet-121 model provided in https://github.com/shicai/DenseNet-Caffe to the efficient version, following your DenseBlock naming convention. I have a prototxt (efficient_densenet121.prototxt) and the following script, which copies the parameters (from the standard DenseNet_121.prototxt and the corresponding caffemodel) up to the end of the first transition layer:
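For context, the DenseBlock stanza in efficient_densenet121.prototxt looks roughly like this (a minimal sketch; the layer/blob names and the growthRate field are illustrative — only numTransition and use_BC are actually read by the script below):

```
layer {
  name: "DenseBlock1"
  type: "DenseBlock"
  bottom: "pool1"
  top: "concat_2_6"
  denseblock_param {
    numTransition: 6   # 6 subblocks in DenseNet-121's first block
    growthRate: 32
    use_BC: true       # bottleneck (1x1 conv) variant
  }
}
```

And the copy script itself: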
```python
import sys
sys.path.insert(0, './efficient-caffe/python')

import caffe
import numpy as np
from google.protobuf import text_format


## @brief Map a (denseblock, subblock, param) index in the efficient net to a
##        layer name and blob index in the standard source net.
# @param i ith denseblock; 1st denseblock prefix: conv2_, 2nd denseblock prefix: conv3_
# @param j jth subblock, prefix: convi_j/
# @param k which param in the efficient DenseBlock:
#   0: 3x3 filter; 1/2: scale/bias of the scale layer before the 1x1 conv;
#   3/4: global mean/var of the bn layer before the 1x1 conv;
#   5: 1x1 filter; 6/7: scale/bias of the scale layer before the 3x3 conv;
#   8/9: global mean/var of the bn layer before the 3x3 conv;
#   10: moving average factor of the bn layers
# @return (layer_name, blob_index) in the standard source net
def find_src_layer(i, j, k):
    layer_name = 'conv{}_{}/'.format(i, j)
    idx = 0
    assert k in np.arange(11)
    if k == 0:
        layer_name += 'x2'
    elif k == 1:
        layer_name += 'x1/scale'
    elif k == 2:
        layer_name += 'x1/scale'
        idx = 1
    elif k == 3:
        layer_name += 'x1/bn'
    elif k == 4:
        layer_name += 'x1/bn'
        idx = 1
    elif k == 5:
        layer_name += 'x1'
    elif k == 6:
        layer_name += 'x2/scale'
    elif k == 7:
        layer_name += 'x2/scale'
        idx = 1
    elif k == 8:
        layer_name += 'x2/bn'
    elif k == 9:
        layer_name += 'x2/bn'
        idx = 1
    else:
        layer_name += 'x1/bn'
        idx = 2
    return layer_name, idx


caffe.set_mode_gpu()
net_dst = caffe.Net('./efficient_densenet121.prototxt', './DenseNet_121.caffemodel', caffe.TEST)
net_src = caffe.Net('./DenseNet_121.prototxt', './DenseNet_121.caffemodel', caffe.TEST)
net_dst_proto = caffe.proto.caffe_pb2.NetParameter()
with open('./efficient_densenet121.prototxt', 'rb') as fd:
    text_format.Merge(fd.read(), net_dst_proto)

# copy denseblock params
j = 1
for i, layer in enumerate(net_dst.layers):
    if layer.type == 'DenseBlock':
        # params are saved in layer.blobs
        layer_proto = net_dst_proto.layer[i - 1]
        repeat = layer_proto.denseblock_param.numTransition
        j += 1
        if layer_proto.denseblock_param.use_BC:
            for k, param in enumerate(layer.blobs):
                nth_param = k // repeat      # which of the 11 param kinds
                nth_repeat = k % repeat + 1  # which subblock
                src_layer, nth = find_src_layer(j, nth_repeat, nth_param)
                # print src_layer, nth, nth_repeat, nth_param
                assert param.data.shape == net_src.params[src_layer][nth].data.shape or \
                    param.data.size == net_src.params[src_layer][nth].data.size
                param.data[:] = net_src.params[src_layer][nth].data.copy().reshape(param.data.shape)
        else:
            pass

# sanity check
inp = np.ones((1, 3, 224, 224))
o_dst = net_dst.forward(data=inp)
o_src = net_src.forward(data=inp)
top_blob = net_dst.top_names[net_dst._layer_names[i]][0]  # concat_6_2 layer for first denseblock
print top_blob, net_dst.blobs[top_blob].data.mean(), net_src.blobs[top_blob].data.mean(), \
    net_dst.blobs[top_blob].data.std(), net_src.blobs[top_blob].data.std()
```
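(A quick follow-up check, sketched with the variables above — comparing the shared output blob element-wise instead of only by mean/std:)

```python
# Element-wise comparison of the shared DenseBlock output blob; the max
# absolute difference localizes a mismatch better than aggregate statistics.
d = net_dst.blobs[top_blob].data
s = net_src.blobs[top_blob].data
print 'max abs diff:', np.abs(d - s).max()
print 'mean abs diff:', np.abs(d - s).mean()
```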
However, when I run the above script, the mean and std of the 'concat_2_6' blob do not match between the two nets. The difference is not large, but it clearly indicates a problem in the parameter copying. In particular, I found that the output of the efficient version seems insensitive to the value of the final parameter of the DenseBlock layer. According to https://github.com/Tongcheng/caffe/blob/master/src/caffe/layers/DenseBlock_layer.cpp#L159, that parameter corresponds to the batch norm layers' moving average factor. Is it because all batch norm layers in the standard DenseNet share the same value for this last parameter, so the efficient implementation keeps only a single copy of it? Beyond that, I still don't understand why its value has no effect on the output, and what the correct mapping of parameters from the standard model to the efficient model should be.
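For reference on that last parameter: a standard Caffe BatchNorm layer stores three blobs, and at TEST time the accumulated statistics are rescaled by the third one (a minimal sketch using one of the source layers from the mapping above):

```python
# Standard Caffe BatchNorm blobs: blobs[0] = accumulated mean,
# blobs[1] = accumulated variance, blobs[2] = one-element moving-average
# scale factor. At TEST time the layer divides the stored statistics by it.
bn = net_src.params['conv2_1/x1/bn']
scale = bn[2].data[0]
factor = 0.0 if scale == 0 else 1.0 / scale
effective_mean = bn[0].data * factor
effective_var = bn[1].data * factor
```

If the efficient DenseBlock expects already-rescaled statistics, copying blobs[0]/blobs[1] verbatim without this division might explain the small systematic difference — but I'm not sure.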