Hi there,

Thanks for the great work. I asked this question over at DeepLabCut, but I figured I should ask it again here.
I'm not used to TensorFlow, so this could be due to my inability to read TensorFlow code, but I'm confused about the prediction layers in the model. In both the DeeperCut paper and the DeepLabCut paper, the authors describe using a ResNet base followed by 2x upsampling with deconvolution layers. Then, the authors "connect the final output to the output of the conv3 bank."
In the code, features are extracted with the `net_funcs` imported from `tf.slim`: `resnet_v1.resnet_v1_50` and `resnet_v1.resnet_v1_101`. Because of the atrous convolutions and the lack of global average pooling, I think the features should have shape `(N, H/16, W/16, 2048)`.
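To spell out the shape arithmetic I have in mind, here is a small plain-Python sketch (no TensorFlow). The helper name `resnet_feature_shape` is my own, and I am assuming an output stride of 16, NHWC layout, and that sizes not divisible by 16 round up:

```python
def resnet_feature_shape(n, h, w, output_stride=16, channels=2048):
    """Expected (N, H', W', C) of the backbone output for an NHWC input.

    Assumption: spatial sizes that are not multiples of the output
    stride are rounded up (ceiling division).
    """
    out_h = -(-h // output_stride)  # ceiling division
    out_w = -(-w // output_stride)
    return (n, out_h, out_w, channels)


# A 320x480 image should give a 20x30 feature map with 2048 channels.
print(resnet_feature_shape(1, 320, 480))  # (1, 20, 30, 2048)
```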
They are then, I believe, passed to the following prediction layer:
This just means that the output of the ResNet is passed into a deconvolution layer, without any connection to `conv3`. Did I miss it somewhere?
Both papers use the phrasing "connected to", so I'm not sure whether it's supposed to be concatenation followed by a conv2d, or addition, and whether the connection is made to the upsampled features or to the original features. I expected to see (in pseudocode) the prediction layer be something like this:
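For illustration, the two readings of "connected to" can be compared with a shape-only sketch in plain Python. This is not the authors' code and not TensorFlow. All helper names are made up, `NUM_JOINTS = 14` is a hypothetical value, and I am assuming the conv3 bank output sits at stride 8 with 512 channels, as in the ResNet paper's architecture table:

```python
# Shape-only bookkeeping in NHWC layout; every helper here is hypothetical.

def deconv_shape(shape, num_outputs, stride=2):
    """Shape after a transposed conv with 'SAME' padding and given stride."""
    n, h, w, _ = shape
    return (n, h * stride, w * stride, num_outputs)


def conv1x1_shape(shape, num_outputs):
    """A 1x1 conv only changes the channel count."""
    return shape[:3] + (num_outputs,)


def add_shapes(a, b):
    """Elementwise addition requires identical shapes."""
    if a != b:
        raise ValueError(f"cannot add {a} and {b}")
    return a


def concat_shapes(a, b):
    """Channel concatenation requires matching N, H, W; channels add up."""
    if a[:3] != b[:3]:
        raise ValueError(f"cannot concatenate {a} and {b}")
    return a[:3] + (a[3] + b[3],)


NUM_JOINTS = 14                # hypothetical number of body parts
features = (1, 20, 30, 2048)   # backbone output at stride 16 (320x480 input)
conv3 = (1, 40, 60, 512)       # assumed conv3 bank output at stride 8

# 2x upsampling of the backbone output, as described in the papers.
upsampled = deconv_shape(features, NUM_JOINTS)

# Reading 1: project conv3 to NUM_JOINTS channels with a 1x1 conv, then add.
added = add_shapes(upsampled, conv1x1_shape(conv3, NUM_JOINTS))

# Reading 2: concatenate along channels, then reduce with a 1x1 conv.
concatenated = conv1x1_shape(concat_shapes(upsampled, conv3), NUM_JOINTS)

print(upsampled, added, concatenated)
```

Under these assumptions both readings line up spatially only with the upsampled features. Connecting conv3 directly to the original stride-16 features would need an extra downsampling step, since `add_shapes(features, conv3)` raises a `ValueError`.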