Fusing layer #11

Closed
hungnguyen0606 opened this issue Mar 9, 2017 · 5 comments

@hungnguyen0606

According to the paper, we should add a 1x1 convolutional layer on top of pool4 to get a score for each class, and fuse that score with the final layer of FC32. Finally, we use a deconv layer to get the target image.
However, in your implementation, you use a deconv layer to bring the final layer of FC32 to the same shape as the pool4 layer, and then fuse it directly with pool4.
Does the order of these operations matter?

@shekkizh
Owner

The order doesn't matter. All that is required is that you upsample (deconv) fc32 to the same size as pool4 and add them together.
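
To make the two orderings concrete, here is a minimal NumPy sketch of the shapes involved. It is an illustration under assumptions, not this repository's code. The tensor sizes, the class count, and the helpers `conv1x1` and `upsample2x` are hypothetical. Nearest-neighbour upsampling followed by a channel projection stands in for a learned deconv layer. It also assumes that "same shape as pool4" includes the channel count.

```python
import numpy as np

def conv1x1(x, w):
    """Per-pixel channel projection: x is (H, W, C_in), w is (C_in, C_out)."""
    return x @ w

def upsample2x(x):
    """Nearest-neighbour 2x upsampling, a fixed stand-in for a learned deconv."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

rng = np.random.default_rng(0)
num_classes = 21                                   # hypothetical class count
pool4 = rng.standard_normal((14, 14, 512))         # hypothetical stride-16 features
final = rng.standard_normal((7, 7, num_classes))   # hypothetical stride-32 class scores

# Ordering described for the paper: score pool4 with a 1x1 conv,
# then fuse with the upsampled final layer in class space.
w_score = rng.standard_normal((512, num_classes)) * 0.01
fused_paper = conv1x1(pool4, w_score) + upsample2x(final)

# Ordering described for this implementation: bring the final layer to
# pool4's full shape (spatial size and channels), then add pool4 directly.
w_expand = rng.standard_normal((num_classes, 512)) * 0.01
fused_repo = pool4 + conv1x1(upsample2x(final), w_expand)

print(fused_paper.shape)
print(fused_repo.shape)
```

In both cases the two tensors must have identical shapes before the element-wise add. The difference is whether the fusion happens in class-score space or in pool4's feature space, with the projection to class scores deferred to a later layer.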

@hungnguyen0606
Author

hungnguyen0606 commented Mar 14, 2017

Thank you for your answer.

By the way, could you share which GPU you used and how long it took to train the network from scratch?

As I read in the paper, the authors used up to 7 convolution layers of VGG net, but I observed that you only use 4 layers. Could you share how your network's performance compares with the authors'?

@hungnguyen0606
Author

Furthermore, if I want to use this network on images of a different size, should I retrain the network from scratch, or do I only need to resize the label map to the original size?

@shekkizh
Owner

The model was trained on a 12 GB Titan X card. I'm not sure about the exact timing, but I believe I trained overnight, so roughly 6-7 hours.
I'm not sure which conv layers you are referring to here. The VGG net model has several conv layers.

As for input image size, yes, it should be possible to run inference with a different image size at test time. Just make sure that you do not set any fixed size in placeholders or anywhere else during training.
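
As a rough illustration of why this works, here is a toy NumPy sketch of a network built only from size-agnostic operations. The function `tiny_fcn`, its weights, and the input sizes are hypothetical and far simpler than the real model. In TensorFlow 1.x the equivalent precaution is typically to leave the height and width entries of the `tf.placeholder` shape as `None`.

```python
import numpy as np

def tiny_fcn(image, w):
    """Toy fully convolutional net: 2x2 max-pool, 1x1 conv, 2x nearest upsample.

    image: (H, W, 3) with even H and W; w: (3, num_classes).
    No step depends on a fixed H or W, so any even-sized input works.
    """
    h, wd, c = image.shape
    # 2x2 max-pool via reshape: (H/2, 2, W/2, 2, C) reduced over the window axes.
    pooled = image.reshape(h // 2, 2, wd // 2, 2, c).max(axis=(1, 3))
    scores = pooled @ w                                   # per-pixel class scores
    up = scores.repeat(2, axis=0).repeat(2, axis=1)       # back to input resolution
    return up.argmax(axis=-1)                             # per-pixel label map

rng = np.random.default_rng(0)
w = rng.standard_normal((3, 5))        # hypothetical weights, 5 classes
for size in [(224, 224), (320, 480)]:
    labels = tiny_fcn(rng.standard_normal(size + (3,)), w)
    print(size, "->", labels.shape)
```

The same weights are reused for both inputs, and the label map comes out at each input's own resolution, so no retraining or resizing of the output is needed in this toy setting.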

@hungnguyen0606
Author

Sorry, that was just my misunderstanding of the network's structure.
Thank you for the information.
