some questions about the model size #13
Thank you for your attention to our work! Yes, the trained models in this code are large because we don't perform actual binarization in the code: PyTorch doesn't support computation with binary weights so far, so the "binarized" weights are still stored as 32-bit floats. If you want to get a smaller model from a model trained with this repo, I recommend referring to a BNN deployment framework (e.g. daBNN). Also, I need to point out that we don't implement the Bi-Real re-training schedule in this repo. In the Bi-Real Net paper, the authors first train the model with real-valued weights, then re-train it with learning_rate=0 so that the BN layers absorb the magnitude of the weights. Thus you can't directly binarize the weights of models trained with this repo. You can refer to Bi-Real Net for the re-training implementation.
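To illustrate why the checkpoint stays large, here is a minimal NumPy sketch (with a hypothetical weight shape, not taken from the repo): `sign()` of a float tensor is still float32, so the saved model is the same size as a full-precision one, while a deployment framework that packs each {-1, +1} weight into a single bit would shrink it by about 32x.

```python
import numpy as np

# Hypothetical conv weight tensor; a real detector has many such layers.
w = np.random.randn(256, 256, 3, 3).astype(np.float32)

# "Binarization" during training keeps float storage: sign() yields float32
# values in {-1, 0, +1}, so the checkpoint is as large as a full-precision one.
w_bin = np.sign(w).astype(np.float32)

# A deployment framework (e.g. daBNN) would pack each {-1, +1} weight into 1 bit.
packed = np.packbits(w_bin > 0)

print(w_bin.nbytes / packed.nbytes)  # 32.0: packed storage is 32x smaller
```

This is only a storage illustration; actual BNN inference also replaces the float convolution with XNOR/popcount operations.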
Hi! Basically, BiDet means using XNOR-Net as the detector architecture (quantization method, network connections, etc.) plus the sparse object prior and the IB training loss, while BiDet (SC) means using Bi-Real Net as the detector architecture plus our proposed losses (SC stands for shortcut, which Bi-Real Net adopts). In this repo we only implement BiDet (SC), because it achieves better mAP than BiDet. So if you see something like "reg loss: xxx, prior loss: xxx", you are training BiDet (SC). I think the reason you get worse mAP than reported in the paper is the training schedule, e.g. learning rate decay and batch size; you can refer to this issue for more details about training binary detectors. One more thing, just to be clear: "XNOR-Net", "Bi-Real Net", "BiDet" and "BiDet (SC)" are all trained with binary values.
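The "SC" shortcut idea can be sketched in a few lines of PyTorch. This is a simplified illustration, not the repo's actual `BinarizeConv2d` or block implementation: the class names (`SignSTE`, `BiRealBlock`) are made up here, and weight binarization is omitted for brevity (only activations are binarized below). The key point is the real-valued identity connection added around each binary conv, which is what Bi-Real Net contributes over XNOR-Net.

```python
import torch
import torch.nn as nn


class SignSTE(torch.autograd.Function):
    """Binarize activations to {-1, +1}; straight-through gradient estimate."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        x, = ctx.saved_tensors
        # Pass gradients only where |x| <= 1 (clipped straight-through estimator).
        return grad_out * (x.abs() <= 1).float()


class BiRealBlock(nn.Module):
    """One Bi-Real-style unit: binary-input conv + BN, plus a real-valued shortcut."""

    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = self.bn(self.conv(SignSTE.apply(x)))
        return out + x  # the "SC" shortcut keeps real-valued information flowing
```

Without the `+ x` shortcut, every block's output would be computed purely from {-1, +1} activations, which is the heavily quantized information path the shortcut is designed to compensate for.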
Hi! Sorry to bother you again. As you said, these models are trained with binary values — in this repo, can I train BiDet (SC) with binary values? According to the code, this repo uses nn.Conv2d instead of your BinarizeConv2d in class BiDetResnet, while RPN_Conv does use BinarizeConv2d. This confuses me: is BiDet (SC) trained with binary values in this repo? If I want to obtain a binarized network, do I need to convert the trained BiDet (SC) model to a binarized one with daBNN?
Oh, I understand what you mean. It is common practice in BNNs to keep the first and last layers of the network full-precision (see XNOR-Net, Bi-Real Net, etc.); that's why I use nn.Conv2d here. Besides, Bi-Real Net also uses a full-precision conv in the downsample residual connection of ResNet, which is why we use nn.Conv2d there as well. But in the other layers of the network we use binary convs: for example, here you can see that we feed BinBasicBlock, which consists of two binary convs, into the function, and as you point out, we use binary convs in the RPN. One more thing: I'm not familiar with daBNN, but I think if you want to get a binarized network whose weights are -1 and +1 in most layers, you need to re-train the network by updating only the BN layers, so that they absorb the magnitude caused by binarization. See the Bi-Real Net paper, Section 4.1, the "Training" part, for more details.
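The "BN absorbs the magnitude" step rests on a simple fact that a short NumPy check makes concrete: batch normalization is invariant to a positive per-channel rescaling of its input, because the scale cancels in (x - mu) / sigma. So when sign(W) drops the magnitude |W| from the conv output, re-training with everything frozen except BN can recover the lost scale. This is an illustrative sketch with made-up numbers, not code from either repo.

```python
import numpy as np

def batchnorm(x, gamma, beta, eps=1e-5):
    # Per-channel normalization over the batch axis, using this batch's statistics.
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 16))   # hypothetical pre-BN activations
gamma, beta = 1.5, 0.2
alpha = 7.3                     # stands in for the |W| magnitude lost by sign(W)

# Scaling the BN input by a positive constant leaves the BN output (almost)
# unchanged — the scale cancels in the normalization, so BN "absorbs" it.
print(np.allclose(batchnorm(x, gamma, beta),
                  batchnorm(alpha * x, gamma, beta), atol=1e-4))  # prints True
```

The tiny residual difference comes only from the `eps` term; in practice the Bi-Real re-training lets the BN running statistics and affine parameters settle to the binarized weights' scale.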
Hi, could you tell me the size of your SSD or Faster R-CNN model? I found that my own trained Faster R-CNN model takes 142.14 MB of space! It is still too large.