
CrossEntropy Loss or MSELoss in cls_closs? #31

Open

xmfbit opened this issue Aug 17, 2017 · 2 comments

xmfbit commented Aug 17, 2017

Good work. But I am confused about how the classification loss is calculated.
It seems that you used MSELoss in your code. However, I find that in darknet, when the gradient is computed, the formula looks like cross-entropy: see https://github.com/pjreddie/darknet/blob/master/src/region_layer.c#L130. Besides, in the YOLO9000 paper the author seems to use an MSE loss, just as he did in YOLOv1.

So could you check this? Thank you.
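
For reference, a minimal sketch of the two formulations under discussion (not the repo's actual code; the tensor shapes and the `score_pred` / `gt_classes` names are only assumed here for illustration):

```python
import torch
import torch.nn.functional as F

num_boxes, num_classes = 4, 20
score_pred = torch.randn(num_boxes, num_classes, requires_grad=True)  # raw class scores (logits)
gt_classes = torch.randint(0, num_classes, (num_boxes,))              # ground-truth class indices
gt_one_hot = F.one_hot(gt_classes, num_classes).float()               # one-hot targets

# Option 1: MSE between softmax probabilities and one-hot targets
# (what the cls loss in this repo appears to do).
prob_pred = F.softmax(score_pred, dim=-1)
mse_cls_loss = F.mse_loss(prob_pred, gt_one_hot)

# Option 2: cross-entropy on the raw scores; its gradient w.r.t. score_pred is
# proportional to softmax(score_pred) - gt_one_hot, i.e. the (gt_class - prob)
# form in darknet's region_layer, up to sign convention.
ce_cls_loss = F.cross_entropy(score_pred, gt_classes)
```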


xmfbit commented Aug 17, 2017

OK... I see. You used a one-hot vector as gt_classes. But a new question: the gradient (gt_class - prob) should be applied directly to the output of the final conv layer (let's call it x), whereas in the code you apply softmax(x) first (prob_pred = F.softmax(score_pred.view(-1, score_pred.size()[-1])).view_as(score_pred)), so the autograd mechanism will backpropagate through the softmax operation. Is that right?
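
To make the concern concrete, here is a small check (illustrative only; the shapes and names are assumed): cross-entropy applied to the logits yields exactly softmax(x) - one_hot as the gradient at x, whereas MSE applied to softmax(x) makes autograd backpropagate through the softmax Jacobian and produces a different gradient:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 5, requires_grad=True)   # output of the final conv layer (logits)
target = torch.tensor([2])                  # ground-truth class index
one_hot = F.one_hot(target, 5).float()

# Cross-entropy on the logits: the gradient reaching x is softmax(x) - one_hot,
# i.e. the (prob - gt_class) form applied directly to x.
ce = F.cross_entropy(x, target)
grad_ce, = torch.autograd.grad(ce, x)
print(torch.allclose(grad_ce, F.softmax(x, dim=-1) - one_hot))   # True

# MSE on softmax(x): autograd additionally backprops through the softmax Jacobian,
# so the gradient reaching x is no longer softmax(x) - one_hot.
mse = F.mse_loss(F.softmax(x, dim=-1), one_hot, reduction='sum')
grad_mse, = torch.autograd.grad(mse, x)
print(torch.allclose(grad_mse, F.softmax(x, dim=-1) - one_hot))  # False
```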

@AndresPMD

Hello xmfbit,

Did you manage to understand the loss function? I am struggling with that as well.
