
CrossEntropy Loss or MSELoss in cls_closs? #31

Open

xmfbit opened this issue Aug 17, 2017 · 2 comments

xmfbit commented Aug 17, 2017

Good work. But I am confused about how the classification loss is calculated.
It seems that you used MSELoss in your code. However, I find that in darknet, when the gradient is computed, the formula looks like cross-entropy: see https://github.com/pjreddie/darknet/blob/master/src/region_layer.c#L130. Besides, in the YOLO9000 paper the author seems to use an MSE loss, just as he did in YOLOv1.

So could you check this? Thank you.
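
For reference, a minimal sketch of the two formulations under discussion (not the repo's actual code; the tensor shapes and the `score_pred` / `gt_classes` names are only assumed here for illustration):

```python
import torch
import torch.nn.functional as F

num_boxes, num_classes = 4, 20
score_pred = torch.randn(num_boxes, num_classes, requires_grad=True)  # raw class scores (logits)
gt_classes = torch.randint(0, num_classes, (num_boxes,))              # ground-truth class indices
gt_one_hot = F.one_hot(gt_classes, num_classes).float()               # one-hot targets

# Option 1: MSE between softmax probabilities and one-hot targets
# (what the cls loss in this repo appears to do).
prob_pred = F.softmax(score_pred, dim=-1)
mse_cls_loss = F.mse_loss(prob_pred, gt_one_hot)

# Option 2: cross-entropy on the raw scores; its gradient w.r.t. score_pred is
# proportional to softmax(score_pred) - gt_one_hot, i.e. the (gt_class - prob)
# form in darknet's region_layer, up to sign convention.
ce_cls_loss = F.cross_entropy(score_pred, gt_classes)
```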


xmfbit commented Aug 17, 2017

OK... I see. You used a one-hot vector as gt_classes. But a new question: the gradient (gt_class - prob) should be applied directly to the output of the final conv layer (let's call it x), whereas in the code you apply softmax(x) first (prob_pred = F.softmax(score_pred.view(-1, score_pred.size()[-1])).view_as(score_pred)), so the autograd mechanism will backpropagate through the softmax operation. Is that right?
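
To make the concern concrete, here is a small check (illustrative only; the shapes and names are assumed): cross-entropy applied to the logits yields exactly softmax(x) - one_hot as the gradient at x, whereas MSE applied to softmax(x) makes autograd backpropagate through the softmax Jacobian and produces a different gradient:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 5, requires_grad=True)   # output of the final conv layer (logits)
target = torch.tensor([2])                  # ground-truth class index
one_hot = F.one_hot(target, 5).float()

# Cross-entropy on the logits: the gradient reaching x is softmax(x) - one_hot,
# i.e. the (prob - gt_class) form applied directly to x.
ce = F.cross_entropy(x, target)
grad_ce, = torch.autograd.grad(ce, x)
print(torch.allclose(grad_ce, F.softmax(x, dim=-1) - one_hot))   # True

# MSE on softmax(x): autograd additionally backprops through the softmax Jacobian,
# so the gradient reaching x is no longer softmax(x) - one_hot.
mse = F.mse_loss(F.softmax(x, dim=-1), one_hot, reduction='sum')
grad_mse, = torch.autograd.grad(mse, x)
print(torch.allclose(grad_mse, F.softmax(x, dim=-1) - one_hot))  # False
```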

@AndresPMD

Hello xmfbit,

Did you manage to understand the loss function? I am struggling with that as well.
