Good work. But I am confused about how the cls loss is calculated.
It seems that you use MSELoss in your code. However, in darknet the gradient formula looks like cross-entropy: see https://github.com/pjreddie/darknet/blob/master/src/region_layer.c#L130. Besides, in the YOLO9000 paper the author seems to use MSELoss, just as he did in YOLOv1.
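For reference, here is a rough PyTorch sketch of the two formulations I mean (just an illustration: names like `score_pred` and `gt_classes` are placeholders for the raw class scores and ground-truth labels, not necessarily what this repo uses):

```python
import torch
import torch.nn.functional as F

N, C = 4, 20                                          # e.g. 4 predictions, 20 classes
score_pred = torch.randn(N, C, requires_grad=True)    # raw scores from the last conv layer
gt_classes = torch.randint(0, C, (N,))                # ground-truth class indices
gt_one_hot = F.one_hot(gt_classes, C).float()

# (a) YOLOv1-style: MSE between softmax probabilities and one-hot targets
prob_pred = F.softmax(score_pred, dim=-1)
mse_cls_loss = F.mse_loss(prob_pred, gt_one_hot, reduction='sum')

# (b) darknet's region_layer.c writes the delta directly:
#     delta = scale * ((i == class ? 1 : 0) - output[...]),
#     i.e. (one_hot - prob), which is (minus) the gradient of softmax
#     cross-entropy with respect to the pre-softmax scores:
ce_cls_loss = F.cross_entropy(score_pred, gt_classes, reduction='sum')
```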
So could you check this? Thank you.
OK, I see. You use a one-hot vector as gt_classes. But a new question arises: the gradient (gt_class - prob) should be passed directly to the output of the final conv layer (call it x), whereas your code applies softmax(x) first (prob_pred = F.softmax(score_pred.view(-1, score_pred.size()[-1])).view_as(score_pred)), so autograd will backprop through the softmax operation. Is that right?
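Here is a tiny numerical check of what I mean (a sketch with a made-up logit vector, not code from this repo): backpropagating MSE through softmax(x) gives a different gradient at x than writing (gt_class - prob) there directly, and the latter is exactly the cross-entropy gradient.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([1.0, 2.0, 0.5], requires_grad=True)  # pre-softmax scores
gt = torch.tensor([0.0, 1.0, 0.0])                      # one-hot ground truth

# Repo-style: MSE on softmax(x); autograd backprops through the softmax Jacobian.
prob = F.softmax(x, dim=0)
mse = ((prob - gt) ** 2).sum()
mse.backward()
grad_mse = x.grad.clone()

# Darknet-style: writing (gt - prob) at x is equivalent (up to sign) to the
# gradient of cross-entropy with respect to x, since d CE / dx = prob - gt.
x.grad = None
ce = F.cross_entropy(x.unsqueeze(0), torch.tensor([1]))
ce.backward()
grad_ce = x.grad.clone()

print(grad_mse)  # gradient that went through the softmax Jacobian
print(grad_ce)   # prob - gt; generally different from grad_mse
```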