Hi there!
I am trying to implement your model RUGE in PyTorch. I have read your paper, checked your code, and reproduced your results in Java. I have some questions after comparing your implementation with the paper:
1. Which loss did you use in the implementation? It looks like you did not use the cross-entropy loss. I am claiming this based on how you calculate the gradients in StochasticGradient.java's calculateGradient method; it looks like you used MSE or something similar.
CORRECTION: I checked it again, and after taking the derivative correctly I found that you did use the gradient of the cross-entropy loss in the implementation, as in the paper. (A small sketch of why the two gradients are easy to confuse is included after the code below.)
2. Also, is it theoretically OK to use the cross-entropy loss for non-binary targets? Say the soft label is supposed to be 0.8 and the model predicted the unlabeled score as 0.8 as well. With the soft label as the target and the unlabeled score as the output, the loss should ideally be 0, since the model predicted the target exactly. However, if we use cross-entropy, the loss comes out to about 0.5004 (a numeric check is sketched below, after the code). So my worry is that using cross-entropy might mislead the model during the unlabeled-loss calculation. It works fine for the labeled-loss calculation, since the labeled targets are either 1s (true triples) or 0s (negative samples). I believe that is why you do not use cross-entropy in your implementation.
The cross-entropy method that I used, where x is the output and y is the target:
out = torch.mean(-y * torch.log(x + 1e-10) - (1 - y) * torch.log(1 - x + 1e-10))
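For reference on point 1, here is a minimal sketch (my own check, not taken from the RUGE code) of why the cross-entropy gradient is easy to mistake for an MSE-style one: with a sigmoid output x = sigmoid(z), the gradient of the cross-entropy with respect to the logit z simplifies to (x - y), which looks just like the gradient of a squared-error term. The values 0.7 and 1.0 below are arbitrary example inputs.

```python
import torch

# Sketch only: with x = sigmoid(z), d(BCE)/dz simplifies to (x - y),
# which is why gradient code for cross-entropy can look MSE-like at first glance.
z = torch.tensor(0.7, requires_grad=True)  # arbitrary example logit
y = torch.tensor(1.0)                      # arbitrary example target

x = torch.sigmoid(z)
loss = -y * torch.log(x) - (1 - y) * torch.log(1 - x)
loss.backward()

print(z.grad)            # tensor(-0.3318)
print((x - y).detach())  # same value: sigmoid(0.7) - 1.0
```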
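And a quick numeric check of the 0.5004 figure from point 2, using the same formula as above: when the target equals the prediction (y = x = 0.8), the cross-entropy bottoms out at the entropy of the soft label, H(0.8) ≈ 0.5004, rather than 0. Whether that constant floor actually misleads training is exactly the question I am asking above.

```python
import torch

# Check of the loss value quoted above: target y = 0.8, prediction x = 0.8.
x = torch.tensor(0.8)
y = torch.tensor(0.8)

loss = -y * torch.log(x + 1e-10) - (1 - y) * torch.log(1 - x + 1e-10)
print(loss)  # tensor(0.5004) -- the Bernoulli entropy H(0.8), not 0
```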