
Used Loss #1

Open
gutihernandez opened this issue Feb 18, 2020 · 0 comments

gutihernandez commented Feb 18, 2020

Hi there!

I am trying to implement your RUGE model in PyTorch. I have read your paper, checked your code, and reproduced your results in Java. After comparing your implementation with the paper, I have some questions:

1. Which loss did you use in the implementation? It looks like you did not use the cross-entropy loss; I am claiming this based on how the gradients are computed in StochasticGradient.java's calculateGradient method, which looks like MSE or something similar.
CORRECTION: I checked again, and after taking the derivative correctly I found that the implementation does use the gradient of the cross-entropy loss, just as in the paper.

2. Also, is it theoretically OK to use the cross-entropy loss for non-binary targets? Say the soft label is supposed to be 0.8 and the model also predicts 0.8 as the unlabeled score. With the soft label as the target and the unlabeled score as the output, the loss should ideally be 0, since the model predicted the target exactly. However, cross-entropy gives a loss of 0.5004. So my worry is that using cross-entropy might mislead the model during the unlabeled-loss calculation. It works fine for the labeled loss, since those targets are either 1 (true triples) or 0 (negative samples). I believe that is why you do not use cross-entropy in your implementation.
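For reference, the 0.5004 comes straight from the arithmetic: -0.8·ln(0.8) - 0.2·ln(0.2) ≈ 0.1785 + 0.3219 = 0.5004. That value is exactly the entropy H(y) of the 0.8 target, and since cross-entropy decomposes as CE(y, x) = H(y) + KL(y‖x), it can never drop below that floor for a soft target; only the KL term reaches 0 at x = y.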

The cross-entropy I used, where x is the output and y is the target:

out = torch.mean(-y * torch.log(x + 1e-10) - (1 - y) * torch.log(1 - x + 1e-10))
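
And a minimal, self-contained sketch of the check I ran (the cross_entropy helper name is mine, not from the RUGE repo):

import torch

def cross_entropy(x, y, eps=1e-10):
    # Element-wise cross-entropy between output x and (soft) target y,
    # the same formula as the one-liner above.
    return torch.mean(-y * torch.log(x + eps) - (1 - y) * torch.log(1 - x + eps))

x = torch.tensor([0.8])  # model output for an unlabeled triple
y = torch.tensor([0.8])  # soft label produced by the rules

ce = cross_entropy(x, y)
print(ce.item())  # ~0.5004, even though the prediction matches the target exactly

# Subtracting the target's own entropy H(y) = cross_entropy(y, y) leaves the
# KL divergence, which is 0 when x == y.
print((ce - cross_entropy(y, y)).item())  # ~0.0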
