
Lack of Softmax in this code? #30

Open
zachary-jablons-okcupid opened this issue Mar 16, 2022 · 3 comments

Comments

@zachary-jablons-okcupid

Hey Geoff,

I know this is 5-year-old research code, but I'm a bit confused about something. In the accompanying paper, it seems like the output of temperature scaling is meant to go through a softmax before being used.

[Screenshot of the paper's temperature-scaling equation: q̂_i = max_k σ_SM(z_i / T)^(k)]

However, in this implementation, as far as I can tell, there's no use of softmax as part of the temperature scaling operation. I'd expect to see it either in the forward step or when the output is passed into the cross-entropy loss here, but instead the cross-entropy loss seems to be given the scaled logits without any softmax applied.

I might just be missing something obvious here of course, but I want to make sure my understanding of how temperature scaling is supposed to work is correct.

Thanks in advance for helping me clarify anything I'm missing here.
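
For reference, here's a minimal sketch of the scaling step as I understand it (the class and names here are illustrative, not the repo's exact code): the wrapper just divides the model's logits by a learned scalar temperature.

```python
import torch
import torch.nn as nn

class TemperatureScaler(nn.Module):
    """Illustrative sketch: wraps a trained classifier and divides its logits by T."""
    def __init__(self, model):
        super().__init__()
        self.model = model
        # single learnable temperature, initialized above 1 as in the paper
        self.temperature = nn.Parameter(torch.ones(1) * 1.5)

    def forward(self, x):
        logits = self.model(x)
        # scaled logits are returned directly -- note: no softmax here
        return logits / self.temperature
```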

@zachary-jablons-okcupid (Author)

Actually, as I was reimplementing it I came across this line in the PyTorch docs for CrossEntropyLoss:

> The input is expected to contain raw, unnormalized scores for each class.

I'm guessing this is my answer: this behavior is equivalent to using `from_logits=True` with `categorical_crossentropy` in TensorFlow. Is that where the softmax is effectively occurring?
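
A quick self-contained check of that equivalence (the softmax effectively happens inside the loss):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)           # raw scores for 4 samples, 10 classes
targets = torch.randint(0, 10, (4,))
T = 1.5
scaled = logits / T

# cross_entropy applies log-softmax internally...
loss_a = F.cross_entropy(scaled, targets)
# ...so it matches the NLL of an explicit log-softmax
loss_b = F.nll_loss(F.log_softmax(scaled, dim=1), targets)
assert torch.allclose(loss_a, loss_b)
```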

@CallMeMisterOwl

Yes, you are correct. PyTorch's `CrossEntropyLoss` combines `LogSoftmax` and `NLLLoss` internally, so the wrapper class does not (and should not) apply softmax to its output.
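
So if you want calibrated probabilities at inference time, you apply the softmax yourself to the scaled logits. A minimal sketch, assuming a wrapper like the one sketched above (`scaled_model` and `x` are placeholder names):

```python
import torch
import torch.nn.functional as F

# `scaled_model` is a TemperatureScaler as sketched above; `x` is an input batch
scaled_model.eval()
with torch.no_grad():
    scaled_logits = scaled_model(x)          # logits / T, still no softmax
    probs = F.softmax(scaled_logits, dim=1)  # calibrated probabilities
```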

@lindseyfeng

Hi, I am wondering the same thing. Just to double-check: where did you end up adding the softmax in the code?
