
Computes the KL penalty using the entire distribution #541

Merged
merged 5 commits into main on Jul 27, 2023

Conversation

edbeeching
Collaborator

I got so fed up with negative KL that I have added an option to compute the KL penalty over the entire distribution.
It is memory intensive for long sequences, allocating roughly 8*vocab_size bytes per token, but it should stabilize training, and the penalty can never be negative. Memory usage could be greatly reduced, but that would require a larger refactor.

Single seed runs for gpt2-sentiment:

[Figure: training curves for single-seed gpt2-sentiment runs]
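The difference between the two penalties can be sketched as follows. This is a minimal NumPy illustration, not the TRL implementation; the function names (`full_kl`, `sampled_kl`) and the toy 8-token vocabulary are illustrative. The exact KL summed over the whole vocabulary is non-negative by Gibbs' inequality, whereas the usual single-sample estimate `log p(x) - log q(x)` is only non-negative in expectation and can go negative per token:

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def full_kl(logits_p, logits_q):
    """Exact KL(p || q) over the whole vocabulary:
    sum_i p_i * (log p_i - log q_i). Non-negative by Gibbs' inequality."""
    logp = log_softmax(logits_p)
    logq = log_softmax(logits_q)
    return (np.exp(logp) * (logp - logq)).sum(axis=-1)

def sampled_kl(logp_token, logq_token):
    """Single-sample estimate log p(x) - log q(x) for one sampled token:
    unbiased in expectation, but can be negative for individual tokens."""
    return logp_token - logq_token

# Toy example: policy vs. reference logits over an 8-token vocabulary.
rng = np.random.default_rng(0)
policy_logits = rng.normal(size=(1, 8))
ref_logits = rng.normal(size=(1, 8))
kl = full_kl(policy_logits, ref_logits)  # shape (1,), always >= 0
```

The memory cost mentioned above comes from materializing the full `vocab_size`-wide log-probability tensors for every token, instead of keeping only the log-probability of the sampled token.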

@edbeeching edbeeching requested review from lvwerra, lewtun and vwxyzjn July 19, 2023 12:33
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jul 19, 2023

The documentation is not available anymore as the PR was closed or merged.

@edbeeching edbeeching marked this pull request as ready for review July 20, 2023 06:59
@lvwerra lvwerra requested a review from younesbelkada July 26, 2023 06:22
Contributor

@younesbelkada younesbelkada left a comment


Thanks!

@younesbelkada younesbelkada merged commit 31658b4 into main Jul 27, 2023
@younesbelkada younesbelkada deleted the full-kl-penalty branch July 27, 2023 10:08

4 participants