Computes the KL penalty using the entire distribution #541

edbeeching · 2023-07-19T12:33:30Z

I got so fed up with negative KL estimates that I have added an option to compute the KL over the entire distribution.
It is memory intensive for long sequences, with an allocation of 8*vocab_size per token, but it should stabilize training and cannot be negative. Memory usage could be greatly reduced, but that would require a larger refactor.
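
To illustrate why the per-token estimate can go negative while the full-distribution KL cannot, here is a minimal sketch in plain Python. It is not the code from this PR: the helper names (`log_softmax`, `sampled_kl_estimate`, `full_kl`) and the toy logits are made up for illustration, and a real implementation would work on batched tensors.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def sampled_kl_estimate(policy_logprobs, ref_logprobs, token):
    """Single-sample estimate: log p(token) - log q(token). Can be negative."""
    return policy_logprobs[token] - ref_logprobs[token]

def full_kl(policy_logprobs, ref_logprobs):
    """Exact KL(p || q) summed over the whole vocabulary. Never negative."""
    return sum(
        math.exp(lp) * (lp - lr)
        for lp, lr in zip(policy_logprobs, ref_logprobs)
    )

# Toy 3-token vocabulary (hypothetical logits, illustration only).
policy = log_softmax([2.0, 1.0, 0.1])
ref = log_softmax([1.0, 2.0, 0.1])

# If the sampled token is one the reference model likes more than the policy,
# the single-sample estimate is negative.
estimate_token1 = sampled_kl_estimate(policy, ref, token=1)

# Summing over every token in the vocabulary gives the true, non-negative KL.
exact = full_kl(policy, ref)

print(f"sampled estimate for token 1: {estimate_token1:.4f}")
print(f"full-distribution KL:         {exact:.4f}")
```

The full version needs the log-probabilities of both models for every vocabulary entry at every position, which is where the extra memory cost mentioned above comes from.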

Single seed runs for gpt2-sentiment:

HuggingFaceDocBuilderDev · 2023-07-19T12:39:04Z

The documentation is not available anymore as the PR was closed or merged.

younesbelkada

Thanks!

edbeeching added 4 commits July 18, 2023 17:19

adds full log probs

ff8c982

Adds tests, comments

374366a

precommit

e8a9608

bug all -> full

2da84c8

edbeeching requested review from lvwerra, lewtun and vwxyzjn July 19, 2023 12:33

adds option description to sentiment analysis script, fixes a few bugs

272ca8c

edbeeching marked this pull request as ready for review July 20, 2023 06:59

lvwerra approved these changes Jul 26, 2023

lvwerra requested a review from younesbelkada July 26, 2023 06:22

younesbelkada approved these changes Jul 26, 2023

younesbelkada merged commit 31658b4 into main Jul 27, 2023

younesbelkada deleted the full-kl-penalty branch July 27, 2023 10:08
