Fix `logits/chosen` and `logits/rejected` metrics in `kto_trainer`. #2077

PhilipMay · 2024-09-18T14:11:59Z

The calculation of the `logits/chosen` and `logits/rejected` metrics in `kto_trainer` seems to be wrong. Applying `nansum()` followed by `nanmean()` to `policy_rejected_logits` is incorrect.

Our fix is to apply `nansum()` followed by another `nansum()` and then divide the result by `count/chosen` or `count/rejected`.
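
To illustrate the idea behind the fix, here is a minimal pure-Python sketch. It is not the code from `kto_trainer`: the helper names `nansum` and `global_mean` are hypothetical, the lists stand in for values gathered across processes, and the sketch assumes that a process without any chosen examples reports NaN for its sum and 0 for its count.

```python
import math


def nansum(values):
    """Sum the values, skipping NaN entries (stand-in for a tensor nansum)."""
    return sum(v for v in values if not math.isnan(v))


def global_mean(per_process_sums, per_process_counts):
    """Divide the total of the per-process sums by the total example count."""
    total = nansum(per_process_sums)
    count = sum(per_process_counts)
    # With no examples at all there is no meaningful mean.
    return total / count if count > 0 else math.nan


# Assumption: the third process saw no chosen examples (NaN sum, count 0).
sums = [3.0, 7.0, math.nan]
counts = [3, 2, 0]
print(global_mean(sums, counts))  # 2.0
```

Dividing by the total count rather than averaging the per-process sums keeps the result independent of how the examples are split across processes.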

PhilipMay · 2024-09-18T14:22:31Z

Tagging @MAOJIASONG as the original author and @claralp and asking for a review. 🙏🏼

HuggingFaceDocBuilderDev · 2024-09-18T14:24:40Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec · 2024-09-18T17:58:04Z

Now I understand:

Let's take the following example:

              process 1   process 2
logits        [0, 1, 2]   [3, 4]
sum               3         7
gather              [3, 7]
mean                  5

Your proposed fix

              process 1   process 2
logits        [0, 1, 2]   [3, 4]
sum               3         7
gather              [3, 7]
sum                   10
/all_num_chosen        2

Would the following work?

              process 1   process 2
logits        [0, 1, 2]   [3, 4]
gather          [0, 1, 2, 3, 4]
sum                   10
/all_num_chosen        2

Probably, but too memory intensive, right?
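
The three tables above can be reproduced with a short sketch. It uses plain Python lists in place of per-process tensors, and it assumes that the number of chosen examples equals the number of listed values (5), which is what makes the last row of the second and third tables come out as 2.

```python
# Values held by process 1 and process 2 in the example.
process_logits = [[0, 1, 2], [3, 4]]

# Table 1: sum locally, gather the sums, then take the mean of the sums.
local_sums = [sum(p) for p in process_logits]        # [3, 7]
mean_of_sums = sum(local_sums) / len(local_sums)     # divides by 2 processes

# Table 2 (proposed fix): sum locally, gather, sum again, divide by the count.
total_count = sum(len(p) for p in process_logits)    # 5 values overall
sum_over_count = sum(local_sums) / total_count

# Table 3: gather every value first, then sum and divide by the count.
gathered = [x for p in process_logits for x in p]    # [0, 1, 2, 3, 4]
gather_all = sum(gathered) / total_count

print(mean_of_sums, sum_over_count, gather_all)  # 5.0 2.0 2.0
```

In this sketch the second and third variants agree, but the second one only exchanges one number per process, whereas the third exchanges every value.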

MAOJIASONG · 2024-09-18T19:20:54Z

> Now I understand:
>
> Let's take the following example:
>
>               process 1   process 2
> logits        [0, 1, 2]   [3, 4]
> sum               3         7
> gather              [3, 7]
> mean                  5
>
> Your proposed fix
>
>               process 1   process 2
> logits        [0, 1, 2]   [3, 4]
> sum               3         7
> gather              [3, 7]
> sum                   10
> /all_num_chosen        2
>
> Would the following work?
>
>               process 1   process 2
> logits        [0, 1, 2]   [3, 4]
> gather          [0, 1, 2, 3, 4]
> sum                   10
> /all_num_chosen        2
>
> Probably, but too memory intensive, right?

May I ask what the difference is between the first one and the proposed second one? Why is it better to use `all_num_chosen` for the division? @qgallouedec

qgallouedec · 2024-09-18T19:58:33Z

Because otherwise the result is wrong (5 instead of 2).

PhilipMay · 2024-09-18T20:05:11Z

> Probably, but too memory intensive, right?

@qgallouedec I am not sure about this, to be honest.

MAOJIASONG · 2024-09-19T06:03:40Z

> Because otherwise the result is wrong (5)

Thx, I misinterpreted the example. Thanks for pointing it out.

fix metrics

8c04aea

kashif approved these changes Sep 18, 2024

PhilipMay added 2 commits September 18, 2024 16:19

fix formatting

3a6f238

fix "#" sign

7dcde27

qgallouedec approved these changes Sep 18, 2024

qgallouedec merged commit 0d2bee5 into huggingface:main Sep 18, 2024
9 checks passed

claralp mentioned this pull request Sep 21, 2024

KTO: fix logits metric, add logits metric to BCOTrainer #2094

Merged

PhilipMay changed the title ~~[WIP] Fix logits/chosen and logits/rejected metrics in kto_trainer.~~ Fix logits/chosen and logits/rejected metrics in kto_trainer. Sep 21, 2024
