Fix the GPT SFT datasets loss mask bug #6409

Merged
merged 1 commit into main
Apr 11, 2023


yidong72
Collaborator

What does this PR do?

Currently the loss mask in the GPT SFT datasets is off by one position, so the first token of the label is excluded from the loss calculation. This PR corrects the mask so that token is included.
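
To illustrate the kind of off-by-one described above, here is a minimal sketch in plain Python. It is not the NeMo implementation: the function names (`build_loss_mask`, `supervised`), the `shift` flag, and the token IDs are all hypothetical, and it assumes the usual next-token setup where the labels are the tokens shifted left by one.

```python
def build_loss_mask(context_len, total_len, shift=True):
    """Return a 0/1 mask over label positions (hypothetical helper).

    The mask is first built over token positions: 0 for context tokens,
    1 for answer tokens. Because labels are tokens[1:], the mask has to be
    shifted the same way (shift=True). Slicing with [:-1] instead
    (shift=False) reproduces an off-by-one that drops the first answer token.
    """
    token_mask = [0] * context_len + [1] * (total_len - context_len)
    return token_mask[1:] if shift else token_mask[:-1]


def supervised(labels, mask):
    """Labels that actually contribute to the loss under the given mask."""
    return [tok for tok, m in zip(labels, mask) if m]


# Made-up token IDs: three context tokens followed by two answer tokens.
context = [11, 12, 13]
answer = [21, 22]
tokens = context + answer

inputs = tokens[:-1]  # model input
labels = tokens[1:]   # next-token targets

buggy = build_loss_mask(len(context), len(tokens), shift=False)
fixed = build_loss_mask(len(context), len(tokens), shift=True)

print(supervised(labels, buggy))  # [22]      first answer token is skipped
print(supervised(labels, fixed))  # [21, 22]  whole answer is supervised
```

With the misaligned mask, the model is never trained to produce the first token of the answer given the context, which is the symptom this PR describes.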

Signed-off-by: Yi Dong <yidong@nvidia.com>
MaximumEntropy merged commit dbdb8c5 into main on Apr 11, 2023
MaximumEntropy deleted the sft_impro branch on April 11, 2023 17:42
hsiehjackson pushed a commit that referenced this pull request Apr 12, 2023
Signed-off-by: Yi Dong <yidong@nvidia.com>
hsiehjackson pushed a commit to hsiehjackson/NeMo that referenced this pull request Jun 2, 2023
Signed-off-by: Yi Dong <yidong@nvidia.com>
Signed-off-by: hsiehjackson <c2hsieh@ucsd.edu>