Open
Description
Thank you for your valuable open-source contribution!
In instruction tuning stage, it seems that only the answer aligned with the instruction participates in the backpropagation process. And this code seems to imply that the question part of the dataset is also involved in backpropagation of the gradient. Will this lead to better training results?
Metadata
Metadata
Assignees
Labels
No labels