You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Thank you for your valuable open-source contribution!
In instruction tuning stage, it seems that only the answer aligned with the instruction participates in the backpropagation process. And this code seems to imply that the question part of the dataset is also involved in backpropagation of the gradient. Will this lead to better training results?
The text was updated successfully, but these errors were encountered:
Hi. Thank you for reaching out. In this implementation, we do not mask the part of the sequence which corresponds to the question. You can slightly modify the code to account for that. Both methods work reasonably well (masking and not masking) though masking seems to be the standard practice.
Thank you for your valuable open-source contribution!
In instruction tuning stage, it seems that only the answer aligned with the instruction participates in the backpropagation process. And this code seems to imply that the question part of the dataset is also involved in backpropagation of the gradient. Will this lead to better training results?
The text was updated successfully, but these errors were encountered: