I've seen a few discussions about DPO (Direct Preference Optimization) for sd-scripts, specifically this and this.
However, there hasn't been any further movement on either, as far as I can tell. ORPO (Odds Ratio Preference Optimization) is related to DPO, and some even consider it superior.
I was recently browsing the various forks of sd-scripts and found the following repo, which appears to be under active development.
The branch doesn't function for me, erroring out with the following:
I think any kind of preference training would be interesting to explore, and I would be happy to see this kind of feature in sd-scripts. However, I haven't been able to contact the developer of that fork, so I'm raising awareness here in case there is a chance to gain traction.
After speaking with other AI/ML researchers and developers, I've been told that regular DPO training is "easy" to implement.
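For anyone curious what "regular" DPO looks like in a diffusion setting, here is a minimal sketch of the Diffusion-DPO objective (Wallace et al., 2023), written in NumPy for readability. This is only an illustration: the function name, argument layout, and the `beta` value are my own assumptions, not anything taken from sd-scripts or the fork above.

```python
import numpy as np

def diffusion_dpo_loss(model_err_w, model_err_l, ref_err_w, ref_err_l, beta=5000.0):
    """Diffusion-DPO loss computed from per-sample denoising MSEs.

    model_err_w / model_err_l : the trained model's noise-prediction MSE on the
                                preferred (w) and rejected (l) image of each pair.
    ref_err_w / ref_err_l     : the same MSEs under a frozen reference copy of
                                the model (e.g. the original checkpoint).
    """
    # The model is rewarded for lowering its error on preferred samples
    # (relative to the reference) more than on rejected samples.
    inside = -0.5 * beta * ((model_err_w - ref_err_w) - (model_err_l - ref_err_l))
    # -log sigmoid(x), computed stably as log(1 + exp(-x)) via logaddexp.
    return float(np.mean(np.logaddexp(0.0, -inside)))
```

In a real trainer you would compute these four MSEs per timestep batch (two forward passes through the trained UNet, two through the frozen reference UNet) and backprop through the `model_err_*` terms only; when all four errors are equal, the loss sits at log 2 and pushes in no direction.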