Open
Labels
🏋 GRPO, 🏋 KTO, 🏋 RLOO, 📚 documentation, 📱 cli
Description
Currently, https://huggingface.co/docs/trl/main/en/clis?command_line=Reward#basic-usage shows basic example usage only for SFT, DPO and Reward. We should have an example for every supported CLI (i.e., also GRPO, RLOO and KTO).
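To make the request concrete, here is a rough sketch of what the missing basic-usage lines might look like. It assumes each trainer is exposed as a `trl <name>` subcommand and accepts the same `--model_name_or_path` and `--dataset_name` flags as the existing SFT/DPO examples; that is not verified here. The model and dataset identifiers are placeholders, and the helper `basic_usage` is hypothetical. GRPO and RLOO probably need extra reward-related arguments, which this sketch does not try to guess.

```python
import shlex

# Assumption: these trainers are available as `trl <name>` subcommands.
TRAINERS = ["grpo", "rloo", "kto"]


def basic_usage(trainer: str, model: str, dataset: str) -> str:
    """Build a basic-usage command line for one trainer (hypothetical helper)."""
    # Assumption: flags mirror the documented SFT/DPO basic-usage examples.
    args = [
        "trl", trainer,
        "--model_name_or_path", model,
        "--dataset_name", dataset,
    ]
    return shlex.join(args)


# Placeholder identifiers, not real recommendations.
commands = [
    basic_usage(t, "my-org/my-model", "my-org/my-dataset") for t in TRAINERS
]

for command in commands:
    print(command)
```

Each printed line is a candidate starting point for the corresponding docs tab; the actual required arguments should be checked against each trainer's script before anything is added to the page.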