Open
13 of 26 issues completed
The purpose of this issue is to list the tasks that need to be completed before we reach v1. This list is evolving and is modified based on recent discussions and progress.
Documentation
- Remove `how_to_train.md` (Remove how_to_train.md: outdated training FAQ #4267)
- Remove `using_llama_models.md` (Remove using_llama_models.md: outdated Llama2-specific documentation #4268)
- Remove `logging.md` (Remove logging.md: trainer-specific metrics documentation #4269)
- Rewrite `peft_integration.md` #4376
- Remove guidance about converting conversational to standard. #4375
- Move every section of `conceptual_guides/experimental` into its own section in `experimental` #4377
- Extend basic usage example to all supported CLIs #4378
- Remove or populate "Training customization" #4379
- Remove outdated warning about batch contamination #4381
- Populate "Speeding Up Training" #4382
- Add PEFT subsection to "Reducing Memory Usage" #4383
- Write the subsection "Multi-Node Training" #4384
- Use a common `trl-lib` namespace for the models/datasets/spaces #4385
- Reference supported trainers in Liger Kernel integration guide #4386
- Remove Sentiment Tuning Examples #4396
- Remove or move Multi Adapter RL #4397
- Complete paper index #4407
Examples
Tests
Main codebase
- Add accuracy reward to the `trl.rewards` module (Add accuracy reward #4270)
- Add an option (default to True) to use `RichProgressCallback` in scripts (`trl.scripts`).
- Add kernels to Docker images #4398
- Remove `log_example_reports.py` (Remove unused log_example_reports.py script #4241)
- Remove `commands` directory (Remove unused commands directory #4258)
- Remove `examples/research_projects` (Remove unused commands directory #4258)
- Remove `trl.extra.dataset_formatting` (Deprecate unused dataset_formatting module #4242)
- Remove support for FSDP1 #4387
- Remove `BestOfNSampler`. (Deprecate `BestOfNSampler` #4291)
- Fully transition from `flash-attn` to `kernels` #4380
- Move `masked_mean`, `masked_var` and `masked_whiten` to `ppo.py` #4403
- Refactor DPO to align implementation with SFT (WIP in [DRAFT] Refactor DPO #3906)
- Tool calling for GRPO/RLOO (WIP in Tool call #4300)
- Async generation for Online methods
- Bump transformers to v5
- Make vLLM server OpenAI-compatible #4402
Moving experimental features to experimental submodule
Discussed in #4223 for trainers
- Move BCO to experimental submodule (🚚 Move BCO to `trl.experimental` #4312)
- Move KTO to experimental submodule
- Move Nash-MD to experimental submodule
- Move ORPO to experimental submodule
- Move PPO to experimental submodule
- Move PRM to experimental submodule
- Move RLOO to experimental submodule
- Move XPO to experimental submodule
- Move everything related to Mergekit to the experimental submodule #4395
- Move everything related to Judges to `trl.experimental` #4400
- Move Winrate callback to experimental