@behroozazarkhalili
Collaborator

Summary

Moves CPOTrainer and CPOConfig to trl.experimental.cpo, following the pattern established in #4312 for the BCO migration.

This is part of the V1 refactoring effort tracked in #4223.

Changes

  • ✅ Created trl/experimental/cpo/ module with full CPOTrainer and CPOConfig implementation
  • ✅ Replaced original files with deprecation stubs that maintain backward compatibility
  • ✅ Updated all imports in tests (test_cpo_trainer.py, test_trainers_args.py)
  • ✅ Updated example script (examples/scripts/cpo.py)
  • ✅ Updated documentation:
    • Moved CPO from Trainers to Experimental section in _toctree.yml
    • Updated import examples in cpo_trainer.md
    • Updated AlphaPO configuration example in paper_index.md
  • ✅ Added deprecation warnings pointing users to new import path
  • ✅ Set removal timeline for TRL 0.29

Backward Compatibility

The old import path still works but issues a FutureWarning:

from trl import CPOConfig, CPOTrainer  # Still works, shows deprecation warning

New import path:

from trl.experimental.cpo import CPOConfig, CPOTrainer
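For readers unfamiliar with the stub pattern, here is a minimal, self-contained sketch of how a deprecation stub of this kind can work. It is not the code from this PR: `ExperimentalCPOConfig` and its `beta` field are hypothetical stand-ins for the class that now lives under trl.experimental.cpo, and the warning text is only illustrative.

```python
import warnings


class ExperimentalCPOConfig:
    """Hypothetical stand-in for the class at the new import path."""

    def __init__(self, beta: float = 0.1):
        self.beta = beta


class CPOConfig(ExperimentalCPOConfig):
    """Sketch of a stub kept at the old import path.

    It behaves exactly like the new class but emits a FutureWarning
    every time it is instantiated.
    """

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "`CPOConfig` has moved to `trl.experimental.cpo`. Update your import to "
            "`from trl.experimental.cpo import CPOConfig`; the old path is scheduled "
            "for removal in TRL 0.29.",
            FutureWarning,
            stacklevel=2,  # attribute the warning to the caller, not the stub
        )
        super().__init__(*args, **kwargs)
```

Because the stub subclasses the new class, existing `isinstance` checks and keyword arguments should keep working while users migrate.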

Testing

All existing tests have been updated to use the new import path. The migration follows the same pattern as the BCO migration (#4312), which has already been merged.
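One way to confirm that no code still goes through a deprecated path is to escalate FutureWarning to an error while it runs. The sketch below uses only the standard library; `legacy_import_stub` and `uses_deprecated_path` are hypothetical helpers written for illustration and are not part of TRL or of this PR.

```python
import warnings


def legacy_import_stub():
    """Hypothetical stand-in for code that still uses a deprecated path."""
    warnings.warn("deprecated import path", FutureWarning, stacklevel=2)
    return "ok"


def uses_deprecated_path(fn) -> bool:
    """Return True if calling `fn` triggers a FutureWarning."""
    with warnings.catch_warnings():
        # Inside this block, any FutureWarning is raised as an exception.
        warnings.simplefilter("error", FutureWarning)
        try:
            fn()
        except FutureWarning:
            return True
    return False
```

In a pytest-based suite, running with `-W error::FutureWarning` applies the same filter to the whole run.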

Closes #4460

Move CPOTrainer and CPOConfig to trl.experimental.cpo following the pattern
established in #4312 for BCO.

Changes:
- Create trl/experimental/cpo/ module with CPOTrainer and CPOConfig
- Replace original files with deprecation stubs
- Update all imports in tests, examples, and documentation
- Move CPO from Trainers to Experimental section in docs
- Maintain backward compatibility with deprecation warnings until TRL 0.29

Closes #4460
@behroozazarkhalili behroozazarkhalili enabled auto-merge (squash) November 5, 2025 22:14
- Remove unused imports (os, warnings) from experimental/cpo/cpo_trainer.py
- Reorder config import after utils imports
- Split combined import into separate lines in trainer/cpo_trainer.py
- Add missing torch import for type hints in deprecation stub

All ruff checks now pass.
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Development

Successfully merging this pull request may close these issues.

Move CPOTrainer to trl.experimental
