Open
Description
Currently, the SkyRL Gym Generator masks out thinking tokens for Qwen3 models using the get_custom_chat_template hook. However, some users want to train Qwen3 models and keep the thinking tokens.
Users want the ability to choose whether to train on thinking tokens and, more generally, want to be able to provide a custom chat template without forking the code.
TODOs
- Provide an easier configuration option or hook for users to supply a custom chat template in the SkyRL Gym Generator.
- Add a configuration option to either train on thinking tokens or mask them out (using different custom templates).
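One possible shape for the two TODOs above, as a rough sketch only: a small config object that selects either a named built-in template or a user-supplied template file. All names here (`ChatTemplateConfig`, `resolve_chat_template`, `BUILTIN_TEMPLATES`, and the template keys) are hypothetical and are not part of the current SkyRL code; the template strings are placeholders rather than real Jinja templates.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

# Hypothetical registry of built-in templates. The values are placeholders;
# real entries would be full Jinja chat templates.
BUILTIN_TEMPLATES = {
    "qwen3_without_thinking": "PLACEHOLDER: template that drops thinking tokens",
    "qwen3_with_thinking": "PLACEHOLDER: template that keeps thinking tokens",
}


@dataclass
class ChatTemplateConfig:
    """Hypothetical config: where the custom chat template comes from."""
    source: Optional[str] = None        # None, "name", or "file"
    name_or_path: Optional[str] = None  # built-in key or path to a template file


def resolve_chat_template(cfg: ChatTemplateConfig) -> Optional[str]:
    """Return the template string, or None to use the tokenizer's default."""
    if cfg.source is None:
        return None
    if cfg.source == "name":
        if cfg.name_or_path not in BUILTIN_TEMPLATES:
            raise ValueError(f"Unknown built-in template: {cfg.name_or_path!r}")
        return BUILTIN_TEMPLATES[cfg.name_or_path]
    if cfg.source == "file":
        if cfg.name_or_path is None:
            raise ValueError("source='file' requires a path in name_or_path")
        # Lets users supply their own template without forking the code.
        return Path(cfg.name_or_path).read_text(encoding="utf-8")
    raise ValueError(f"Unsupported template source: {cfg.source!r}")
```

With something like this, choosing whether to train on thinking tokens would be a one-line config change (for example, switching `name_or_path` between the two built-in keys), and a fully custom template would only need `source="file"`. The resolved string could then presumably be passed wherever the generator currently applies the result of get_custom_chat_template.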