Decouple Retokenize and Custom Chat Template #351
devpatelio wants to merge 65 commits into NovaSky-AI:main from
Conversation
… commits as right now, users have to manually add in parser support). Provided thinking masking on/off support for Qwen3 Models
Done!
…. Final checks are done and ready!
…t list instead of List[List[int]]
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…_templating.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…om_chat_template endpoint to ensure tests are aligned
…or no chat template selection, brought it back
…i brought this back. return None if no name_or_path since behaviour for apply_chat_template uses default template from tokenizer in that case. Addressed all other PRs too
…when name_or_path is None
…template for list return, same template as without_thinking but return type is different and supports test
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
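The commits above describe the template-selection behavior: return `None` when no template name or path is given, so `apply_chat_template` falls back to the tokenizer's default template, and support loading templates by registered name or from a file. A minimal sketch of that logic (the function name `get_custom_chat_template` and the `NAME_TO_TEMPLATE` registry are illustrative, not the actual SkyRL-train API):

```python
from pathlib import Path
from typing import Optional

# Illustrative registry of named templates; real entries would hold full
# Jinja chat templates (e.g. a Qwen3 template with thinking masking on/off).
NAME_TO_TEMPLATE = {
    "qwen3_with_thinking": "{% for message in messages %}...{% endfor %}",
}

def get_custom_chat_template(name_or_path: Optional[str]) -> Optional[str]:
    """Resolve a chat template from a registered name or a file path.

    Returns None when no name/path is configured, so that downstream
    apply_chat_template uses the tokenizer's default template.
    """
    if name_or_path is None:
        return None  # tokenizer default applies
    if name_or_path in NAME_TO_TEMPLATE:
        return NAME_TO_TEMPLATE[name_or_path]
    path = Path(name_or_path)
    if path.is_file():
        return path.read_text()
    raise ValueError(f"Unknown chat template name or path: {name_or_path}")
```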
Code Review
This pull request effectively decouples the retokenize and custom_chat_template configurations by introducing a more flexible chat_template config structure and a separate retokenize boolean flag. The changes in skyrl_train/generators/utils.py to handle template loading are a significant improvement. However, I've identified a critical issue where an old assertion was missed during refactoring, which will lead to a runtime error and undermines the decoupling effort. Additionally, a unit test intended to cover the retokenization path is now broken due to the logic changes. My review includes specific suggestions to address these points.
/gemini review
Code Review
This pull request effectively decouples the retokenize and custom_chat_template configurations, which is a great improvement for flexibility. The implementation is solid, with new configuration options in ppo_base_config.yaml and updated logic in skyrl_gym_generator.py and utils.py. The tests have also been updated to cover the new functionality, including loading templates from files.
I have a few suggestions to improve code clarity and robustness, mainly around configuration handling and variable naming. I've also noted a minor point about a new test file and an outdated TODO comment.
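The review refers to new configuration options in `ppo_base_config.yaml`. A hedged sketch of what the decoupled settings could look like (field names are illustrative, not the exact schema):

```yaml
generator:
  retokenize: false        # re-encode full chat history each turn vs. token-in-token-out appends
  chat_template:
    source: "name"         # e.g. a registered template name or a file path
    name_or_path: null     # null -> fall back to the tokenizer's default template
```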
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Addresses #179: Separate configuration of cfg.generator.retokenize and cfg.generator.custom_chat_template
Note: this branch extends PR #178, which improves the flexibility of the custom chat template endpoint.
Separate configuration of cfg.generator.retokenize and cfg.generator.custom_chat_template
Prior to this PR, setting a custom_chat_template meant that chat history was always retokenized, rather than appended token-in-token-out. These behaviors should be decoupled: whether to retokenize and which chat template to use are independent choices. Refer to https://skyrl.readthedocs.io/en/latest/tutorials/skyrl_gym_generator.html for more details.
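The decoupling described above can be sketched as follows. This is an illustrative simplification, not the actual `skyrl_gym_generator.py` code: the `retokenize` flag alone decides whether the chat history is re-encoded each turn, independently of whether a custom chat template is supplied.

```python
from typing import List, Optional

def build_input_ids(
    tokenizer,
    messages: List[dict],
    prev_input_ids: List[int],
    new_response_ids: List[int],
    retokenize: bool,
    custom_chat_template: Optional[str] = None,
) -> List[int]:
    """Sketch of the decoupled behavior (names are hypothetical)."""
    if retokenize:
        # Re-encode the full chat history; a custom template is optional
        # and passing None uses the tokenizer's default template.
        return tokenizer.apply_chat_template(
            messages, chat_template=custom_chat_template, tokenize=True
        )
    # Token-in-token-out: append the model's response tokens directly,
    # regardless of whether a custom chat template is configured.
    return prev_input_ids + new_response_ids
```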