Conversation

@tyler-griggs (Member) commented on Jan 22, 2026

Summary

Adds save_weights_for_sampler() to WorkerDispatch as the single entry point for syncing policy weights to the inference engines before sampling. Calls to save_weights_for_sampler() are now the trigger for weight sync, rather than explicit weight-synchronization logic living in the trainer. This aligns with the Tinker API pattern, where users explicitly call save_weights_for_sampler() after training and before sampling.

Changes:

  • WorkerDispatch: Added save_weights_for_sampler() (mirroring the Tinker library; see the sketch after this list) that handles the full weight sync flow:
    • Prepares GPU state (offloads optimizer, keeps model on GPU)
    • Wakes inference engine for weight transfer (colocate_all only)
    • Broadcasts weights to inference engines
    • Offloads model after sync (colocate_all only)
    • Wakes inference engine for KV cache (colocate_all only)
  • Trainer: Replaced two explicit weight sync blocks with dispatch.save_weights_for_sampler():
    • After checkpoint load (before first generation)
    • After each training step (before next generation)
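
The flow above can be pictured with a minimal sketch. This is not the actual implementation: the helper calls on the policy actor group and the inference engine client (offload_optimizer, wake_up, broadcast_weights_to_inference_engines, offload_model) are illustrative assumptions; only WorkerDispatch, save_weights_for_sampler(), inference_engine_client, and the colocate_all flag come from this PR.

```python
# Minimal sketch of the weight-sync flow. Helper names on policy_group and
# inference_engine_client are placeholders, not the real internals.
class WorkerDispatch:
    def __init__(self, policy_group, inference_engine_client, colocate_all: bool):
        self.policy_group = policy_group
        self.inference_engine_client = inference_engine_client
        self.colocate_all = colocate_all

    async def save_weights_for_sampler(self) -> None:
        # Prepare GPU state: offload the optimizer, keep the model on GPU.
        await self.policy_group.offload_optimizer()
        if self.colocate_all:
            # Wake the inference engine just enough to receive weights.
            await self.inference_engine_client.wake_up(tags=["weights"])
        # Broadcast the updated policy weights to the inference engines.
        await self.policy_group.broadcast_weights_to_inference_engines(
            self.inference_engine_client
        )
        if self.colocate_all:
            # Offload the training model now that the sync is done.
            await self.policy_group.offload_model()
            # Wake the inference engine fully so the KV cache is allocated.
            await self.inference_engine_client.wake_up(tags=["kv_cache"])
```

In the trainer, the two former sync blocks reduce to a single await dispatch.save_weights_for_sampler() call: once after checkpoint load and once after each training step, in both cases before the next generation.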

Testing

Added test_save_weights_for_sampler tests, all passing.

Move weight sync logic into WorkerDispatch as a single entry point.
This aligns with Tinker's API pattern where users explicitly call
save_weights_for_sampler() after training and before sampling.

Changes:
- WorkerDispatch: Add save_weights_for_sampler() async method that:
  - Prepares GPU state (offloads optimizer, keeps model on GPU)
  - Wakes inference engine for weight transfer (colocate_all only)
  - Broadcasts weights to inference engines
  - Offloads model after sync (colocate_all only)
  - Wakes inference engine for KV cache (colocate_all only)
- Trainer: Replace two explicit weight sync blocks with:
  - dispatch.save_weights_for_sampler() after checkpoint load
  - dispatch.save_weights_for_sampler() after each training step
- Tests: Add test_save_weights_for_sampler.py (sketched below) with:
  - E2E test: train → sync → sample (colocate and non-colocate)
  - Multiple training steps before a single sync
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
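
A plausible shape for those tests is sketched below. The make_test_stack helper and the trainer/sampler handles are placeholders, not the real fixtures; only the train → save_weights_for_sampler() → sample ordering and the colocate/non-colocate parametrization come from the commit message.

```python
import pytest

# Hypothetical reconstruction of the E2E test shape; make_test_stack and the
# trainer/sampler objects are placeholders for the real test fixtures.
@pytest.mark.asyncio
@pytest.mark.parametrize("colocate_all", [True, False])
async def test_save_weights_for_sampler(colocate_all):
    dispatch, trainer, sampler = make_test_stack(colocate_all=colocate_all)

    # Train one step, sync weights, then sample with the updated policy.
    await trainer.train_step()
    await dispatch.save_weights_for_sampler()
    assert await sampler.generate(["hello"])

    # Several training steps may precede a single sync; only the weights at
    # sync time need to reach the inference engines before sampling.
    for _ in range(3):
        await trainer.train_step()
    await dispatch.save_weights_for_sampler()
    assert await sampler.generate(["hello again"])
```
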
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request significantly improves the codebase by encapsulating the weight synchronization logic into a single save_weights_for_sampler() method within WorkerDispatch. This refactoring greatly simplifies the Trainer class, making the training loop cleaner and more readable. Passing the inference_engine_client to the WorkerDispatch constructor is a good design choice for encapsulation. The addition of comprehensive end-to-end tests for the new functionality is also a major plus, ensuring the changes are robust. I have a couple of suggestions to improve consistency in the new test files.

@erictang000 (Collaborator)

Was this intended to be added, since there's already #898?

Address gemini-code-assist comments 2 & 3 (setup sketched after this message):
- Create WorkerDispatch before calling init_weight_sync_state
- Use dispatch.init_weight_sync_state(client) instead of calling it
  directly on the actor group
- This properly exercises the WorkerDispatch API as designed
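
Roughly, that reordering in the test setup looks like the snippet below; the constructor arguments are assumptions, while init_weight_sync_state(client) and the dispatch-first ordering come from the commit message.

```python
# Build the dispatcher first, then route weight-sync initialization through
# its API instead of calling it directly on the actor group.
dispatch = WorkerDispatch(policy_group, inference_engine_client=client)  # assumed constructor args
await dispatch.init_weight_sync_state(client)  # rather than policy_group.init_weight_sync_state(client)
```
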
@tyler-griggs (Member, Author)

Sorry I just got confused :D

@tyler-griggs (Member, Author)

/gemini review

@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request effectively refactors the weight synchronization logic by introducing save_weights_for_sampler() in WorkerDispatch. This change centralizes the weight syncing process, simplifying the Trainer class and improving code clarity and maintainability, which aligns well with the stated goal of adopting the Tinker API pattern. The addition of comprehensive tests in test_save_weights_for_sampler.py is a great inclusion, ensuring the new functionality is robust across different configurations. The changes are well-implemented and a clear improvement.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@tyler-griggs merged commit 48faf59 into main on Jan 23, 2026