Description
As slime grows, we need more thorough CI coverage for an increasing number of backends, deployment patterns, and training features.
This issue serves as a living index of our existing CI tests so that both users and contributors can quickly see:
- which backends are currently exercised by CI
- which parallelism / optimizer / algorithmic features are being tested
- where coverage gaps still exist
Over time this document should evolve into a compact “CI coverage map” of slime.
Megatron Backend CI
All Megatron-backend CI tests currently check at least the following invariants:
- kl_loss == 0 on step 0
- ppo_kl == 0 on step 0 of each rollout
These are used as sanity checks that the initial policy/value setup and reference policy wiring are correct.
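The two invariants above can be sketched as a small check. This is only an illustration, not slime's actual CI code: the function name, the `metrics` dict layout, and the `global_step` / `step_in_rollout` arguments are all hypothetical. The presumed reasoning is that before any update the policy equals the reference policy (so kl_loss should be zero), and on the first training step of a rollout the policy equals the one that generated the samples (so ppo_kl should be zero).

```python
def check_step_zero_invariants(metrics, global_step, step_in_rollout, atol=0.0):
    """Return a list of violated invariants; an empty list means all checks pass.

    All names here are hypothetical and only mirror the invariants listed above.
    """
    violations = []

    # kl_loss: expected to be 0 on the very first training step.
    if global_step == 0 and "kl_loss" in metrics:
        value = metrics["kl_loss"]
        # "not (abs(value) <= atol)" also flags NaN, which "abs(value) > atol" would miss.
        if not abs(value) <= atol:
            violations.append(f"kl_loss={value} on global step 0")

    # ppo_kl: expected to be 0 on the first training step of every rollout.
    if step_in_rollout == 0 and "ppo_kl" in metrics:
        value = metrics["ppo_kl"]
        if not abs(value) <= atol:
            violations.append(f"ppo_kl={value} on step 0 of a rollout")

    return violations


# Example: a clean first step passes, a non-zero ppo_kl on step 0 of a later rollout fails.
ok = check_step_zero_invariants({"kl_loss": 0.0, "ppo_kl": 0.0}, 0, 0)
bad = check_step_zero_invariants({"kl_loss": 0.3, "ppo_kl": 0.01}, 2, 0)
```

The default `atol=0.0` encodes the exact equality stated above; a small positive tolerance would be a deliberate relaxation, for example when kernels are not bitwise deterministic.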
tests/test_quick_start_glm4_9B.py
- tp2, cp2
- 3 steps, rollout 8 × 8, gbs=32
- disaggregated
- algo: GRPO
- tp4, pp, cp, ep, cpu adam
- 3 steps, rollout 8 × 8, gbs=32
- colocated
- algo:
- GSPO
- Routing replay
- TIS
tests/test_moonlight_16B_A3B.py
- mla, tp4, pp, cp, ep, cpu adam
- 3 steps, rollout 8 × 8, gbs=32
- colocated
- tp4, pp, cp, ep, cpu adam
- 3 steps, rollout 8 × 8, gbs=32
- colocated
- algo: PPO
tests/test_qwen2.5_0.5B_gsm8k.py
- no parallelism
- algo: GRPO
tests/test_qwen2.5_0.5B_gsm8k_async.py
- no parallelism
- algo:
- GSPO
- true on-policy
FSDP Backend CI
All FSDP-backend CI tests share a common set of checks (e.g., basic training sanity and a decreasing loss).
More detailed invariants will be documented here as they are standardized across tests.
TODO: enumerate and unify the exact common assertions for FSDP tests.
tests/test_qwen3_4B_fsdp_true_on_policy.py
- algo: GRPO
tests/test_qwen3_0.6B_fsdp_distributed.py
tests/test_qwen3_0.6B_fsdp_colocated_2xGPU.py
TODO
- PPO
- rollout routing replay
- deterministic training
- distributed post
- partial rollout
- fault tolerance
- mtp training