@cklxx (Owner) commented on Dec 22, 2025

Summary

  • refactor OPSM masking to use explicit input builders and avoid context-parallel double counting
  • compute OPSM-only CP KLs without autograd, enforcing strict zip length checks in critical paths (see the first sketch below)
  • update FSDP actor usage and guard CP reductions with required process groups (see the second sketch below)
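For the OPSM-only CP KL point, here is a minimal sketch of the pattern, not the repository's actual code; the function name opsm_cp_kl and the per-chunk argument lists are hypothetical. The masked log-ratio is accumulated under torch.no_grad() so no autograd graph is built, and zip(..., strict=True) turns a length mismatch between chunk lists into a hard error rather than a silently truncated loop.

```python
# Hypothetical sketch of a no-autograd, strictly-zipped OPSM KL accumulation.
import torch


def opsm_cp_kl(logprob_chunks, ref_logprob_chunks, mask_chunks):
    """Masked mean of the per-token log-ratio over this rank's CP chunks."""
    kl_sum = 0.0
    token_count = 0.0
    with torch.no_grad():
        # strict=True raises if the chunk lists ever drift out of sync.
        for lp, ref_lp, mask in zip(
            logprob_chunks, ref_logprob_chunks, mask_chunks, strict=True
        ):
            per_token_kl = (lp - ref_lp) * mask  # keep only OPSM-masked tokens
            kl_sum += per_token_kl.sum().item()
            token_count += mask.sum().item()
    return kl_sum / max(token_count, 1.0)
```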
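For the guarded CP reductions, a minimal sketch under assumed names (reduce_over_cp and cp_group are illustrative, not the PR's actual API): the context-parallel process group is required rather than defaulting to the WORLD group, since reducing over the wrong group is one way contributions get double counted.

```python
# Hypothetical sketch of a CP reduction that requires an explicit process group.
import torch
import torch.distributed as dist


def reduce_over_cp(value: torch.Tensor, cp_group: dist.ProcessGroup | None) -> torch.Tensor:
    """Sum `value` across context-parallel ranks; the group must be provided."""
    if cp_group is None:
        # Refuse to silently fall back to the default (WORLD) group.
        raise ValueError("context-parallel reduction requires an explicit process group")
    if dist.get_world_size(group=cp_group) > 1:
        dist.all_reduce(value, op=dist.ReduceOp.SUM, group=cp_group)
    return value
```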

Testing

  • ruff check .
  • pytest (fails: environment cannot import slime module during test discovery)

Codex Task

@cklxx merged commit 06787cf into codex/optimize-training-time-for-context-parallelism on Dec 22, 2025