merge#8

Merged
cklxx merged 64 commits into cklxx:codex/optimize-training-time-for-context-parallelism from THUDM:main
Dec 21, 2025

Conversation

cklxx (Owner) commented Dec 21, 2025

No description provided.

lin0303-siyuan and others added 30 commits December 11, 2025 13:00
…tory-r1xqo4

Remove FSDP rationale note from Chinese quick start
Fix FSDP load planner to keep model tensors
nanjiangwill and others added 29 commits December 16, 2025 09:06
Co-authored-by: nanjiangwill <willjiang2018@gmail.com>
Co-authored-by: Nan Jiang <59716405+nanjiangwill@users.noreply.github.com>
Co-authored-by: jhinpan <jpan236@wisc.edu>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Lyken17 <7783214+Lyken17@users.noreply.github.com>
…ository

Added FSDP checkpoint handling to convert_torch_dist_to_hf.py
cklxx merged commit 47415ac into cklxx:codex/optimize-training-time-for-context-parallelism on Dec 21, 2025
