Currently, when people want to run checkpoint writes for transformer-type workloads, they have to specify the layer_parameters and optimization groups by hand. All of these values can actually be derived from higher-level model parameters such as the hidden dimension, FFN size, vocab_size, etc. Asking users to enter layer_parameters and optimization groups directly in the configuration file is error-prone and unnecessarily verbose. We can hide all of the lower-level details here.
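As a rough illustration of how the lower-level values could be derived, here is a minimal sketch for a GPT-style decoder block. The function name and the exact per-layer breakdown (attention projections, feed-forward, layer norms, embedding) are assumptions for illustration, not the benchmark's actual formulas:

```python
def derive_layer_parameters(hidden_dim: int, ffn_hidden: int,
                            vocab_size: int, num_layers: int) -> dict:
    """Hypothetical helper: derive per-layer parameter counts from
    high-level transformer dimensions (GPT-style block assumed)."""
    # attention: Q, K, V and output projections, each hidden_dim x hidden_dim
    attention = 4 * hidden_dim * hidden_dim
    # feed-forward: up-projection and down-projection
    ffn = 2 * hidden_dim * ffn_hidden
    # two layer norms per block, each with scale and bias vectors
    norms = 2 * 2 * hidden_dim
    per_layer = attention + ffn + norms
    # token embedding table (weight tying with the output head assumed)
    embedding = vocab_size * hidden_dim
    return {
        "per_layer": per_layer,
        "embedding": embedding,
        "total": num_layers * per_layer + embedding,
    }
```

With a derivation like this, the configuration file would only need the high-level fields (hidden dimension, FFN size, vocab_size, number of layers), and the tool could fill in layer_parameters and the optimization groups internally.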