Optimizer CPU offload for single GPU training (#584)
Squashed commit history:

* initial commit
* use fused=True by default for PyTorch adam
* detach param
* try overlap D2H grad copy with backward
* add customizable profile num steps
* add v2
* fix various bugs
* fix v1 impl
* add full BF16 option
* change n_profile_steps to 5
* add v3
* fix gradient accumulation
* add note
* add deepspeed offload
* update deepspeed config
* add some notes
* update instructions. make some packages optional. change to AdamW
* add last updated ordered dict
* update deepspeed version
* remove old versions
* update docs
* say deepspeed is untuned
* add test
* add test for offload_gradients. update benchmark script
* update benchmark results. fix test. fix benchmark script
* fix language
* add save and load
* pre-allocate CPU params. add note about gradient clipping
* update README and remove unused imports
1 parent de4a1fb · commit 1b1e94c · 5 changed files with 330 additions and 28 deletions.
```diff
@@ -1,2 +1,3 @@
 from .adam import Adam8bit, Adam4bit, AdamFp8
 from .adamw import AdamW8bit, AdamW4bit, AdamWFp8
+from .cpu_offload import CPUOffloadOptimizer
```
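The newly exported class wraps a regular PyTorch optimizer while keeping its state in CPU memory. A hedged usage sketch, assuming this package is torchao's `prototype.low_bit_optim` and that `CPUOffloadOptimizer` accepts a base optimizer class plus its keyword arguments (check the commit's updated README for the exact signature):

```python
import torch

if torch.cuda.is_available():
    # assumed import path for the package shown in the diff above
    from torchao.prototype.low_bit_optim import CPUOffloadOptimizer

    model = torch.nn.Linear(256, 256).cuda()
    # optimizer state lives in CPU memory; per the commit message,
    # offload_gradients=True would also keep gradients on the CPU
    optim = CPUOffloadOptimizer(
        model.parameters(), torch.optim.AdamW,
        offload_gradients=False, fused=True,
    )

    for _ in range(3):
        loss = model(torch.randn(4, 256, device="cuda")).sum()
        loss.backward()
        optim.step()
        optim.zero_grad()
else:
    # the offload optimizer shuttles state between GPU and CPU, so a CUDA
    # device is required; nothing to demonstrate on a CPU-only machine
    print("CUDA not available; skipping CPUOffloadOptimizer demo")
```

Note the commit message's caveats: gradient clipping needs care when gradients live on the CPU, and `fused=True` is passed through to the underlying `torch.optim.AdamW`.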