[CANN] Support cpu offload optimizer for Ascend NPU #4568
Merged
tjruwase merged 4 commits into deepspeedai:master on Nov 14, 2023
Support the cpu_adam, cpu_adagrad and cpu_lion optimizers for Ascend NPU. All of these optimizers run on the host; the only difference between backends is how the params are copied back to the device. This commit adds a new symbol called "__ENABLE_CANN__", which enables compilation of the code adapted to NPU. The NPU builder adds the header files and libraries required for compiling, following CANN's compilation manual. Note that there is no FusedLion implementation for NPU, so the test_cpu_lion test case should be disabled until a FusedLion optimizer is implemented. In addition, when NPU is selected as the accelerator, ds_report shows torch_npu and CANN information.
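To illustrate the builder behaviour described above, here is a minimal sketch of how a build helper might emit the "__ENABLE_CANN__" define together with the CANN header and library directories. It is not the code from this PR: the function name `cann_build_flags`, the use of the `ASCEND_HOME_PATH` environment variable, and the `include` / `lib64` directory layout are all assumptions made for the example.

```python
import os


def cann_build_flags(ascend_home=None):
    """Return (defines, include_dirs, library_dirs) for a hypothetical NPU op build.

    Assumption: the CANN install root is passed in explicitly or exported
    as ASCEND_HOME_PATH, and contains `include` and `lib64` subdirectories.
    """
    root = ascend_home or os.environ.get("ASCEND_HOME_PATH")
    if not root or not os.path.isdir(root):
        # No CANN toolkit found: build only the host-side optimizer code.
        return [], [], []
    return (
        ["-D__ENABLE_CANN__"],            # turns on the NPU-specific code path
        [os.path.join(root, "include")],  # assumed header location
        [os.path.join(root, "lib64")],    # assumed library location
    )
```

With a valid install root the helper returns the define plus one include and one library directory; without one it returns three empty lists, so the optimizer would still compile as a host-only extension.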
Contributor:
Hi @tjruwase, please take a look at this PR. 🤗 The DeepSpeed test cases in huggingface/transformers also pass. See: huggingface/transformers#27342 (comment)
Contributor (Author):
@tjruwase The formatting and spelling issues have been fixed. Please re-trigger the checks. Thanks.
Contributor (Author):
All checks have passed. Is it ready to merge now?
tjruwase approved these changes on Nov 14, 2023
mauryaavinash95 pushed a commit to mauryaavinash95/DeepSpeed that referenced this pull request on Feb 17, 2024
Support the cpu_adam, cpu_adagrad and cpu_lion optimizers for Ascend NPU. All of these optimizers run on the host; the only difference between backends is how the params are copied back to the device. This commit adds a new symbol called "__ENABLE_CANN__", which enables compilation of the code adapted to NPU.

The NPU builder adds the header files and libraries required for compiling, following CANN's compilation manual.

Note that there is no FusedLion implementation for NPU, so the test_cpu_lion test case should be disabled until a FusedLion optimizer is implemented.

In addition, when NPU is selected as the accelerator, ds_report shows torch_npu and CANN information.

With this PR, the DeepSpeed test cases in [huggingface/accelerate](https://github.com/huggingface/accelerate/tree/main/tests/deepspeed) all pass.

This is part of the feature list for Ascend NPU support, @see deepspeedai#4567

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>