Add Ascend NPU accelerator support#3595
Conversation
@microsoft-github-policy-service agree
@CurryRice233, thanks for this PR. It looks very exciting. Do you mind sharing any background or docs about the Ascend NPU?

Thanks for your reply. The Ascend NPU is an AI processor that supports AI frameworks such as PyTorch and TensorFlow, so I think it is possible to run DeepSpeed on an NPU to train foundation models. Their website: https://www.hiascend.com/en/
@tjruwase @jeffra @xiaoxiawu-microsoft Sorry to bother you, could you merge this PR?
@CurryRice233, apologies for the delay in merging this. I had set it up for auto-merge, but now it seems formatting is broken.
Head branch was pushed to by a user without write access
No problem, it will be fixed in a new commit.
* add Ascend NPU accelerator support
* clean code

---------

Co-authored-by: jializheng <jializheng@huawei.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
NPU accelerator support was introduced in (deepspeedai#3595) but is not yet usable. This commit mainly does two things: 1. Add a new accelerator_name, 'npu', which can be selected via an environment variable or auto-detected. 2. Simplify the auto-detection code in get_accelerator to avoid deeply nested exception handling.
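The selection order described above can be sketched roughly as follows. This is a minimal sketch with hypothetical function and probe names, not DeepSpeed's actual `get_accelerator` implementation: an explicit environment-variable override wins, otherwise each backend's import is probed in one flat loop instead of nested try/except layers.

```python
import importlib
import os


def detect_accelerator_name():
    """Pick an accelerator name: explicit override, else auto-detect.

    Hypothetical sketch; the real DeepSpeed logic differs in detail.
    """
    # Explicit override via environment variable takes precedence.
    name = os.environ.get("DS_ACCELERATOR")
    if name is not None:
        return name

    # Probe each backend's module in a flat loop, returning the first
    # one that imports cleanly, instead of chaining nested exceptions.
    probes = {
        "npu": "torch_npu",   # Ascend NPU plugin for PyTorch
        "cuda": "torch.cuda",
    }
    for accel, module in probes.items():
        try:
            importlib.import_module(module)
            return accel
        except ImportError:
            continue

    # Fall back to CPU when no device backend is importable.
    return "cpu"
```

With this shape, adding a new backend is a one-line dictionary entry rather than another layer of exception handling.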
* Make Ascend NPU available

  NPU accelerator support was introduced in (#3595). This commit provides two enhancements:
  1. Add a new accelerator_name, 'npu', which can be selected via an environment variable or auto-detected.
  2. Simplify the auto-detection code in get_accelerator to avoid deeply nested exception handling.

* Use DS_ACCELERATOR_LIST for overriding accelerators

  When an override accelerator is detected, an error message lists all supported accelerators; use an accelerator list in this message instead of hard-coding accelerator names. Also fix a code-format issue (yapf).

* Add HCCL backend

  HCCL is the distributed backend of the Ascend NPU, already implemented in the NPU plugin for PyTorch (https://gitee.com/ascend/pytorch). Add HCCL as a not-implemented backend to avoid "not supported" warnings.

* Add NPUNotImplementedBuilder

  The Ascend NPU does not implement any ops yet; leaving the npu folder empty raises NoneType[op_name] when an unsupported op is called. Add NPUNotImplementedBuilder as the default builder.

* Optimize builder search logic

  1. cpu and other backends implement their ops in subdirectories under op_builder; cuda_accelerator should skip these subdirectories.
  2. Each backend will have its own NotImplementedBuilder; add a device prefix to the class name to distinguish them.

* Change the unimplemented builder name to be the same for each backend
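The default-builder idea above can be sketched as follows. This is a hypothetical sketch with assumed class and method names, not DeepSpeed's actual op-builder API: instead of the builder lookup returning None (and the caller crashing with an opaque "'NoneType' object is not subscriptable"), every unimplemented op gets a builder that fails loudly with the op's name.

```python
class NPUNotImplementedBuilder:
    """Default builder for any op the NPU backend has not implemented yet.

    Hypothetical sketch: returning this instead of None turns an opaque
    NoneType error into a descriptive one naming the missing op.
    """

    def __init__(self, op_name):
        self.op_name = op_name

    def is_compatible(self):
        # The op can never be built on this backend.
        return False

    def load(self):
        # Raise a clear, op-specific error at the point of use.
        raise NotImplementedError(
            f"Op '{self.op_name}' is not implemented on Ascend NPU yet.")
```

Giving each backend its own prefixed class (NPUNotImplementedBuilder, CPUNotImplementedBuilder, ...) keeps the error message device-specific while the class name stays uniform across backends.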
Add Ascend NPU accelerator support