Add Ascend NPU accelerator support #3595

Merged
tjruwase merged 7 commits into deepspeedai:master from CurryRice233:master
May 31, 2023

@CurryRice233
Contributor

Add Ascend NPU accelerator support

@CurryRice233
Contributor Author

@microsoft-github-policy-service agree

@tjruwase
Contributor

@CurryRice233, thanks for this PR. It looks very exciting. Do you mind sharing any background or docs about the Ascend NPU?

@CurryRice233
Contributor Author

> @CurryRice233, thanks for this PR. It looks very exciting. Do you mind sharing any background or docs about the Ascend NPU?

Thanks for your reply. Ascend NPU is an AI processor that supports AI frameworks such as PyTorch and TensorFlow, so I think it is possible to run DeepSpeed on NPU to train foundation models. Their website: https://www.hiascend.com/en/

@CurryRice233
Contributor Author

@tjruwase @jeffra @xiaoxiawu-microsoft Sorry to bother you, could you merge this PR?

tjruwase enabled auto-merge (squash) May 30, 2023 19:08

@tjruwase
Contributor

@CurryRice233, apologies for the delay in merging this. I have set this up for auto-merge, but it now seems that formatting is broken.

auto-merge was automatically disabled May 31, 2023 07:29

Head branch was pushed to by a user without write access

@CurryRice233
Contributor Author

> @CurryRice233, apologies for the delay in merging this. I have set this up for auto-merge, but it now seems that formatting is broken.

No problem, it will be fixed with a new commit.

tjruwase enabled auto-merge (squash) May 31, 2023 13:22
tjruwase merged commit f3c8eac into deepspeedai:master May 31, 2023
molly-smith pushed a commit that referenced this pull request Jun 23, 2023
* add Ascend NPU accelerator support

* clean code

---------

Co-authored-by: jializheng <jializheng@huawei.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
hipudding added a commit to hipudding/DeepSpeed that referenced this pull request Jun 28, 2023
NPU accelerator support was introduced in (deepspeedai#3595) but is not
yet available. This commit mainly does two things:
  1. Add a new accelerator_name 'npu' that can be specified by
environment variable or auto-detected.
  2. Optimize the auto-detect code in get_accelerator to avoid too many
layers of exception throwing.
hipudding added a commit to hipudding/DeepSpeed that referenced this pull request Jun 28, 2023
NPU accelerator support was introduced in (deepspeedai#3595).
This commit provides two enhancements:
  1. Add a new accelerator_name 'npu' that can be specified by
environment variable or auto-detected.
  2. Optimize the auto-detect code in get_accelerator to avoid too many
layers of exception throwing.
github-merge-queue bot pushed a commit that referenced this pull request Jul 22, 2023
* Make Ascend NPU available

NPU accelerator support was introduced in (#3595).
This commit provides two enhancements:
  1. Add a new accelerator_name 'npu' that can be specified by
environment variable or auto-detected.
  2. Optimize the auto-detect code in get_accelerator to avoid too many
layers of exception throwing.
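
The selection order described here (explicit environment override first, then auto-detection without nested exception handlers) might look roughly like the sketch below. It is an illustration only, not the actual `get_accelerator` code: the function `pick_accelerator`, the probe table, the module names, and the use of `DS_ACCELERATOR` as the variable name are all assumptions made for the example.

```python
import importlib.util
import os

# Hypothetical probe table: accelerator name -> Python module whose presence
# suggests that accelerator. The module names are assumptions for this sketch.
PROBES = [("npu", "torch_npu"), ("xpu", "intel_extension_for_pytorch")]
DEFAULT_ACCELERATOR = "cuda"


def pick_accelerator(env=None, is_importable=None):
    """Return an accelerator name: explicit override first, then auto-detect."""
    env = os.environ if env is None else env
    if is_importable is None:
        # find_spec returns None for a missing top-level module, so several
        # candidates can be probed in a flat loop without nested try/except.
        is_importable = lambda mod: importlib.util.find_spec(mod) is not None

    override = env.get("DS_ACCELERATOR")  # assumed variable name
    if override:
        return override

    for name, module in PROBES:
        if is_importable(module):
            return name
    return DEFAULT_ACCELERATOR
```

Passing `env` and `is_importable` as parameters keeps the sketch testable on a machine with no accelerator plugin installed.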

* Use DS_ACCELERATOR_LIST for overriding accelerators

The override-accelerator detection has an error message that shows all
supported accelerators. Build it from an accelerator list instead of
hard-coding accelerator names in the message.

Also fix a code format issue (yapf).
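
As a rough sketch of the idea, the message can be derived from a single list so that adding a backend needs no change to the message text. The list contents and the helper `validate_override` below are hypothetical; the real contents of `DS_ACCELERATOR_LIST` are not shown in this thread.

```python
# Hypothetical list of accelerator names, used only for illustration.
DS_ACCELERATOR_LIST = ["cuda", "cpu", "xpu", "npu"]


def validate_override(name, supported=DS_ACCELERATOR_LIST):
    """Return `name` if supported, else raise with a message built from the list."""
    if name not in supported:
        raise ValueError(
            f"Unsupported accelerator override '{name}'. "
            f"Supported accelerators: {', '.join(supported)}"
        )
    return name
```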

* Add HCCL backend

HCCL is the distributed backend of Ascend NPU; it is already implemented
in the NPU plugin for PyTorch (https://gitee.com/ascend/pytorch). Add the
HCCL backend as a not-implemented backend to avoid the not-supported warning.
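
One way to picture the distinction between a known but not-implemented backend and an unknown one is the sketch below. The helper `check_backend` and the membership of both sets are assumptions for illustration, not the project's actual backend handling.

```python
import warnings

# Backend names are illustrative; which ones count as implemented is an
# assumption of this sketch.
IMPLEMENTED_BACKENDS = {"nccl"}
KNOWN_UNIMPLEMENTED_BACKENDS = {"mpi", "gloo", "hccl"}


def check_backend(name):
    """Classify a backend name as 'implemented', 'not_implemented', or 'unsupported'."""
    if name in IMPLEMENTED_BACKENDS:
        return "implemented"
    if name in KNOWN_UNIMPLEMENTED_BACKENDS:
        # Known backend handled elsewhere (e.g. by a framework plugin):
        # no "not supported" warning is emitted.
        return "not_implemented"
    warnings.warn(f"Backend '{name}' is not supported")
    return "unsupported"
```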

* Add NPUNotImplementedBuilder

Ascend NPU does not implement any op yet. Leaving the npu folder empty
throws NoneType[op_name] when an unsupported op is called. Add
NPUNotImplementedBuilder as the default builder.
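
The fallback-builder pattern can be sketched as follows. Only the class name comes from the commit message; the class body, the `load` method, the empty registry, and `create_op_builder` are hypothetical stand-ins.

```python
class NPUNotImplementedBuilder:
    """Fallback builder returned for ops the backend has not implemented."""

    def __init__(self, op_name):
        self.op_name = op_name

    def load(self):
        # Fail with a descriptive error instead of an opaque NoneType failure.
        raise NotImplementedError(
            f"Op '{self.op_name}' is not implemented on this accelerator"
        )


# Hypothetical registry; empty because the backend implements no ops yet.
NPU_BUILDERS = {}


def create_op_builder(op_name, registry=NPU_BUILDERS):
    """Return the registered builder, or the fallback instead of None."""
    builder_cls = registry.get(op_name)
    if builder_cls is None:
        return NPUNotImplementedBuilder(op_name)
    return builder_cls()
```

With this shape, a caller that requests an unsupported op gets an object it can inspect, and the error surfaces only when the op is actually loaded.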

* Optimize builder search logic

1. cpu and other backends implement their ops in sub dirs under
op_builder; cuda_accelerator should skip these sub dirs.
2. Each backend will have its own NotImplementedBuilder; add a device
prefix to this class to distinguish them.
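
A minimal sketch of the first point, assuming builder modules are plain `.py` files directly under the op_builder directory and other backends live in sub-directories. The helper `find_builder_modules` is hypothetical and does not reproduce the actual search code.

```python
from pathlib import Path


def find_builder_modules(op_builder_dir):
    """List module names defined directly in op_builder_dir, ignoring sub-directories."""
    names = []
    for entry in sorted(Path(op_builder_dir).iterdir()):
        if entry.is_dir():
            # Sub-directories hold other backends' ops (e.g. cpu, npu); skip them.
            continue
        if entry.suffix == ".py" and entry.stem != "__init__":
            names.append(entry.stem)
    return names
```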

* Change the unimplemented builder name to the same for each backend