Add Ascend NPU accelerator support#3595
Conversation
@microsoft-github-policy-service agree
@CurryRice233, thanks for this PR. It looks very exciting. Do you mind sharing any background or docs about the Ascend NPU?

Thanks for your reply. The Ascend NPU is an AI processor that supports AI frameworks such as PyTorch and TensorFlow, so I think it is possible to run DeepSpeed on an NPU to train foundation models. Their website: https://www.hiascend.com/en/
@tjruwase @jeffra @xiaoxiawu-microsoft Sorry to bother you, could you merge this PR?
@CurryRice233, apologies for the delay in merging this. I had set it up for auto-merge, but now it seems formatting is broken.
Head branch was pushed to by a user without write access
No problem, it will be fixed in a new commit.
* add Ascend NPU accelerator support
* clean code

---------

Co-authored-by: jializheng <jializheng@huawei.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
NPU accelerator support was introduced in (deepspeedai#3595) but is not yet usable. This commit mainly does two things: 1. Add a new accelerator_name, 'npu', which can be selected via an environment variable or auto-detected. 2. Simplify the auto-detection code in get_accelerator to avoid deeply nested exception handling.
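The selection order described above can be sketched roughly as follows. This is a minimal sketch with hypothetical function and probe names, not DeepSpeed's actual `get_accelerator` implementation: an explicit environment-variable override wins, otherwise each backend's import is probed in one flat loop instead of nested try/except layers.

```python
import importlib
import os


def detect_accelerator_name():
    """Pick an accelerator name: explicit override, else auto-detect.

    Hypothetical sketch; the real DeepSpeed logic differs in detail.
    """
    # Explicit override via environment variable takes precedence.
    name = os.environ.get("DS_ACCELERATOR")
    if name is not None:
        return name

    # Probe each backend's module in a flat loop, returning the first
    # one that imports cleanly, instead of chaining nested exceptions.
    probes = {
        "npu": "torch_npu",   # Ascend NPU plugin for PyTorch
        "cuda": "torch.cuda",
    }
    for accel, module in probes.items():
        try:
            importlib.import_module(module)
            return accel
        except ImportError:
            continue

    # Fall back to CPU when no device backend is importable.
    return "cpu"
```

With this shape, adding a new backend is a one-line dictionary entry rather than another layer of exception handling.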
* Make Ascend NPU available

  NPU accelerator support was introduced in (#3595). This commit provides two enhancements:
  1. Add a new accelerator_name, 'npu', which can be selected via an environment variable or auto-detected.
  2. Simplify the auto-detection code in get_accelerator to avoid deeply nested exception handling.

* Use DS_ACCELERATOR_LIST for overriding accelerators

  When an override accelerator is detected, an error message lists all supported accelerators; use an accelerator list in this message instead of hard-coding accelerator names. Also fix a code-format issue (yapf).

* Add HCCL backend

  HCCL is the distributed backend of the Ascend NPU, already implemented in the NPU plugin for PyTorch (https://gitee.com/ascend/pytorch). Add HCCL as a not-implemented backend to avoid "not supported" warnings.

* Add NPUNotImplementedBuilder

  The Ascend NPU does not implement any ops yet; leaving the npu folder empty raises NoneType[op_name] when an unsupported op is called. Add NPUNotImplementedBuilder as the default builder.

* Optimize builder search logic

  1. cpu and other backends implement their ops in subdirectories under op_builder; cuda_accelerator should skip these subdirectories.
  2. Each backend will have its own NotImplementedBuilder; add a device prefix to the class name to distinguish them.

* Change the unimplemented builder name to be the same for each backend
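The default-builder idea above can be sketched as follows. This is a hypothetical sketch with assumed class and method names, not DeepSpeed's actual op-builder API: instead of the builder lookup returning None (and the caller crashing with an opaque "'NoneType' object is not subscriptable"), every unimplemented op gets a builder that fails loudly with the op's name.

```python
class NPUNotImplementedBuilder:
    """Default builder for any op the NPU backend has not implemented yet.

    Hypothetical sketch: returning this instead of None turns an opaque
    NoneType error into a descriptive one naming the missing op.
    """

    def __init__(self, op_name):
        self.op_name = op_name

    def is_compatible(self):
        # The op can never be built on this backend.
        return False

    def load(self):
        # Raise a clear, op-specific error at the point of use.
        raise NotImplementedError(
            f"Op '{self.op_name}' is not implemented on Ascend NPU yet.")
```

Giving each backend its own prefixed class (NPUNotImplementedBuilder, CPUNotImplementedBuilder, ...) keeps the error message device-specific while the class name stays uniform across backends.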
Add Ascend NPU accelerator support