Add NPU FusedAdam support #4343
Conversation
* origin/master: (48 commits)
  Fix autotune to support Triton 2.1 (deepspeedai#4340)
  Fix skipped inference tests (deepspeedai#4336)
  Suppress noise (deepspeedai#4310)
  Fix a bug in the implementation of dequantization for inference (deepspeedai#3433)
  DS-Chat BLOOM: Fix Attention mask (deepspeedai#4338)
  clear redundant timers (deepspeedai#4308)
  Add release version checking (deepspeedai#4328)
  Fix Zero3 contiguous grads, reduce scatter false accuracy issue (deepspeedai#4321)
  Clean up modeling code (deepspeedai#4320)
  Handle empty parameter groups (deepspeedai#4277)
  Update README.md (deepspeedai#4316)
  README update (deepspeedai#4303)
  Update release and bump patch versioning flow (deepspeedai#4286)
  added a bert-model check for triton (deepspeedai#4266)
  ZeRO-Inference v2 release
  bump to 0.10.4
  Update index.md (deepspeedai#4297)
  fix user args parsing of string with spaces on runner (deepspeedai#4265)
  ZeRO-Inference refresh (deepspeedai#4197)
  AMD Kernel Compatibility Fixes (deepspeedai#3180)
  ...
@tjruwase @jeffra @RezaYazdaniAminabadi @cmikeh2 Sorry to bother you, but could you review this PR?
* origin/master:
  Allow multiple inference engines in single script (deepspeedai#4384)
  adds triton flash attention2 kernel (deepspeedai#4337)
  Fix llama meta tensor loading in AutoTP and kernel injected inference (deepspeedai#3608)
  Fix min torch version (deepspeedai#4375)
  Fix multinode runner to properly append to PDSH_SSH_ARGS_APPEND (deepspeedai#4373)
  add the missing method (deepspeedai#4363)
  Openfold fix (deepspeedai#4368)
  deepspeed4science japanese blog (deepspeedai#4369)
  deepspeed4science chinese blog (deepspeedai#4366)
  Enable workflow dispatch on Torch 1.10 CI tests (deepspeedai#4361)
  Update conda env to have max pydantic version (deepspeedai#4362)
  add deepspeed4science blog link (deepspeedai#4364)
  added check to avoid undefined behavior when the input_id length is greater than max_tokens (deepspeedai#4349)
  Add the policy to run llama model from the official repo (deepspeedai#4313)
  fix deepspeed4science links (deepspeedai#4358)
  DeepSpeed4Science (deepspeedai#4357)
  Support InternLM (deepspeedai#4137)
  Pass base_dir to model files can be loaded for auto-tp/meta-tensor. (deepspeedai#4348)
@tjruwase Good day. This PR is approved and ready to be merged. Could you retrigger this workflow and merge it? Thanks :-)
Sorry for the delay. However, there seems to be a formatting issue; please take a look.
ji-huazhong left a comment:
Resolve format checking errors
Co-authored-by: Hz, Ji <hzji210@gmail.com>
ji-huazhong left a comment:
Fixing these two blank lines should resolve the format-check error.
@CurryRice233, it is best to use this guide for formatting issues: https://github.com/microsoft/DeepSpeed/blob/master/CONTRIBUTING.md#prerequisites
Co-authored-by: Hz, Ji <hzji210@gmail.com>
Thank you, new skill acquired 😉. By the way, could you retrigger this workflow again?
@tjruwase Hi, could you retrigger this workflow again and merge it? Thanks 😀
* add npu support dtypes
* add npu fused_adam support
* add license
* Update accelerator/npu_accelerator.py (Co-authored-by: Hz, Ji <hzji210@gmail.com>)
* Update op_builder/npu/fused_adam.py (seven review-suggestion commits, each Co-authored-by: Hz, Ji <hzji210@gmail.com>)
* Update accelerator/npu_accelerator.py (two review-suggestion commits, Co-authored-by: Hz, Ji <hzji210@gmail.com>)

---------

Co-authored-by: jializheng <jializheng@huawei.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Hz, Ji <hzji210@gmail.com>
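For readers following along: the change routes DeepSpeed's FusedAdam through an NPU-specific op builder that loads a Python shim instead of JIT-compiling a CUDA extension. Below is a minimal sketch of that pattern, not the PR's exact code; the class and method names follow DeepSpeed's builder conventions, the multi_tensor_adam argument layout mirrors the CUDA op, and the plain-PyTorch update stands in for the fused torch_npu kernel the real shim would dispatch to.

```python
# Hedged sketch of an NPU fused-Adam shim; names and argument layout are
# assumptions modeled on DeepSpeed's builder conventions, not the PR's code.


class NPUFusedAdam:
    """Python stand-in for the compiled multi_tensor_adam extension."""

    @staticmethod
    def multi_tensor_adam(chunk_size, noop_flag, tensor_lists, lr, beta1, beta2,
                          eps, step, adam_w_mode, bias_correction, weight_decay):
        # tensor_lists = [grads, params, exp_avgs, exp_avg_sqs], matching the
        # argument order of the CUDA multi_tensor_adam op.
        grads, params, exp_avgs, exp_avg_sqs = tensor_lists
        bc1 = 1 - beta1**step if bias_correction else 1.0
        bc2 = 1 - beta2**step if bias_correction else 1.0
        for g, p, m, v in zip(grads, params, exp_avgs, exp_avg_sqs):
            # Plain-PyTorch Adam/AdamW math; the real shim would call a fused
            # torch_npu kernel here instead of elementwise tensor ops.
            if adam_w_mode:
                p.data.mul_(1 - lr * weight_decay)      # decoupled weight decay
            elif weight_decay != 0:
                g = g.add(p.data, alpha=weight_decay)   # classic L2 regularization
            m.mul_(beta1).add_(g, alpha=1 - beta1)          # first moment
            v.mul_(beta2).addcmul_(g, g, value=1 - beta2)   # second moment
            denom = (v / bc2).sqrt_().add_(eps)
            p.data.addcdiv_(m / bc1, denom, value=-lr)      # parameter update


class FusedAdamBuilder:
    """Minimal builder: nothing to compile, load() hands back the shim."""
    BUILD_VAR = "DS_BUILD_FUSED_ADAM"
    NAME = "fused_adam"

    def sources(self):
        return []  # no C++/CUDA sources to build on NPU

    def load(self):
        return NPUFusedAdam
```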
Add NPU FusedAdam support.
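Once merged, the op should be reachable through DeepSpeed's accelerator abstraction. A hedged usage sketch follows; the 'FusedAdamBuilder' name string and the call layout are assumptions, and on a CUDA machine the same load() would JIT-compile the CUDA extension instead.

```python
import torch

from deepspeed.accelerator import get_accelerator

# Ask the active accelerator (CUDA, NPU, ...) for its fused-Adam builder and
# load the op; on NPU this PR makes that a Python shim rather than a compiled
# CUDA extension.
fused_adam = get_accelerator().create_op_builder('FusedAdamBuilder').load()

p = torch.randn(1024)   # parameter
g = torch.randn(1024)   # gradient
m = torch.zeros(1024)   # exp_avg state
v = torch.zeros(1024)   # exp_avg_sq state

# chunk_size and noop_flag are pass-throughs for the shim; the tensor-list
# order is [grads, params, exp_avgs, exp_avg_sqs] as assumed in the sketch.
fused_adam.multi_tensor_adam(2048, None, [[g], [p], [m], [v]],
                             1e-3, 0.9, 0.999, 1e-8, 1, True, True, 0.01)
```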