Add NPU FusedAdam support #4343

Merged

tjruwase merged 25 commits into deepspeedai:master from CurryRice233:master on Oct 18, 2023

Conversation

@CurryRice233 (Contributor)

Add NPU FusedAdam support.
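For context, FusedAdam performs the element-wise Adam update for a whole parameter tensor (or group) in a single kernel launch instead of many small ops. A minimal, unfused Python sketch of the update math that such a kernel implements (illustrative only; the actual NPU kernel and its bindings in `op_builder/npu/fused_adam.py` are not shown here):

```python
import math

def adam_step(param, grad, exp_avg, exp_avg_sq, step,
              lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.

    A fused kernel runs this same math element-wise over an entire
    tensor in one launch, avoiding per-op dispatch overhead.
    """
    # Update biased first and second moment estimates.
    exp_avg = beta1 * exp_avg + (1 - beta1) * grad
    exp_avg_sq = beta2 * exp_avg_sq + (1 - beta2) * grad * grad
    # Bias correction.
    m_hat = exp_avg / (1 - beta1 ** step)
    v_hat = exp_avg_sq / (1 - beta2 ** step)
    # Parameter update.
    param -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, exp_avg, exp_avg_sq
```

With `grad = 1.0` and fresh state at `step = 1`, the bias-corrected update is approximately `lr * sign(grad)`, i.e. the parameter moves by about `lr` against the gradient.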

* origin/master: (48 commits)
  Fix autotune to support Triton 2.1  (deepspeedai#4340)
  Fix skipped inference tests (deepspeedai#4336)
  Suppress noise (deepspeedai#4310)
  Fix a bug in the implementation of dequantization for inference (deepspeedai#3433)
  DS-Chat BLOOM: Fix Attention mask (deepspeedai#4338)
  clear redundant timers (deepspeedai#4308)
  Add release version checking (deepspeedai#4328)
  Fix Zero3 contiguous grads, reduce scatter false  accuracy issue (deepspeedai#4321)
  Clean up modeling code (deepspeedai#4320)
  Handle empty parameter groups (deepspeedai#4277)
  Update README.md (deepspeedai#4316)
  README update (deepspeedai#4303)
  Update release and bump patch versioning flow (deepspeedai#4286)
  added a bert-model check for triton (deepspeedai#4266)
  ZeRO-Inference v2 release
  bump to 0.10.4
  Update index.md (deepspeedai#4297)
  fix user args parsing of string with spaces on runner (deepspeedai#4265)
  ZeRO-Inference refresh (deepspeedai#4197)
  AMD Kernel Compatibility Fixes (deepspeedai#3180)
  ...
@CurryRice233 (Contributor, Author)

@tjruwase @jeffra @RezaYazdaniAminabadi @cmikeh2 Sorry to bother you, could you review this PR?

CurryRice233 and others added 6 commits September 19, 2023 14:11
* origin/master:
  Allow multiple inference engines in single script (deepspeedai#4384)
  adds triton flash attention2 kernel (deepspeedai#4337)
  Fix llama meta tensor loading in AutoTP and kernel injected inference (deepspeedai#3608)
  Fix min torch version (deepspeedai#4375)
  Fix multinode runner to properly append to PDSH_SSH_ARGS_APPEND (deepspeedai#4373)
  add the missing method (deepspeedai#4363)
  Openfold fix (deepspeedai#4368)
  deepspeed4science japanese blog (deepspeedai#4369)
  deepspeed4science chinese blog (deepspeedai#4366)
  Enable workflow dispatch on Torch 1.10 CI tests (deepspeedai#4361)
  Update conda env to have max pydantic version (deepspeedai#4362)
  add deepspeed4science blog link (deepspeedai#4364)
  added check to avoid undefined behavior when the input_id length is greater than max_tokens (deepspeedai#4349)
  Add the policy to run llama model from the official repo (deepspeedai#4313)
  fix deepspeed4science links (deepspeedai#4358)
  DeepSpeed4Science (deepspeedai#4357)
  Support InternLM (deepspeedai#4137)
  Pass base_dir to model files can be loaded for auto-tp/meta-tensor. (deepspeedai#4348)
@ji-huazhong (Contributor)

@tjruwase Good day. This PR is approved and ready to be merged. Could you retrigger this workflow and merge it? Thanks :-)

@tjruwase (Contributor)

> @tjruwase Good day. This PR is approved and ready to be merged. Could you retrigger this workflow and merge it? Thanks :-)

Sorry for the delay; however, there seems to be a formatting issue. Please take a look.

@ji-huazhong (Contributor) left a comment:
Resolve format checking errors

CurryRice233 and others added 9 commits October 14, 2023 14:41
Co-authored-by: Hz, Ji <hzji210@gmail.com>
@ji-huazhong (Contributor) left a comment:
Once these two blank lines are fixed, the format check error should be resolved.

@tjruwase (Contributor)

CurryRice233 and others added 3 commits October 16, 2023 09:02
Co-authored-by: Hz, Ji <hzji210@gmail.com>
@CurryRice233 (Contributor, Author)

> @CurryRice233, it is best to use this guide for formatting issues: https://github.com/microsoft/DeepSpeed/blob/master/CONTRIBUTING.md#prerequisites

Thank you, new skill acquired 😉. By the way, could you retrigger this workflow again?

@CurryRice233 (Contributor, Author)

@tjruwase Hi, could you retrigger this workflow again and merge it? Thanks 😀

@tjruwase tjruwase enabled auto-merge October 18, 2023 02:34
@tjruwase tjruwase added this pull request to the merge queue Oct 18, 2023
Merged via the queue into deepspeedai:master with commit 3e70a88 Oct 18, 2023
baodii pushed a commit to baodii/DeepSpeed that referenced this pull request Nov 7, 2023
* add npu support dtypes

* add npu fused_adam support

* add license

* Apply review suggestions to accelerator/npu_accelerator.py and op_builder/npu/fused_adam.py

Co-authored-by: Hz, Ji <hzji210@gmail.com>

---------

Co-authored-by: jializheng <jializheng@huawei.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Hz, Ji <hzji210@gmail.com>
mauryaavinash95 pushed a commit to mauryaavinash95/DeepSpeed that referenced this pull request Feb 17, 2024
3 participants