[NPU]Add ZeRO-Infinity feature for NPU #4809

Merged
tjruwase merged 9 commits into deepspeedai:master from misstek:npu_nvme_infinity
Jan 5, 2024

misstek commented Dec 13, 2023

Add the ZeRO-Infinity feature for NPU devices.
I added a new `async_io.py` in `op_builder/npu` and an NPU-specific preprocessor conditional in `deepspeed_aio_thread.cpp`. The NPU path is isolated from other devices such as GPUs, so neither affects the other.
See our earlier work in #4567.
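
For readers who want to try the feature, here is a minimal sketch of a DeepSpeed configuration that turns on ZeRO-Infinity style NVMe offload, which is the path the async I/O op serves. The helper name `make_zero_infinity_config`, the batch size, the `/local_nvme` path, and the `aio` tuning values are illustrative assumptions, not settings taken from this PR; only the configuration keys come from the DeepSpeed config schema.

```python
import json


def make_zero_infinity_config(nvme_path: str) -> dict:
    """Build a DeepSpeed config dict that offloads ZeRO stage 3 state to NVMe.

    Hypothetical helper for illustration; the numeric values are placeholders
    and should be tuned for the actual device and storage.
    """
    return {
        "train_micro_batch_size_per_gpu": 1,  # placeholder batch size
        "zero_optimization": {
            "stage": 3,  # NVMe parameter offload requires ZeRO stage 3
            "offload_optimizer": {
                "device": "nvme",
                "nvme_path": nvme_path,
                "pin_memory": True,
            },
            "offload_param": {
                "device": "nvme",
                "nvme_path": nvme_path,
                "pin_memory": True,
            },
        },
        # Settings for the async I/O op that moves tensors to and from NVMe.
        "aio": {
            "block_size": 1048576,
            "queue_depth": 8,
            "thread_count": 1,
            "single_submit": False,
            "overlap_events": True,
        },
    }


if __name__ == "__main__":
    # "/local_nvme" is an assumed mount point; replace it with a real NVMe path.
    print(json.dumps(make_zero_infinity_config("/local_nvme"), indent=2))
```

The resulting dict can be written to a JSON file or passed as the `config` argument of `deepspeed.initialize`; whether the `async_io` op builds on a given machine depends on the local toolchain and libraries.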

misstek commented Dec 13, 2023

Hi @cmikeh2 @awan-10 @arashb, could you please help me launch CI and review this PR? Thank you.

tjruwase requested review from jomayeri and mrwyattii and removed request for arashb, awan-10 and cmikeh2 December 13, 2023 15:15

misstek commented Dec 14, 2023

Hi @tjruwase, thanks for your help launching CI. I have fixed the formatting failure. However, the failure of the nv-transformers-v100 unit test does not seem to be caused by the code I committed.

misstek commented Dec 19, 2023

Hi @tjruwase, could you please help me launch the unit tests again? Thank you.

misstek commented Dec 19, 2023

> @misstek, can you please fix the formatting issues? https://github.com/microsoft/DeepSpeed/blob/master/CONTRIBUTING.md#contributing

Hi @tjruwase, the formatting failure has been fixed according to CONTRIBUTING.md. Please launch the formatting check again. Sorry for the inconvenience, and thank you.

pre-commit run --files csrc/aio/py_lib/deepspeed_aio_thread.cpp op_builder/npu/__init__.py op_builder/npu/async_io.py
Check hooks apply to the repository..................(no files to check)Skipped
Check for useless excludes...........................(no files to check)Skipped
Check for case conflicts.................................................Passed
Check JSON...........................................(no files to check)Skipped
Check for broken symlinks............................(no files to check)Skipped
Check Yaml...........................................(no files to check)Skipped
Detect Destroyed Symlinks................................................Passed
Fix End of Files.........................................................Passed
fix UTF-8 byte order marker..............................................Passed
Fix python encoding pragma...............................................Passed
Mixed line ending........................................................Passed
Fix requirements.txt.................................(no files to check)Skipped
Trim Trailing Whitespace.................................................Passed
yapf.....................................................................Passed
clang-format.............................................................Passed
check-torchdist..........................................................Passed
check-license............................................................Passed
codespell................................................................Passed
flake8...................................................................Passed
check-torchcuda..........................................................Passed

misstek commented Dec 20, 2023

Hi @mrwyattii @jomayeri, since this pull request has passed all tests, could you please review it and add it to the merge queue? Thank you.

misstek commented Dec 25, 2023

Hi @tjruwase, I noticed that this PR has been approved. Is it ready to be put into the merge queue? Thank you.

tjruwase added this pull request to the merge queue Dec 25, 2023
github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 25, 2023
tjruwase added this pull request to the merge queue Jan 5, 2024
Merged via the queue into deepspeedai:master with commit b596963 Jan 5, 2024
mauryaavinash95 pushed a commit to mauryaavinash95/DeepSpeed that referenced this pull request Feb 17, 2024
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>