Add Ascend NPU Support #1521

xuedinge233 · 2025-04-02T09:53:36Z

This PR introduces native support for Huawei Ascend NPU acceleration in generative AI tasks and chat applications. The implementation leverages Ascend's CANN computing architecture to optimize tensor operations for large language models.

Compatibility Matrix:
Python 3.10.15
torch 2.7.0+cpu
torch_npu 2.7.0+git36b5d8c (nightly)
torchvision 0.22.0
torchtune 0.6.0

The following is the result generated using Llama3.1.

pytorch-bot · 2025-04-02T09:53:40Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1521

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-04-02T09:53:42Z

Hi @xuedinge233!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

facebook-github-bot · 2025-04-03T02:10:27Z

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

hipudding · 2025-04-03T08:00:52Z

torchchat/generate.py

@@ -1595,6 +1599,12 @@ def sample(

        return idx_next, probs

+def is_npu_available():


Out-of-tree device backends are autoloaded, so there is no need to import torch_npu.
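To illustrate the autoload point, here is a minimal sketch that lists the backend plugins an environment advertises. It assumes that out-of-tree backends register themselves under the `torch.backends` entry-point group, which is my understanding of the PyTorch autoload mechanism and not something stated in this thread. The helper name `list_autoload_backends` is hypothetical.

```python
from importlib.metadata import entry_points
from typing import List


def list_autoload_backends(group: str = "torch.backends") -> List[str]:
    """Return the names of installed entry points registered under `group`.

    Assumption: PyTorch discovers out-of-tree device backends through this
    entry-point group when `import torch` runs.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points support selection by group.
        matches = eps.select(group=group)
    else:
        # Python 3.8/3.9: a plain dict keyed by group name.
        matches = eps.get(group, [])
    return sorted(ep.name for ep in matches)


if __name__ == "__main__":
    # An empty list means no out-of-tree backend is installed here.
    print(list_autoload_backends())
```

If torch_npu shows up in that list, `import torch` alone should be enough to make `torch.npu` usable, which is why the explicit import can be dropped.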

hipudding · 2025-04-03T08:00:59Z

torchchat/utils/build_utils.py

@@ -275,33 +277,51 @@ def is_mps_available() -> bool:
    # MPS, is that you?
    return True

+def is_npu_available(check_device=False):


Out-of-tree device backends are autoloaded, so there is no need to import torch_npu.

hipudding · 2025-04-09T02:09:20Z

torchchat/utils/build_utils.py

+        return "cuda"
+    elif is_mps_available():
+        return "mps"
+    elif hasattr(torch, "npu") and torch.npu.is_available():


Add NPU after xpu.

hipudding · 2025-04-09T02:09:52Z

torchchat/generate.py

@@ -1213,6 +1213,8 @@ def callback(x, *, done_generating=False):
                    print(prof.key_averages().table(sort_by="self_cpu_time_total"))
                elif self.builder_args.device == "cuda":
                    print(prof.key_averages().table(sort_by="self_cuda_time_total"))
+                elif self.builder_args.device == "npu":
+                    print(prof.key_averages().table(sort_by="self_npu_time_total"))


Where is self_npu_time_total defined?

xuedinge233 · 2025-04-11T01:27:33Z

@Jack-Khuu @mikekgfb Hello, can you help review this PR?

Jack-Khuu

Thanks for the addition!

Looking good, just a question about the PyTorch install source.

Jack-Khuu · 2025-04-11T18:01:30Z

install/install_requirements.sh

@@ -71,6 +71,9 @@ then
 elif [[ -x "$(command -v xpu-smi)" ]];
 then
  TORCH_NIGHTLY_URL="https://download.pytorch.org/whl/nightly/xpu"
+elif [[ -x "$(command -v npu-smi)" ]]
+then
+  TORCH_NIGHTLY_URL="https://download.pytorch.org/whl/test/cpu"


Why are we using a test wheel?

Hey @Jack-Khuu, thanks for your review. IMO, we should use nightly PyTorch wheels like the other backends. What do you think? @hipudding @xuedinge233

However, we usually use PyTorch RC versions, so I'm not sure everything works on nightly.

shink · 2025-04-14T01:49:47Z

torchchat/cli/builder.py

                self.device = "cuda"
            elif torch.xpu.is_available():
                self.device = "xpu"
+            elif hasattr(torch, "npu") and torch.npu.is_available():
+                self.device = "npu"
            else:
                self.device = "cpu"


How about torch.accelerator.current_accelerator(check_available=True)? We've done a lot of work on device-generic APIs. Hope they get used.

See: https://dev-discuss.pytorch.org/t/python-c-api-rules-for-device-generic-apis/2511

Got it, I will make the changes.
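A minimal sketch of the suggested device-generic selection, for illustration only and not the code that landed in the PR. It assumes a PyTorch build whose `torch.accelerator.current_accelerator` accepts `check_available` (the thread targets torch 2.7.0). The helper name `pick_device` is hypothetical, and the `torch` module is passed in as an argument so the fallback path can be exercised with a stand-in object on a machine without an accelerator.

```python
from types import SimpleNamespace


def pick_device(torch_mod) -> str:
    """Return a device type string, preferring the device-generic API.

    `torch_mod` is the imported `torch` module, or a stand-in with the
    same attributes.
    """
    accelerator = getattr(torch_mod, "accelerator", None)
    if accelerator is not None:
        # Returns a torch.device for an available accelerator, or None.
        dev = accelerator.current_accelerator(check_available=True)
        return dev.type if dev is not None else "cpu"

    # Fallback for builds without torch.accelerator: probe each backend in
    # turn, mirroring the if/elif chain in the diff above.
    for name in ("cuda", "xpu", "npu"):
        backend = getattr(torch_mod, name, None)
        if backend is not None and backend.is_available():
            return name
    return "cpu"


if __name__ == "__main__":
    # Stand-in for an older build where only the NPU backend is present.
    fake_torch = SimpleNamespace(npu=SimpleNamespace(is_available=lambda: True))
    print(pick_device(fake_torch))
```

In real use this would be called as `pick_device(torch)` after `import torch`. The generic branch needs no per-vendor `hasattr` checks, which is the point of the suggestion.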

facebook-github-bot added the CLA Signed label Apr 3, 2025

xuedinge233 changed the title ~~Add Ascend NPU device support for generate~~ Add Huawei Ascend NPU Support for Generative & Chat Workloads Apr 3, 2025

hipudding suggested changes Apr 3, 2025

xuedinge233 changed the title ~~Add Huawei Ascend NPU Support for Generative & Chat Workloads~~ Add Ascend NPU Support Apr 3, 2025

hipudding reviewed Apr 7, 2025

xuedinge233 force-pushed the main branch from 1ce58f2 to 660494a Compare April 8, 2025 09:05

Add Ascend NPU support for generate and chat

9645ec0

xuedinge233 force-pushed the main branch from 660494a to 9645ec0 Compare April 8, 2025 09:46

xuedinge233 marked this pull request as ready for review April 8, 2025 09:48

hipudding reviewed Apr 9, 2025

hipudding approved these changes Apr 9, 2025

xuedinge233 added 2 commits April 9, 2025 11:55

update

872a9d2

Merge branch 'main' into main

7c6f116

Jack-Khuu reviewed Apr 11, 2025

shink reviewed Apr 14, 2025

xuedinge233 added 3 commits April 14, 2025 11:25

Use torch.accelerator for device selection

468b006

Merge branch 'pytorch:main' into main

403cfc1

Merge branch 'main' of https://github.com/xuedinge233/torchchat

088ed41
