
Qualcomm AI Engine Direct - Support QNN 2.28 #6811

Merged

Conversation

shewu-quic
Collaborator

Note that once this is merged, QNN versions earlier than 2.28 will no longer be supported.


pytorch-bot bot commented Nov 13, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6811

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 259cbc9 with merge base 7010a11:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 13, 2024
@shewu-quic shewu-quic force-pushed the dev1/hutton/enable_qnn_2_28 branch from 6ae5b16 to dd1836e Compare November 13, 2024 09:59
@cccclai
Contributor

cccclai commented Nov 13, 2024

What feature is used from 2.28?

@shewu-quic
Collaborator Author

What feature is used from 2.28?

Based on the QNN 2.28 release notes, they seem to have improved latency for large models, so we should get better performance.
BTW, I found an optimization that has not been updated yet. I will send another PR today so that you can reproduce the current latency with QNN 2.28.

@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai
Contributor

cccclai commented Nov 14, 2024

Can you also update this line https://github.com/pytorch/executorch/blob/07c4d0e8624fbeb3b6a387cb9c6ba193512c5de5/shim/xplat/executorch/backends/qualcomm/qnn_version.bzl

@shewu-quic
Collaborator Author

Can you also update this line https://github.com/pytorch/executorch/blob/07c4d0e8624fbeb3b6a387cb9c6ba193512c5de5/shim/xplat/executorch/backends/qualcomm/qnn_version.bzl

Fixed
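For reference, the change to `qnn_version.bzl` is essentially a one-line version pin. A minimal sketch of what the bump might look like (the constant and function names here are assumptions based on the file's purpose, not the actual file contents):

```python
# Hypothetical sketch of shim/xplat/executorch/backends/qualcomm/qnn_version.bzl
# after the bump. The names QNN_VERSION and get_qnn_version are illustrative.
QNN_VERSION = "2.28"

def get_qnn_version():
    # Returns the pinned QNN SDK version string used by the build rules.
    return QNN_VERSION
```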

@shewu-quic shewu-quic force-pushed the dev1/hutton/enable_qnn_2_28 branch from f2f9c84 to e7eba31 Compare November 19, 2024 04:39
@cccclai cccclai added the release notes: backends [DO NOT USE] Changes to any of the backend delegates label Nov 20, 2024
@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai
Contributor

cccclai commented Nov 22, 2024

Do we know if static llama and llama transformer will still work with this version bump? Also we need to figure out a plan to do version bump without bc breaking...

@shewu-quic
Collaborator Author

Do we know if static llama and llama transformer will still work with this version bump? Also we need to figure out a plan to do version bump without bc breaking...

All of our static llama tests run on this version.
It should work for the llama transformer based on your CI result:
https://github.com/pytorch/executorch/actions/runs/11906365001/job/33178331953?pr=6811

Regarding BC breaking, we could add macros around the code that is only supported on 2.28 and later, so that users can still build libqnnexecutorch against older QNN versions. Is that acceptable?

@cccclai
Contributor

cccclai commented Nov 23, 2024

Do we know if static llama and llama transformer will still work with this version bump? Also we need to figure out a plan to do version bump without bc breaking...

All of our static llama tests run on this version. It should work for the llama transformer based on your CI result: https://github.com/pytorch/executorch/actions/runs/11906365001/job/33178331953?pr=6811

Regarding BC breaking, we could add macros around the code that is only supported on 2.28 and later, so that users can still build libqnnexecutorch against older QNN versions. Is that acceptable?

Yeah that sounds good. BC is something we can figure out later. Just hope it can be part of the plan.

@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai
Contributor

cccclai commented Dec 3, 2024

Hi, sorry for being a bit back and forth here. The context is that pre-generated .pte files will no longer be compatible. If it isn't too much work, how likely is it that we can make this PR BC compatible? If it is too much work, we can try to merge this as-is.

@shewu-quic
Collaborator Author

Hi, sorry for being a bit back and forth here. The context is that pre-generated .pte files will no longer be compatible. If it isn't too much work, how likely is it that we can make this PR BC compatible? If it is too much work, we can try to merge this as-is.

Thank you for your response. I apologize for any confusion. What I meant is that if we merge this PR, libqnnexecutorch will not be able to build with QNN versions earlier than 2.28. However, the pre-generated .pte files should still be supported. The build issue arises in cases like this: QNN_SYSTEM_CONTEXT_BINARY_INFO_VERSION_3 only exists in headers from version 2.28 onwards. If you try to build with version 2.26, you'll encounter an error stating that QNN_SYSTEM_CONTEXT_BINARY_INFO_VERSION_3 cannot be found.

@shewu-quic
Collaborator Author

If possible, please go ahead and merge this PR. We will discuss internally whether to add some macros around the code that only supports version 2.28 and later, to ensure that users can build libqnnexecutorch on older QNN versions.

@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@winskuo-quic winskuo-quic force-pushed the dev1/hutton/enable_qnn_2_28 branch from 6096504 to 259cbc9 Compare January 6, 2025 05:41
@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@shewu-quic
Collaborator Author

shewu-quic commented Jan 13, 2025

Hi @cccclai,
Thanks for the review.
I think @winskuo-quic has added macros to ensure that users can build libqnnexecutorch on older QNN versions.
If possible, could you help merge this PR?

@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Contributor

@cccclai cccclai left a comment


Thanks! Informing the internal team and merging now.

@facebook-github-bot facebook-github-bot merged commit 8cd3afd into pytorch:main Jan 13, 2025
138 checks passed
YIWENX14 pushed a commit that referenced this pull request Jan 28, 2025
Differential Revision: D65949627

Pull Request resolved: #6811