
Qualcomm AI Engine Direct - Support QNN 2.28 #6811

Merged

Conversation

shewu-quic
Collaborator

Note that once this is merged, QNN versions earlier than 2.28 will no longer be supported.


pytorch-bot bot commented Nov 13, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6811

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 259cbc9 with merge base 7010a11:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 13, 2024
@shewu-quic shewu-quic force-pushed the dev1/hutton/enable_qnn_2_28 branch from 6ae5b16 to dd1836e Compare November 13, 2024 09:59
@cccclai
Contributor

cccclai commented Nov 13, 2024

What feature is used from 2.28?

@shewu-quic
Collaborator Author

What feature is used from 2.28?

Based on the QNN 2.28 release notes, they seem to have improved latency for large models, so we should get better performance.
BTW, I found an optimization that has not been updated yet. I will send another PR today so that you can reproduce the current latency with QNN 2.28.

@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai
Contributor

cccclai commented Nov 14, 2024

Can you also update this line https://github.com/pytorch/executorch/blob/07c4d0e8624fbeb3b6a387cb9c6ba193512c5de5/shim/xplat/executorch/backends/qualcomm/qnn_version.bzl

@shewu-quic
Collaborator Author

Can you also update this line https://github.com/pytorch/executorch/blob/07c4d0e8624fbeb3b6a387cb9c6ba193512c5de5/shim/xplat/executorch/backends/qualcomm/qnn_version.bzl

Fixed
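For reference, the change to `qnn_version.bzl` is essentially a one-line version pin. A minimal sketch of what the bump might look like (the constant and function names here are assumptions based on the file's purpose, not the actual file contents):

```python
# Hypothetical sketch of shim/xplat/executorch/backends/qualcomm/qnn_version.bzl
# after the bump. The names QNN_VERSION and get_qnn_version are illustrative.
QNN_VERSION = "2.28"

def get_qnn_version():
    # Returns the pinned QNN SDK version string used by the build rules.
    return QNN_VERSION
```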

@shewu-quic shewu-quic force-pushed the dev1/hutton/enable_qnn_2_28 branch from f2f9c84 to e7eba31 Compare November 19, 2024 04:39
@cccclai cccclai added the release notes: backends [DO NOT USE] Changes to any of the backend delegates label Nov 20, 2024
@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai
Contributor

cccclai commented Nov 22, 2024

Do we know if static llama and llama transformer will still work with this version bump? Also we need to figure out a plan to do version bump without bc breaking...

@shewu-quic
Collaborator Author

Do we know if static llama and llama transformer will still work with this version bump? Also we need to figure out a plan to do version bump without bc breaking...

All of our static llama tests run on this version.
It should work for the llama transformer based on your CI result:
https://github.com/pytorch/executorch/actions/runs/11906365001/job/33178331953?pr=6811

Regarding BC breaking, we could add macros around the code that is only supported on 2.28 and later, so that users can still build libqnnexecutorch against older QNN versions. Is that acceptable?

@cccclai
Contributor

cccclai commented Nov 23, 2024

Do we know if static llama and llama transformer will still work with this version bump? Also we need to figure out a plan to do version bump without bc breaking...

All of our static llama tests run on this version. It should work for the llama transformer based on your CI result: https://github.com/pytorch/executorch/actions/runs/11906365001/job/33178331953?pr=6811

Regarding BC breaking, we could add macros around the code that is only supported on 2.28 and later, so that users can still build libqnnexecutorch against older QNN versions. Is that acceptable?

Yeah that sounds good. BC is something we can figure out later. Just hope it can be part of the plan.

@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai
Contributor

cccclai commented Dec 3, 2024

Hi, sorry for being a bit back and forth here. The context is that pre-generated .pte files will no longer be compatible. If it isn't too much work, how likely is it that we can make this PR BC compatible? If it is too much work, we can try to merge this as-is.

@shewu-quic
Collaborator Author

Hi, sorry for being a bit back and forth here. The context is that pre-generated .pte files will no longer be compatible. If it isn't too much work, how likely is it that we can make this PR BC compatible? If it is too much work, we can try to merge this as-is.

Thank you for your response. I apologize for any confusion. What I meant is that if we merge this PR, libqnnexecutorch will not be able to build with QNN versions earlier than 2.28. However, the pre-generated .pte files should still be supported. The build issue arises in cases like this: QNN_SYSTEM_CONTEXT_BINARY_INFO_VERSION_3 only exists in headers from version 2.28 onwards. If you try to build with version 2.26, you'll encounter an error stating that QNN_SYSTEM_CONTEXT_BINARY_INFO_VERSION_3 cannot be found.

@shewu-quic
Collaborator Author

If possible, please go ahead and merge this PR. We will discuss internally whether to add some macros around the code that only supports version 2.28 and later, to ensure that users can build libqnnexecutorch on older QNN versions.

@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@winskuo-quic winskuo-quic force-pushed the dev1/hutton/enable_qnn_2_28 branch from 6096504 to 259cbc9 Compare January 6, 2025 05:41
@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@shewu-quic
Collaborator Author

shewu-quic commented Jan 13, 2025

Hi @cccclai,
Thanks for the review.
I think @winskuo-quic has added macros to ensure that users can build libqnnexecutorch on older QNN versions.
If possible, could you help merge this PR?

@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Contributor

@cccclai cccclai left a comment


Thanks! Informing the internal team and merging now.

@facebook-github-bot facebook-github-bot merged commit 8cd3afd into pytorch:main Jan 13, 2025
138 checks passed
YIWENX14 pushed a commit that referenced this pull request Jan 28, 2025
Differential Revision: D65949627

Pull Request resolved: #6811