forked from ggml-org/llama.cpp
feat: perf opt part4 #43
Merged
50 commits
00890e5 wip (chraac)
d2d35e4 refactor: rewrite dequantize_row_q4_0 by intrinsic (chraac)
5149473 log for debug (chraac)
58e8bcf fix q4 intrinsic (chraac)
f6f2277 small opt (chraac)
8f8906f wip (chraac)
688b3c6 wip (chraac)
3fc25f2 add vtcm_quota_size (chraac)
e6229ec add perf log for hexagon-npu backend (chraac)
7f47761 wip (chraac)
9001c4f add log (chraac)
416ce0d sync after a specific op (chraac)
9cdfef7 increase worker thread priority (chraac)
137a3cc fix unbalanced thread slice (chraac)
e58e921 small slice to fit in vtcm cache (chraac)
00ee0cc limit the supported row element size (chraac)
035749f opt 4_0 dequant (chraac)
4d4d239 fix q4 dequant (chraac)
142ab93 add power_utils (chraac)
2415498 add rms_norm (chraac)
c5749a6 wip (chraac)
96d2c18 enable rms_norm f32 (chraac)
35df13a fix rms_norm with param (chraac)
2b76030 fix compiling flags (chraac)
bd70076 use float (chraac)
84d5e0e fix small row size (chraac)
bb949a3 vectorized rms norm (chraac)
68a6d76 wip (chraac)
1d94db3 read 2 vectors (chraac)
edfb963 rename (chraac)
83160d3 add perf log on update (chraac)
1fce9b1 set empty tensors handle also (chraac)
897bd47 merge some rpc functions (chraac)
bf62a51 opt param update (chraac)
4bacb3f wip (chraac)
d65475d print more log (chraac)
4acf419 add struct for update param config (chraac)
a42a9d9 add npu_device_graph_set_tensor_with_param (chraac)
8352d47 merge tensor and params update (chraac)
2caea38 wip (chraac)
1ee710b wip (chraac)
8c027ba make as template to reuse (chraac)
57252c0 vectorize dequantize_row_q8_0 (chraac)
fe08014 opt (chraac)
8409dd1 avoid using union to store q data (chraac)
435caa2 wip (chraac)
fafe1ad wip (chraac)
411097e Merge branch 'dev-refactoring' into dev-perf-opt-part4 (chraac)
d8bd368 wip (chraac)
99fc1e3 Merge branch 'dev-refactoring' into dev-perf-opt-part4 (chraac)