[CANN] Fix compile error for CANN backend as get_name has been removed from ggml_backend_buffer_i #10158

Merged 1 commit on Nov 4, 2024

Conversation

leo-pony (Contributor) commented Nov 4, 2024

Fixes a compile error in the CANN backend, which still referenced `get_name` after it was removed from `ggml_backend_buffer_i`. The corresponding issue is #10105.

Functionality is normal after the fix:
[image]

…used in cann as it was removed in backend registry refactor PR.
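
To illustrate the kind of breakage described above, here is a minimal sketch, assuming a simplified interface struct of function pointers. Every type and function name in it (`buffer_iface`, `cann_free_buffer`, `buffer_is_cann_by_type`, and so on) is a hypothetical stand-in, not the real ggml definition, and it does not claim to reproduce the actual diff of this PR. It shows why code that reads a removed member stops compiling, and two ways a backend could identify its own buffers without that member.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the ggml types (assumption). */
struct buffer;

struct buffer_iface {
    /* const char * (*get_name)(struct buffer * buf);   <- member removed;
       any code that still reads iface.get_name no longer compiles. */
    void (*free_buffer)(struct buffer * buf);
};

struct buffer_type {
    const char * name;
};

struct buffer {
    struct buffer_iface        iface;
    const struct buffer_type * buft;
};

/* Two distinct callbacks, standing in for two different backends. */
static void cann_free_buffer(struct buffer * buf)  { buf->buft = NULL; }
static void other_free_buffer(struct buffer * buf) { (void) buf; }

static const struct buffer_type cann_buft = { "CANN" };
static const struct buffer_type cpu_buft  = { "CPU" };

/* Option A: compare a callback that is still part of the interface. */
static bool buffer_is_cann_by_iface(const struct buffer * buf) {
    return buf->iface.free_buffer == cann_free_buffer;
}

/* Option B: compare the buffer type the buffer was created from. */
static bool buffer_is_cann_by_type(const struct buffer * buf) {
    return buf->buft == &cann_buft;
}

/* Returns 1 when both checks accept the CANN buffer and reject the other. */
int demo_check(void) {
    struct buffer cann = { { cann_free_buffer },  &cann_buft };
    struct buffer cpu  = { { other_free_buffer }, &cpu_buft };

    bool ok = buffer_is_cann_by_iface(&cann) && buffer_is_cann_by_type(&cann)
           && !buffer_is_cann_by_iface(&cpu) && !buffer_is_cann_by_type(&cpu);
    return ok ? 1 : 0;
}
```

Of the two options, checking the buffer type does not depend on any particular callback staying in the interface, so it is less likely to break in a later refactor of the same struct.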
@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Nov 4, 2024
@hipudding hipudding added the Ascend NPU issues specific to Ascend NPUs label Nov 4, 2024
@hipudding hipudding self-requested a review November 4, 2024 11:04
@hipudding hipudding merged commit 329ed91 into ggerganov:master Nov 4, 2024
53 checks passed
apicalshark added a commit to apicalshark/llama.cpp that referenced this pull request Nov 7, 2024
* metal : fix minor string leaks (ggml/1004)

* cmake : make it possible linking ggml as external lib (ggml/1003)

* sync : ggml

* CANN: adjust backend registry refactor. (ggerganov#10158)

remove buffer->iface.get_name that used in cann as it was removed in backend registry refactor PR.

* metal : move dequantize templates to beginning of MSL source (#0)

* metal : simplify f16 and f32 dequant kernels (#0)

* cuda : clear error after changing peer access (ggerganov#10153)

* fix build break on arm64 linux (ggerganov#10166)

This fixes the build break from the recent changes
to move the CPU backend to separate files
ggerganov#10144

* server : clarify /slots endpoint, add is_processing (ggerganov#10162)

* server : clarify /slots endpoint, add is_processing

* fix tests

* ggml : fix q4xx mat mul, increase ggml_aligned_malloc alignment (ggerganov#10167)

* ggml : fix gelu tables initialization (ggerganov#10172)

* Q6_K AVX improvements (ggerganov#10118)

* q6_k instruction reordering attempt

* better subtract method

* should be theoretically faster

small improvement with shuffle lut, likely because all loads are already done at that stage

* optimize bit fiddling

* handle -32 offset separately. bsums exists for a reason!

* use shift

* Update ggml-quants.c

* have to update ci macos version to 13 as 12 doesnt work now. 13 is still x86

* ggml : fix arch check in bf16_to_fp32 (ggerganov#10164)

* llama : add <|tool_call|> formatting to Granite template (ggerganov#10177)

Branch: GraniteToolCallTemplate

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* metal : add quantized FA support (ggerganov#10149)

* metal : add quantized FA (vec) support

ggml-ci

* metal : add quantized FA (non-vec) support

* metal : fix support check

ggml-ci

* metal : clean-up

* metal : clean-up (cont)

* metal : fix shared memory calc + reduce smem + comments

* metal : float-correctness

* metal : minor [no ci]

* ggml : adjust is_first_call init value (ggerganov#10193)

ggml-ci

* metal : fix from ptr buffer name (ggerganov#10189)

* server : remove hack for extra parallel slot (ggerganov#10187)

ggml-ci

* metal : add BF16 support (ggerganov#8439)

* ggml : add initial BF16 support

ggml-ci

* metal : add mul_mat_id BF16 support

ggml-ci

* metal : check for bfloat support on the Metal device

ggml-ci

* metal : better var names [no ci]

* metal : do not build bfloat kernels when not supported

ggml-ci

* metal : try to fix BF16 support check

ggml-ci

* metal : this should correctly check bfloat support

---------

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: snadampal <87143774+snadampal@users.noreply.github.com>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Eve <139727413+netrunnereve@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>
apicalshark added a commit to apicalshark/llama.cpp that referenced this pull request Nov 8, 2024
* Merge PR (#10) (#11) (#13)

Merge

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dennyxbox890 <58538165+dennyxbox890@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump requests from 2.31.0 to 2.32.2 in the pip group across 1 directory

Bumps the pip group with 1 update in the / directory: [requests](https://github.com/psf/requests).


Updates `requests` from 2.31.0 to 2.32.2
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](psf/requests@v2.31.0...v2.32.2)

---
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
  dependency-group: pip
...

Signed-off-by: dependabot[bot] <support@github.com>

* Temp (#15)

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: dennyxbox890 <58538165+dennyxbox890@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: snadampal <87143774+snadampal@users.noreply.github.com>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Eve <139727413+netrunnereve@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>
apicalshark added a commit to apicalshark/llama.cpp that referenced this pull request Nov 15, 2024
* Master1 (#17)

* Rename build.yml to build-ci.yml

* build.yml

* Update build-ci.yml

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

* Delete ggml/src/vulkan-shaders/CMakeLists.txt

* Update build.yml

* Update build-ci.yml

* Update build-ci.yml

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: dennyxbox890 <58538165+dennyxbox890@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: snadampal <87143774+snadampal@users.noreply.github.com>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Eve <139727413+netrunnereve@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
remove buffer->iface.get_name that used in cann as it was removed in backend registry refactor PR.
apicalshark added a commit to apicalshark/llama.cpp that referenced this pull request Nov 16, 2024
* merge (#20)

* Update build-ci.yml

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: dennyxbox890 <58538165+dennyxbox890@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: snadampal <87143774+snadampal@users.noreply.github.com>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Eve <139727413+netrunnereve@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
remove buffer->iface.get_name that used in cann as it was removed in backend registry refactor PR.
apicalshark added a commit to apicalshark/llama.cpp that referenced this pull request Nov 22, 2024
* Merge (#21)

* Update build-ci.yml

* Update build-ci.yml

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: dennyxbox890 <58538165+dennyxbox890@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: snadampal <87143774+snadampal@users.noreply.github.com>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Eve <139727413+netrunnereve@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>
apicalshark added a commit to apicalshark/llama.cpp that referenced this pull request Dec 1, 2024
* Temp (#23)

* Merge (#21)

* merge (#20)

* Master1 (#17)

* Merge PR (#10) (#11) (#13)

Merge

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dennyxbox890 <58538165+dennyxbox890@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump requests from 2.31.0 to 2.32.2 in the pip group across 1 directory

Bumps the pip group with 1 update in the / directory: [requests](https://github.com/psf/requests).


Updates `requests` from 2.31.0 to 2.32.2
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](psf/requests@v2.31.0...v2.32.2)

---
updated-dependencies:
- dependency-name: requests
  dependency-type: direct:production
  dependency-group: pip
...

Signed-off-by: dependabot[bot] <support@github.com>

* Temp (#15)

* metal : fix minor string leaks (ggml/1004)

* cmake : make it possible linking ggml as external lib (ggml/1003)

* sync : ggml

* CANN: adjust backend registry refactor. (ggerganov#10158)

remove buffer->iface.get_name that used in cann as it was removed in backend registry refactor PR.

* metal : move dequantize templates to beginning of MSL source (#0)

* metal : simplify f16 and f32 dequant kernels (#0)

* cuda : clear error after changing peer access (ggerganov#10153)

* fix build break on arm64 linux (ggerganov#10166)

This fixes the build break from the recent changes
to move the CPU backend to separate files
ggerganov#10144

* server : clarify /slots endpoint, add is_processing (ggerganov#10162)

* server : clarify /slots endpoint, add is_processing

* fix tests

* ggml : fix q4xx mat mul, increase ggml_aligned_malloc alignment (ggerganov#10167)

* ggml : fix gelu tables initialization (ggerganov#10172)

* Q6_K AVX improvements (ggerganov#10118)

* q6_k instruction reordering attempt

* better subtract method

* should be theoretically faster

small improvement with shuffle lut, likely because all loads are already done at that stage

* optimize bit fiddling

* handle -32 offset separately. bsums exists for a reason!

* use shift

* Update ggml-quants.c

* have to update ci macos version to 13 as 12 doesn't work now. 13 is still x86

* ggml : fix arch check in bf16_to_fp32 (ggerganov#10164)

* llama : add <|tool_call|> formatting to Granite template (ggerganov#10177)

Branch: GraniteToolCallTemplate

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* metal : add quantized FA support (ggerganov#10149)

* metal : add quantized FA (vec) support

ggml-ci

* metal : add quantized FA (non-vec) support

* metal : fix support check

ggml-ci

* metal : clean-up

* metal : clean-up (cont)

* metal : fix shared memory calc + reduce smem + comments

* metal : float-correctness

* metal : minor [no ci]

* ggml : adjust is_first_call init value (ggerganov#10193)

ggml-ci

* metal : fix from ptr buffer name (ggerganov#10189)

* server : remove hack for extra parallel slot (ggerganov#10187)

ggml-ci

* metal : add BF16 support (ggerganov#8439)

* ggml : add initial BF16 support

ggml-ci

* metal : add mul_mat_id BF16 support

ggml-ci

* metal : check for bfloat support on the Metal device

ggml-ci

* metal : better var names [no ci]

* metal : do not build bfloat kernels when not supported

ggml-ci

* metal : try to fix BF16 support check

ggml-ci

* metal : this should correctly check bfloat support

---------

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: snadampal <87143774+snadampal@users.noreply.github.com>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Eve <139727413+netrunnereve@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: dennyxbox890 <58538165+dennyxbox890@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: snadampal <87143774+snadampal@users.noreply.github.com>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Eve <139727413+netrunnereve@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>

* Rename build.yml to build-ci.yml

* build.yml

* Update build-ci.yml

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

* Delete ggml/src/vulkan-shaders/CMakeLists.txt

* Update build.yml

* Update build-ci.yml

* Update build-ci.yml

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: dennyxbox890 <58538165+dennyxbox890@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: snadampal <87143774+snadampal@users.noreply.github.com>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Eve <139727413+netrunnereve@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>

* Update build-ci.yml

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: dennyxbox890 <58538165+dennyxbox890@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: snadampal <87143774+snadampal@users.noreply.github.com>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Eve <139727413+netrunnereve@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>

* Update build-ci.yml

* Update build-ci.yml

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: dennyxbox890 <58538165+dennyxbox890@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: snadampal <87143774+snadampal@users.noreply.github.com>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Eve <139727413+netrunnereve@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>

* Bump the pip group across 2 directories with 2 updates (#24)

Updates the requirements on [pillow](https://github.com/python-pillow/Pillow) and [aiohttp](https://github.com/aio-libs/aiohttp) to permit the latest version.

Updates `pillow` to 11.0.0
- [Release notes](https://github.com/python-pillow/Pillow/releases)
- [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst)
- [Commits](python-pillow/Pillow@10.2.0...11.0.0)

Updates `aiohttp` to 3.11.7
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst)
- [Commits](aio-libs/aiohttp@v3.9.3...v3.11.7)

---
updated-dependencies:
- dependency-name: pillow
  dependency-type: direct:production
  dependency-group: pip
- dependency-name: aiohttp
  dependency-type: direct:production
  dependency-group: pip
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: apicalshark <58538165+apicalshark@users.noreply.github.com>

* Update build-ci.yml

* Update build-ci.yml

* Update build-ci.yml

* Update build-ci.yml

* Update build-ci.yml

* Update build-ci.yml

* Update build-ci.yml

* Update build-ci.yml

* Create docker.yml

* Create python-lint.yml

* Create server.yml

* Update requirements.txt

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: dennyxbox890 <58538165+dennyxbox890@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: snadampal <87143774+snadampal@users.noreply.github.com>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Eve <139727413+netrunnereve@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>
Labels: Ascend NPU (issues specific to Ascend NPUs), ggml (changes relating to the ggml tensor library for machine learning)