fix build break on arm64 linux #10166
This fixes the build break caused by the recent changes that moved the CPU backend to separate files (ggerganov#10144).
Can we please consider adding CI support for
There is an arm64 CI, but I guess it doesn't have SVE support. It should still be possible to at least test that the build completes with SVE enabled on the GitHub runner; it's just not done at the moment.
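As a rough illustration of how a CI job could tell whether its runner actually exposes SVE (as opposed to only compiling with it enabled), here is a minimal sketch that looks for the `sve` flag in the `Features` lines of `/proc/cpuinfo` on arm64 Linux. The helper names `cpu_features` and `has_sve` are hypothetical and are not part of llama.cpp or its CI scripts.

```python
def cpu_features(cpuinfo_text: str) -> set[str]:
    """Collect the flags listed on every 'Features' line of /proc/cpuinfo text."""
    features: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # arm64 Linux reports CPU capabilities on lines such as
        # "Features : fp asimd ... sve"
        if sep and key.strip().lower() == "features":
            features.update(value.split())
    return features


def has_sve(cpuinfo_text: str) -> bool:
    """Return True if the 'sve' flag appears among the reported CPU features."""
    return "sve" in cpu_features(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as f:
            text = f.read()
    except OSError:
        # Not a Linux system (or /proc is unavailable): treat as "no SVE".
        text = ""
    print("SVE available" if has_sve(text) else "SVE not available")
```

A job could run a check like this first and skip the SVE runtime tests when the flag is missing, while still performing the SVE-enabled build.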
Thanks @slaren, please let us know if SVE-capable runner support can be prioritized; happy to collaborate with you on this.
Feel free to open a PR adding a SVE build to the
It looks like QEMU supports ARM SVE (https://wiki.qemu.org/Features/ARM/SVE). A good approach might be to use an x86_64 ubuntu-latest runner and run the build through QEMU.
* metal : fix minor string leaks (ggml/1004)
* cmake : make it possible linking ggml as external lib (ggml/1003)
* sync : ggml
* CANN: adjust backend registry refactor. (ggerganov#10158)
  remove buffer->iface.get_name that used in cann as it was removed in backend registry refactor PR.
* metal : move dequantize templates to beginning of MSL source (#0)
* metal : simplify f16 and f32 dequant kernels (#0)
* cuda : clear error after changing peer access (ggerganov#10153)
* fix build break on arm64 linux (ggerganov#10166)
  This fixes the build break from the recent changes to move the CPU backend to separate files ggerganov#10144
* server : clarify /slots endpoint, add is_processing (ggerganov#10162)
  * server : clarify /slots endpoint, add is_processing
  * fix tests
* ggml : fix q4xx mat mul, increase ggml_aligned_malloc alignment (ggerganov#10167)
* ggml : fix gelu tables initialization (ggerganov#10172)
* Q6_K AVX improvements (ggerganov#10118)
  * q6_k instruction reordering attempt
  * better subtract method
  * should be theoretically faster small improvement with shuffle lut, likely because all loads are already done at that stage
  * optimize bit fiddling
  * handle -32 offset separately. bsums exists for a reason!
  * use shift
  * Update ggml-quants.c
  * have to update ci macos version to 13 as 12 doesnt work now. 13 is still x86
* ggml : fix arch check in bf16_to_fp32 (ggerganov#10164)
* llama : add <|tool_call|> formatting to Granite template (ggerganov#10177)
  Branch: GraniteToolCallTemplate
  Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
* metal : add quantized FA support (ggerganov#10149)
  * metal : add quantized FA (vec) support ggml-ci
  * metal : add quantized FA (non-vec) support
  * metal : fix support check ggml-ci
  * metal : clean-up
  * metal : clean-up (cont)
  * metal : fix shared memory calc + reduce smem + comments
  * metal : float-correctness
  * metal : minor [no ci]
* ggml : adjust is_first_call init value (ggerganov#10193) ggml-ci
* metal : fix from ptr buffer name (ggerganov#10189)
* server : remove hack for extra parallel slot (ggerganov#10187) ggml-ci
* metal : add BF16 support (ggerganov#8439)
  * ggml : add initial BF16 support ggml-ci
  * metal : add mul_mat_id BF16 support ggml-ci
  * metal : check for bfloat support on the Metal device ggml-ci
  * metal : better var names [no ci]
  * metal : do not build bfloat kernels when not supported ggml-ci
  * metal : try to fix BF16 support check ggml-ci
  * metal : this should correctly check bfloat support

---------
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: snadampal <87143774+snadampal@users.noreply.github.com>
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Eve <139727413+netrunnereve@users.noreply.github.com>
Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>
Given that there are already multiple arm64 platforms with SVE support available to the public (for example, AWS Graviton3 and Graviton4 based EC2 instances), my request is to consider arm64 native builds instead of QEMU emulation.
We can add a CI instance from the Azure cloud. But last time I checked, they were still in preview.
Yeah, GitHub Actions has the feature for the enterprise tier but not yet for open source projects. In the interest of simplicity, though, staying on GitHub Actions instead of calling out to AWS or something similar might be desirable. That said, caching the QEMU images and running one or two jobs on a low-cost Ubuntu runner should be fine too. In addition, that pattern can apply to FreeBSD & co. I plan on getting FreeBSD set up this way soon but haven't had time yet. 😊
Updates `pillow` to 11.0.0 - [Release notes](https://github.com/python-pillow/Pillow/releases) - [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst) - [Commits](python-pillow/Pillow@10.2.0...11.0.0) Updates `aiohttp` to 3.11.7 - [Release notes](https://github.com/aio-libs/aiohttp/releases) - [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst) - [Commits](aio-libs/aiohttp@v3.9.3...v3.11.7) --- updated-dependencies: - dependency-name: pillow dependency-type: direct:production dependency-group: pip - dependency-name: aiohttp dependency-type: direct:production dependency-group: pip ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: apicalshark <58538165+apicalshark@users.noreply.github.com> * Update build-ci.yml * Update build-ci.yml * Update build-ci.yml * Update build-ci.yml * Update build-ci.yml * Update build-ci.yml * Update build-ci.yml * Update build-ci.yml * Create docker.yml * Create python-lint.yml * Create server.yml * Update requirements.txt --------- Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> Co-authored-by: dennyxbox890 <58538165+dennyxbox890@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Plamen Minev <pacominev@gmail.com> Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: leo-pony <nengjunma@outlook.com> Co-authored-by: Diego Devesa <slarengh@gmail.com> Co-authored-by: snadampal <87143774+snadampal@users.noreply.github.com> Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> Co-authored-by: Eve <139727413+netrunnereve@users.noreply.github.com> Co-authored-by: Gabe Goodhart <ghart@us.ibm.com>
This fixes the build break from the recent changes to move the CPU backend to separate files (#10144).