Temp #16

apicalshark · 2024-11-08T02:17:10Z

I have read the contributing guidelines
Self-reported review complexity:
- Low
- Medium
- High

Remove buffer->iface.get_name, which was used in CANN, as it was removed in the backend registry refactor PR.

This fixes the build break from the recent changes to move the CPU backend to separate files ggerganov#10144

* server : clarify /slots endpoint, add is_processing
* fix tests

* q6_k instruction reordering attempt
* better subtract method
* should be theoretically faster; small improvement with shuffle lut, likely because all loads are already done at that stage
* optimize bit fiddling
* handle -32 offset separately. bsums exists for a reason!
* use shift
* Update ggml-quants.c
* have to update ci macos version to 13 as 12 doesn't work now. 13 is still x86

…0177) Branch: GraniteToolCallTemplate Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

* metal : add quantized FA (vec) support ggml-ci
* metal : add quantized FA (non-vec) support
* metal : fix support check ggml-ci
* metal : clean-up
* metal : clean-up (cont)
* metal : fix shared memory calc + reduce smem + comments
* metal : float-correctness
* metal : minor [no ci]

ggml-ci

* ggml : add initial BF16 support ggml-ci
* metal : add mul_mat_id BF16 support ggml-ci
* metal : check for bfloat support on the Metal device ggml-ci
* metal : better var names [no ci]
* metal : do not build bfloat kernels when not supported ggml-ci
* metal : try to fix BF16 support check ggml-ci
* metal : this should correctly check bfloat support
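For readers unfamiliar with the format these commits refer to: BF16 (bfloat16) keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of an IEEE 754 binary32 value, so conversion amounts to moving 16 bits. The following is a minimal sketch of that idea only. The function names are hypothetical and are not ggml's API, and the narrowing direction uses plain truncation as a simplifying assumption, whereas production code typically rounds to nearest even and treats NaN specially.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not ggml's API): widen raw BF16 bits to a float.
 * BF16 is the upper 16 bits of a binary32 value, so shift left by 16. */
static float bf16_bits_to_fp32(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f); /* type-pun without violating aliasing rules */
    return f;
}

/* Hypothetical helper (not ggml's API): narrow a float to raw BF16 bits.
 * Assumption: simple truncation of the low 16 mantissa bits, with no
 * rounding and no special handling of NaN. */
static uint16_t fp32_to_bf16_bits_truncate(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (uint16_t)(bits >> 16);
}
```

Values whose low 16 mantissa bits are zero, such as 1.0f or -2.0f, survive a round trip through these two helpers unchanged; other values lose precision in the narrowing step.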

…eleration (ggerganov#10133)
* rwkv6: rename to wkv6
* rwkv6: support avx2 avx512 armv8 armv9
* rwkv6: update cuda file name
* rwkv6: rename params
* wkv on sycl
* sycl: add some ops
* sycl: Enhance OP support judgment
* wkv6: drop armv9 and transfer to GGML style ggml-ci
* sync : ggml
* update the function to use appropriate types
* fix define error
* Update ggml/src/ggml-cpu.c
* add appropriate asserts
* move element-wise functions outside
* put the declaration outside the loop
* rewrite to be more in line with the common pattern for distributing threads
* use recommended way GGML_TENSOR_LOCALS
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Meng, Hengyu <airdldl@163.com>

Co-authored-by: EC2 Default User <ec2-user@ip-172-31-62-167.us-west-2.compute.internal>

* server : simple chat UI with vuejs and daisyui
* move old files to legacy folder
* embed deps into binary
* basic markdown support
* add conversation history, save to localStorage
* fix bg-base classes
* save theme preferences
* fix tests
* regenerate, edit, copy buttons
* small fixes
* docs: how to use legacy ui
* better error handling
* make CORS preflight more explicit
* add GET method for CORS
* fix tests
* clean up a bit
* better auto scroll
* small fixes
* use collapse-arrow
* fix closeAndSaveConfigDialog
* small fix
* remove console.log
* fix style for <pre> element
* lighter bubble color (less distracting when reading)

pminev and others added 29 commits November 4, 2024 10:33

metal : fix minor string leaks (ggml/1004)

e2292aa

cmake : make it possible linking ggml as external lib (ggml/1003)

284e5b0

sync : ggml

ce027ad

CANN: adjust backend registry refactor. (ggerganov#10158)

329ed91

Remove buffer->iface.get_name, which was used in CANN, as it was removed in the backend registry refactor PR.

metal : move dequantize templates to beginning of MSL source (#0)

f8e5813

metal : simplify f16 and f32 dequant kernels (#0)

05697f6

cuda : clear error after changing peer access (ggerganov#10153)

ea02c75

fix build break on arm64 linux (ggerganov#10166)

6a066b9

This fixes the build break from the recent changes to move the CPU backend to separate files ggerganov#10144

server : clarify /slots endpoint, add is_processing (ggerganov#10162)

9e0ecfb

* server : clarify /slots endpoint, add is_processing
* fix tests

ggml : fix q4xx mat mul, increase ggml_aligned_malloc alignment (ggerganov#10167)

401558b

ggml : fix gelu tables initialization (ggerganov#10172)

d5a409e

ggml : fix arch check in bf16_to_fp32 (ggerganov#10164)

a9e8a9a

llama : add <|tool_call|> formatting to Granite template (ggerganov#10177)

b8deef0

Branch: GraniteToolCallTemplate
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

ggml : adjust is_first_call init value (ggerganov#10193)

1dc04b2

ggml-ci

metal : fix from ptr buffer name (ggerganov#10189)

94d8cb8

server : remove hack for extra parallel slot (ggerganov#10187)

b11f9ba

ggml-ci

fix q4_0_8_8 format for corrupted tokens issue (ggerganov#10198)

2319126

Co-authored-by: EC2 Default User <ec2-user@ip-172-31-62-167.us-west-2.compute.internal>

DRY: Fixes clone functionality (ggerganov#10192)

5107e8c

Remove identical wte/etw logic for jais (ggerganov#10203)

60e17ce

ggml : add ggml-cpu.h to the public headers (ggerganov#10204)

97404c4

scripts : sync update

a2c6fd7

sync : ggml

3b08828

scripts : add amx to sync-ggml.sh [no ci]

eec4d71

server : minor UI fix (ggerganov#10207)

76c6e7f

github-actions bot added the documentation Improvements or additions to documentation label Nov 8, 2024

github-actions bot added examples server devops testing python script ggml SYCL Nvidia GPU labels Nov 8, 2024

Merge branch 'master' into temp

c0d480a

apicalshark merged commit 9e3c483 into master Nov 8, 2024
10 checks passed

apicalshark deleted the temp branch November 8, 2024 02:22
