b4702

github-actions released this 13 Feb 00:41

a394039

ggml-cpu : add chunking support to mul_mat_id (#11666)

* ggml-cpu : add chunking support to mul_mat_id

* allocate chunk counter in wdata
parallelize src1 quantization by column to allow parallelization even when there is only one row

* disable for arm

* cleanup

* better way to disable for arm

* fix uninitialized counter when using 1 thread only

* revert test-backend-ops changes
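
For readers unfamiliar with the term, "chunking" here means splitting the work into small pieces that threads claim from a shared counter, rather than giving each thread one fixed slice. The sketch below illustrates that general pattern only. It is not the ggml-cpu code: `process_in_chunks`, its parameters, and the choice to start the counter at the thread count are assumptions made for illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper (not part of ggml): visits rows [0, n_rows) in chunks of
// chunk_size rows, with n_threads workers claiming chunks from a shared counter.
// Returns how many times each row was visited (1 everywhere if the scheme is correct).
std::vector<int> process_in_chunks(int n_rows, int chunk_size, int n_threads) {
    assert(chunk_size > 0 && n_threads > 0);

    std::vector<int> visits(n_rows, 0);
    const int n_chunks = (n_rows + chunk_size - 1) / chunk_size;

    // Assumption: thread i starts on chunk i, so the next unclaimed chunk is n_threads.
    // The counter is initialized unconditionally, including the single-thread case.
    std::atomic<int> next_chunk(n_threads);

    auto worker = [&](int ith) {
        int chunk = ith;
        while (chunk < n_chunks) {
            const int row0 = chunk * chunk_size;
            const int row1 = std::min(row0 + chunk_size, n_rows);
            for (int r = row0; r < row1; ++r) {
                visits[r] += 1; // stand-in for the real per-row work
            }
            // Claim the next chunk; fetch_add hands each value to exactly one thread.
            chunk = next_chunk.fetch_add(1, std::memory_order_relaxed);
        }
    };

    std::vector<std::thread> threads;
    for (int ith = 1; ith < n_threads; ++ith) {
        threads.emplace_back(worker, ith);
    }
    worker(0); // the calling thread acts as thread 0
    for (auto & t : threads) {
        t.join();
    }
    return visits;
}
```

Because chunks are claimed on demand, a thread that finishes early simply takes more chunks, which tends to balance load better than a fixed split when per-chunk cost varies.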
