sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) #1422

ggerganov · 2023-11-03T08:22:16Z

Bringing back many improvements from llama.cpp and ggml

These will be used to prepare a new release that will include:

- Efficient Beam Search
- Full CUDA GPU offloading
- K-quants support
- Grammar support (whisper : add grammar-based sampling #1229)


nchudleigh · 2023-11-04T18:33:58Z

Testing this on a 2015 MBP with an i7.

* whisper : check state->ctx_metal not null
* whisper : add whisper_context_params { use_gpu }
* whisper : new API with params & deprecate old API
* examples : use no-gpu param && whisper_init_from_file_with_params
* whisper.objc : enable metal & disable on simulator
* whisper.swiftui, metal : enable metal & support load default.metallib
* whisper.android : use new API
* bindings : use new API
* addon.node : fix build & test
* bindings : updata java binding
* bindings : add missing whisper_context_default_params_by_ref WHISPER_API for java
* metal : use SWIFTPM_MODULE_BUNDLE for GGML_SWIFT and reuse library load
* metal : move bundle var into block
* metal : use SWIFT_PACKAGE instead of GGML_SWIFT
* style : minor updates

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
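For readers unfamiliar with the "new API with params & deprecate old API" pattern named in the commit list above, here is a minimal sketch in plain C. Every `demo_*` name is a hypothetical stand-in and not part of whisper.h; the real functions return context pointers rather than structs by value, and the default of `use_gpu = true` is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the params/context types (not the real whisper.h types). */
struct demo_context_params {
    bool use_gpu;
};

struct demo_context {
    bool         use_gpu;
    const char * model_path;
};

/* Defaults live in one place so callers only override what they need.
 * Assumption: GPU use is on by default. */
static struct demo_context_params demo_context_default_params(void) {
    struct demo_context_params params = { .use_gpu = true };
    return params;
}

/* New-style entry point: behaviour is controlled through the params struct. */
static struct demo_context demo_init_from_file_with_params(
        const char * path, struct demo_context_params params) {
    struct demo_context ctx = { .use_gpu = params.use_gpu, .model_path = path };
    return ctx;
}

/* Old-style entry point kept as a thin wrapper so existing callers keep working. */
static struct demo_context demo_init_from_file(const char * path) {
    return demo_init_from_file_with_params(path, demo_context_default_params());
}
```

A caller that wants to disable the GPU would take the default params, set `use_gpu = false`, and pass them to the `_with_params` variant; callers of the old function see no change.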

…ganov#1422)

* sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.)
* metal : allow env metal variable to override resource path (ggerganov#1415)
* Allow env variable to override resource path
* Update ggml-metal.m

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* sync : restore common / main from `master`
* sync : restore whisper from `master`
* talk-llama : update to latest llama.cpp
* ruby : fix build
* ggml : fix 32-bit ARM build
* ggml : fix MIN / MAX macro collisions + update ios bindings
* ggml : fix ifdefs and MIN / MAX again
* exampels : fix Obj-C and Swift examples
* ggml : fix 32-bit ARM compatibility
* ggml : one more attempt to fix 32-bit ARM compat
* whisper : fix support for larger graphs

---------

Co-authored-by: Chris Raethke <codesoda@users.noreply.github.com>

sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.)

a8cee86

codesoda and others added 8 commits November 3, 2023 14:00

metal : allow env metal variable to override resource path (#1415)

592aed1

* Allow env variable to override resource path
* Update ggml-metal.m

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
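The idea behind this commit, letting an environment variable take precedence over the built-in resource location, can be sketched in portable C roughly as follows. This is not the code from ggml-metal.m (which is Objective-C); `resolve_resource_dir` is a hypothetical helper, and the variable name in the usage note is assumed rather than taken from this page.

```c
#include <stdlib.h>

/* Hypothetical helper: return the directory named by the environment
 * variable `env_name` if it is set and non-empty, otherwise `fallback`. */
static const char * resolve_resource_dir(const char * env_name, const char * fallback) {
    const char * override_dir = getenv(env_name);
    if (override_dir != NULL && override_dir[0] != '\0') {
        return override_dir; /* user-supplied location wins */
    }
    return fallback;         /* default (e.g. bundle or working directory) */
}
```

A call such as `resolve_resource_dir("GGML_METAL_PATH_RESOURCES", ".")` would then return the override when that variable is exported and `"."` otherwise; check ggml-metal.m for the exact variable name before relying on it.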

sync : restore common / main from master

68dd17b

sync : restore whisper from master

e2c06e8

talk-llama : update to latest llama.cpp

b3fc77c

ruby : fix build

e39d42b

ggml : fix 32-bit ARM build

002aeb7

ggml : fix MIN / MAX macro collisions + update ios bindings

d7735f8

ggml : fix ifdefs and MIN / MAX again

4bd643f
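For context on the two MIN / MAX commits above: a collision of this kind typically happens when a platform header already defines `MIN` / `MAX` and library code defines them again with a different body. A common remedy, sketched below under that assumption (this is not a copy of the actual ggml change), is to `#undef` any earlier definition before defining your own. The first pair of macros only simulates a system header, and `clamp_int` is a hypothetical helper used to exercise the result.

```c
/* Stand-in for a platform header that already provides MIN / MAX. */
#define MIN(a, b) (((a) < (b)) ? (a) : (b))
#define MAX(a, b) (((a) > (b)) ? (a) : (b))

/* Library side: drop whatever was defined earlier, then define our own.
 * Without the #undef lines the compiler reports a macro redefinition,
 * because the replacement text below differs from the one above. */
#undef MIN
#undef MAX
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper: clamp v into the range [lo, hi]. */
static int clamp_int(int v, int lo, int hi) {
    return MAX(lo, MIN(v, hi));
}
```

An alternative with the same effect is to guard the definitions with `#ifndef MIN` / `#ifndef MAX`, which keeps the platform's version instead of replacing it.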

ggerganov force-pushed the sync branch from a237db6 to 4bd643f Compare November 3, 2023 16:04

ggerganov added 2 commits November 3, 2023 18:34

exampels : fix Obj-C and Swift examples

72c8697

ggml : fix 32-bit ARM compatibility

db1093e

ggerganov force-pushed the sync branch from 2aac9af to db1093e Compare November 3, 2023 18:08

ggerganov added 2 commits November 3, 2023 20:28

ggml : one more attempt to fix 32-bit ARM compat

36f375b

whisper : fix support for larger graphs

0a7eba8

ggerganov marked this pull request as ready for review November 3, 2023 19:23

ggerganov merged commit f96e1c5 into master Nov 3, 2023
70 of 71 checks passed

ggerganov deleted the sync branch November 3, 2023 19:35

bobqianic mentioned this pull request Nov 9, 2023

ggml_metal_init failure: loading kernel function on Intel based mac #1292

Closed

cebtenzzre mentioned this pull request Nov 28, 2023

cublas Cuda 801 on Maxwell Titan X #1447

Open
