GGUF support #412

philpax · 2023-08-20T20:18:37Z

Implements support for loading and saving GGUF files.

TODO:

Open questions:

Should we still support the old formats?
- No. Instead, we'll take the old code and build a converter with it. Preferably one that can ingest HF files to provide the necessary information for a fully-compliant GGUF.
How resilient should we be to malformed GGUF models?
- Answer: The usual Rust standard. Don't panic if you can avoid it.
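To illustrate the "don't panic" answer above, here is a minimal sketch of reading a GGUF header so that every failure comes back as a `Result` error instead of a panic. It is not the code from this PR: `GgufError`, `GgufHeader`, and `read_header` are hypothetical names, and the sketch assumes the little-endian v2/v3 header layout (4-byte `GGUF` magic, `u32` version, then `u64` tensor and metadata key-value counts).

```rust
use std::fmt;
use std::io::{self, Read};

/// Hypothetical error type for this sketch; not the crate's actual error enum.
#[derive(Debug)]
pub enum GgufError {
    /// The underlying reader failed, e.g. the file was truncated.
    Io(io::Error),
    /// The first four bytes were not `GGUF`.
    InvalidMagic([u8; 4]),
    /// The version is outside what this sketch handles (2 and 3).
    UnsupportedVersion(u32),
}

impl fmt::Display for GgufError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GgufError::Io(e) => write!(f, "I/O error: {e}"),
            GgufError::InvalidMagic(m) => write!(f, "invalid magic: {m:?}"),
            GgufError::UnsupportedVersion(v) => write!(f, "unsupported GGUF version: {v}"),
        }
    }
}

impl std::error::Error for GgufError {}

impl From<io::Error> for GgufError {
    fn from(e: io::Error) -> Self {
        GgufError::Io(e)
    }
}

/// Hypothetical header struct holding only the fixed-size fields.
#[derive(Debug, PartialEq, Eq)]
pub struct GgufHeader {
    pub version: u32,
    pub tensor_count: u64,
    pub metadata_kv_count: u64,
}

fn read_u32(r: &mut impl Read) -> Result<u32, GgufError> {
    let mut buf = [0u8; 4];
    r.read_exact(&mut buf)?; // a short read becomes GgufError::Io
    Ok(u32::from_le_bytes(buf))
}

fn read_u64(r: &mut impl Read) -> Result<u64, GgufError> {
    let mut buf = [0u8; 8];
    r.read_exact(&mut buf)?;
    Ok(u64::from_le_bytes(buf))
}

/// Reads the fixed-size part of a GGUF header without panicking on bad input.
pub fn read_header(r: &mut impl Read) -> Result<GgufHeader, GgufError> {
    // Compare the magic as a byte array rather than as an integer.
    let mut magic = [0u8; 4];
    r.read_exact(&mut magic)?;
    if magic != *b"GGUF" {
        return Err(GgufError::InvalidMagic(magic));
    }

    let version = read_u32(r)?;
    // Assumption: only v2 and v3 are handled here; v1 used 32-bit counts.
    if !(2..=3).contains(&version) {
        return Err(GgufError::UnsupportedVersion(version));
    }

    Ok(GgufHeader {
        version,
        tensor_count: read_u64(r)?,
        metadata_kv_count: read_u64(r)?,
    })
}
```

A caller could pass any `Read` implementation, for example `read_header(&mut std::fs::File::open(path)?)`, and report the error to the user rather than calling `unwrap()`.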

Closes #365.

svenstaro · 2023-08-27T11:37:34Z

I think having a migration tool for converting previous formats to GGUF and then removing support for other models might be the most maintainable solution. It might be too early to definitively call this, but I think it's prudent to assume that the ecosystem will converge on GGUF as the preferred format soon.


KerfuffleV2 · 2023-08-30T00:11:16Z

I've been messing around cleaning up the Python scripts in llama.cpp (like the converters and the Python side of GGUF), so if you need to pick someone's brain about GGUF stuff I might be able to help. I'm not an expert by any means.

philpax · 2023-08-30T07:15:54Z

Aye, I noticed you contributed the conversion script upstream; I'll definitely reach out if I have any questions about the specifics there.

philpax mentioned this pull request Aug 20, 2023

GGUF file format specification ggerganov/ggml#302

Merged

philpax added 3 commits August 21, 2023 02:17

refactor: move ggml format to module

ffb0519

fix(ggml): use byte-arrays for magic

d5c2562

feat(ggml): impl unwired gguf loader

dd7aa26

philpax force-pushed the gguf branch from 81f4b14 to dd7aa26 Compare August 21, 2023 00:18

philpax added this to the 0.2 milestone Aug 21, 2023

philpax added 2 commits August 27, 2023 19:29

feat(gguf): gguf-v2 support

e166b7c

chore(gguf): clippy fix

90c6797

philpax mentioned this pull request Aug 27, 2023

Rust 1.72 fixes #416

Merged

philpax added 9 commits August 27, 2023 23:30

Merge branch 'main' into gguf

ddf4e40

fix(gguf): drop the null terminator

38dd730

refactor(ggml): begin decoupling the old formats

41462ed

feat(bin): add gguf-explorer as debugging tool

2de2df7

refactor(gguf): split metadata out + use macros

0da661f

wip: rewrite loader to use GGUF; almost wire up llama

e182444

fix(cli): use info log level

178a0fb

wip: successfully load a llama2 gguf*

2a9417a

* with some heavy caveats, see the PR

wip: disable everything that's broken

823828d

feat(llama): validate tensor data layout

eb8c508

LLukas22 mentioned this pull request Sep 27, 2023

thread '<unnamed>' panicked at 'called Result::unwrap() on an Err value: InvalidMagic { path: "model_merak.bin" }', src/model.rs:47:12 LLukas22/llm-rs-python#34

Open

philpax added 7 commits October 8, 2023 04:29

feat(llm): remove architecture param

f398ebd

feat(ggml): use newtype for metadata

588eb98

feat(llm): first pass at tokenizer re-port

388fa87

fix(llm): embedded tokenizer decode

f827517

wip(llm): reconvert ggml tokenizer with GPT-4

58cb8cc

Merge branch 'main' into gguf

43ebc3d

fix(ggml/llmb): use IndexMap for GGUF

e4db5b9

philpax added 10 commits October 29, 2023 19:55

fix(llmb): disable embedded tokenizer

8996061

refactor: move loading logic back into llmb, simplify

34379ac

feat: implement GGUF write / llm gguf rebuild

a4bbdbf

feat(llm): implement gguf add-hf-tokenizer

df1aa0e

fix(gguf): add support for ggufv3

d5e7b61

fix(gguf): load bools correctly

5457414

feat(llm): get GPT-NeoX loading again

6114076

feat(llm): get GPT-NeoX closer to working

8961ff7

feat(llm): more attempted GPT-NeoX fixes

be709ed

fix(cli): in info, elide known large items

a728852

philpax mentioned this pull request Oct 31, 2023

Support for Mistral-7b #434

Open

fix(llmb): usercallback show error

5ed38be

philpax mentioned this pull request Nov 1, 2023

Build against newer GGML version #428

Merged

6 tasks

philpax added 2 commits November 12, 2023 23:00

Merge in develop

ab956c9

chore: fix precommit

7c3d1cf

philpax changed the base branch from main to develop November 12, 2023 22:08

Merge branch 'develop' into gguf

4401631

philpax marked this pull request as ready for review November 12, 2023 22:11

philpax merged commit 535eda1 into develop Nov 12, 2023
