
convert.py doesn't support BF16 safetensors... #1473

Closed
shimaowo opened this issue May 16, 2023 · 3 comments · Fixed by #1598

Comments


shimaowo commented May 16, 2023

...but this seems more like an oversight. No PR because I don't understand enough of what is going on yet.

When loading a safetensors model in BF16 format (like Metharme-7B after merging), convert.py throws a KeyError because the script's dtype-mapping dictionary has no entry for 'BF16'.

Simply adding this key and having it map to DT_BF16 made the script work, but I haven't validated the output models yet.
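The fix described above can be sketched as follows. This is a minimal illustration, not the actual convert.py source: the dictionary name, the internal type names, and the lookup helper are assumptions standing in for the real mapping from safetensors dtype strings to the converter's internal types.

```python
# Hypothetical sketch of the dtype mapping in convert.py (names assumed,
# not the real llama.cpp source). A safetensors header labels each tensor
# with a dtype string; the converter looks that string up in a dict.
SAFETENSORS_DATA_TYPES = {
    "F32": "DT_F32",
    "F16": "DT_F16",
    "I32": "DT_I32",
    # Without the entry below, any BF16 tensor raises KeyError: 'BF16'.
    "BF16": "DT_BF16",
}

def lookup_dtype(name: str) -> str:
    """Map a safetensors dtype string to the converter's internal type."""
    return SAFETENSORS_DATA_TYPES[name]
```

With the 'BF16' key present, `lookup_dtype("BF16")` returns the internal BF16 type instead of raising; that is the one-line change the comment describes.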

I thought this might just be an oversight from #1309, but I'm not actually clear that what my change is doing is correct or desirable.

Regardless, there seems to be a bug in that the script can't work as-is with BF16 safetensors models.

This was as of commit 63d2046 (tip of master at time of writing)


akx commented May 26, 2023

A merged Metharme-7B seems to work fine. I made a (trivial) PR: #1598


FNsi commented May 28, 2023

"bf16 which is only available on Ampere and later, I would expect some performance degradation if running it in fp16 instead"

😅

shimaowo (Author) commented

This wasn't really intended to be a philosophical discussion of whether bf16 is appropriate. It's more "hey, a PR was merged here that added support for a thing, but it turns out that thing doesn't actually work without this additional change."

We also don't control the source model, and if we want CPU inference, I'm not aware of an alternative.
