
convert.py doesn't support BF16 safetensors... #1473

Closed
shimaowo opened this issue May 16, 2023 · 3 comments · Fixed by #1598

Comments


shimaowo commented May 16, 2023

...but this seems more like an oversight. No PR because I don't understand enough of what is going on yet.

When loading a safetensors model in BF16 format (like Metharme-7B after merging), convert.py throws a KeyError because the script's dtype-mapping dictionary has no entry for 'BF16'.

Simply adding this key and having it map to DT_BF16 made the script work, but I haven't validated the output models yet.
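The fix described above can be sketched as follows. This is a minimal illustration, not the actual convert.py source: the dictionary name, the internal type names, and the lookup helper are assumptions standing in for the real mapping from safetensors dtype strings to the converter's internal types.

```python
# Hypothetical sketch of the dtype mapping in convert.py (names assumed,
# not the real llama.cpp source). A safetensors header labels each tensor
# with a dtype string; the converter looks that string up in a dict.
SAFETENSORS_DATA_TYPES = {
    "F32": "DT_F32",
    "F16": "DT_F16",
    "I32": "DT_I32",
    # Without the entry below, any BF16 tensor raises KeyError: 'BF16'.
    "BF16": "DT_BF16",
}

def lookup_dtype(name: str) -> str:
    """Map a safetensors dtype string to the converter's internal type."""
    return SAFETENSORS_DATA_TYPES[name]
```

With the 'BF16' key present, `lookup_dtype("BF16")` returns the internal BF16 type instead of raising; that is the one-line change the comment describes.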

I thought this might just be an oversight from #1309, but I'm not actually clear that what my change is doing is correct or desirable.

Regardless, there seems to be a bug in that the script can't work as-is with BF16 safetensors models.

This was as of commit 63d2046 (tip of master at time of writing)


akx commented May 26, 2023

A merged Metharme-7B seems to work fine. I made a (trivial) PR: #1598


FNsi commented May 28, 2023

"bf16 which is only available on Ampere and later, I would expect some performance degradation if running it in fp16 instead"

😅

shimaowo (Author) commented

This wasn't really intended to be a philosophical discussion of whether bf16 is appropriate. It's more "hey, a PR was merged here that added support for a thing, but it turns out that thing doesn't actually work without this additional change."

We also don't control the source model, and if we want CPU inference, I'm not aware of an alternative.
