convert.py doesn't support BF16 safetensors... #1473
Comments
A merged Metharme-7B seems to work fine. I made a (trivial) PR: #1598
"bf16 which is only available on Ampere and later, I would expect some performance degradation if running it in fp16 instead" 😅 |
This wasn't really intended to be a philosophical discussion of whether bf16 is appropriate, but more like "hey, a PR was made and merged here that added support for a thing, but it turns out that thing doesn't actually work without this additional change." We also don't control the source model, and if we want CPU inference, I'm not really aware of an alternative.
...but this seems more like an oversight. I haven't opened a PR because I don't understand enough of what is going on yet.
When loading a safetensors model in BF16 format (like Metharme-7B after merging), convert.py throws a KeyError, because the dictionary it uses to map safetensors dtype strings to its data types has no entry for 'BF16'.
Simply adding this key and having it map to DT_BF16 made the script work, but I haven't validated the output models yet.
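To make the one-line nature of the change concrete, here is a self-contained sketch of the idea. The identifiers are stand-ins, not the actual names in convert.py:

```python
# Stand-ins for convert.py's data-type constants; the real script has its own
# DT_* objects, these just make the sketch runnable on its own.
from enum import Enum, auto

class DataType(Enum):
    DT_F32 = auto()
    DT_F16 = auto()
    DT_BF16 = auto()
    DT_I32 = auto()

# The safetensors header stores dtypes as strings such as "F16" or "BF16".
# Without a "BF16" entry in the lookup table, a BF16 model fails with
# KeyError('BF16'); adding the one entry below is the whole workaround.
SAFETENSORS_DATA_TYPES = {
    "F32":  DataType.DT_F32,
    "F16":  DataType.DT_F16,
    "BF16": DataType.DT_BF16,   # <-- the added entry
    "I32":  DataType.DT_I32,
}

dtype_str = "BF16"                              # as read from the model's header
data_type = SAFETENSORS_DATA_TYPES[dtype_str]   # no longer raises KeyError
```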
I thought this might just be an oversight from #1309, but I'm not actually clear that what my change is doing is correct or desirable.
Regardless, there seems to be a bug in that the script can't work as-is with BF16 safetensors models.
This was as of commit 63d2046 (tip of master at the time of writing).
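For background on what the converter is being asked to handle: a BF16 safetensors tensor is just raw 16-bit values plus a "BF16" dtype string in the file's JSON header, and bfloat16 is the upper half of a float32, so widening it on the CPU is a bit shift. A minimal sketch with plain numpy (illustrative only, not convert.py code; the path and tensor name are whatever your model uses):

```python
import json
import struct
import numpy as np

def load_bf16_tensor(path: str, name: str) -> np.ndarray:
    """Read one BF16 tensor from a .safetensors file and return it as float32."""
    with open(path, "rb") as f:
        # safetensors layout: 8-byte little-endian header length, JSON header, raw byte buffer.
        header_len = struct.unpack("<Q", f.read(8))[0]
        header = json.loads(f.read(header_len))
        info = header[name]  # e.g. {"dtype": "BF16", "shape": [...], "data_offsets": [a, b]}
        assert info["dtype"] == "BF16"
        start, end = info["data_offsets"]  # offsets are relative to the start of the byte buffer
        f.seek(8 + header_len + start)
        raw = np.frombuffer(f.read(end - start), dtype=np.uint16)
    # bfloat16 is the top 16 bits of a float32, so shifting left by 16 and
    # reinterpreting the bits yields the exact float32 value.
    return (raw.astype(np.uint32) << 16).view(np.float32).reshape(info["shape"])
```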