use last path component for unnamed ggufs #1281
Usually, a ggml-model-XXX.gguf file will reside in the directory named after the model, e.g. if the user generated the quant themselves and didn't move it. Instead of displaying models as ggml-model-xxx.gguf, this picks the last directory component and tacks the quant info on it.
Hmm... I'm concerned about cases where the directory name is not the desired model name. For example, people might simply download https://huggingface.co/nmerkle/Meta-Llama-3-8B-Instruct-ggml-model-Q4_K_M.gguf/tree/main to It's also very possible that the model path doesn't contain the name directly. For example or simply be equally unhelpful
This approach also potentially exposes the directory structure of the host in unwanted ways.
Good point. Would it be acceptable if it was restricted to the case where there were .safetensors files in the same directory as the gguf file, perhaps? Added that code and updated OP.
I think it's probably still not a good idea. I have copied the files into random directories on my filesystem when testing, so it would definitely break for me too. In any case, this is already configurable as the horde model display name, and it would be best for users to set that themselves. If that's done, it will override the display name, and if an API key is not set the worker won't start. So that can be used to rename displayed model names at will.
Well, for this to take effect you would have to:
I see your concern, but it seems quite unlikely that this would happen very frequently. What if there was a flag that enabled this behavior?
This is mostly for the use case where you are jumping between models, and/or when you are quanting things yourself (I often download the HF model, then quant it myself without moving the resulting quant out of the dir). This is particularly tiresome for model makers, who might be quanting and testing models throughout training, and although e.g. Silly Tavern tracks the model name used, all you get to see is I can see if such a niche use case would be low priority, though. It would be cool if other model makers chimed in on this one.
Closing for now due to lack of interest. It may be possible to get an upstream PR in to allow GGUF names to adapt path component, which would be more or less equivalent without the downsides.
Usually, a ggml-model-XXX.gguf file will reside in the directory named after the model, e.g. if the user generated the quant themselves and didn't move it. Instead of displaying models as koboldcpp/ggml-model-xxx, this picks the last directory component and tacks the quant info on it, but only if there is at least one file with the .safetensors extension in the same directory.
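For readers who want to see the rule concretely, here is a minimal Python sketch of the behavior described above. It is not the code from this PR: the function name `friendly_model_name`, the `ggml-model-<quant>` pattern, and the `<directory>-<quant>` output format are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumed pattern for "unnamed" files: ggml-model-<quant>, e.g. ggml-model-Q4_K_M.
GENERIC_STEM = re.compile(r"^ggml-model-(?P<quant>.+)$", re.IGNORECASE)


def friendly_model_name(gguf_path: str) -> str:
    """Return a display name for a GGUF file (hypothetical helper).

    Falls back to the plain file stem unless the file has a generic
    ggml-model-* name AND its directory holds at least one .safetensors file.
    """
    path = Path(gguf_path)
    stem = path.stem
    match = GENERIC_STEM.match(stem)
    if match is None:
        # The file already has a meaningful name; leave it alone.
        return stem

    parent = path.parent
    if not parent.name or not parent.is_dir():
        # No usable directory component (e.g. a bare relative filename).
        return stem

    # Only trust the directory name if it looks like a model checkout,
    # i.e. it contains at least one .safetensors file.
    has_safetensors = any(p.suffix == ".safetensors" for p in parent.iterdir())
    if not has_safetensors:
        return stem

    # Assumed output format: "<last directory component>-<quant info>".
    return f"{parent.name}-{match.group('quant')}"
```

With a directory named `MyModel` that holds both `model.safetensors` and `ggml-model-Q4_K_M.gguf`, this sketch yields `MyModel-Q4_K_M`. The same GGUF copied into a directory without any .safetensors file keeps its original stem, which matches the restriction added in response to the review comments.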