use last path component for unnamed ggufs #1281


@kallewoof kallewoof commented Dec 24, 2024

Usually, a ggml-model-XXX.gguf file will reside in the directory named after the model, e.g. if the user generated the quant themselves and didn't move it. Instead of displaying models as koboldcpp/ggml-model-xxx, this picks the last directory component and tacks the quant info on it, but only if there is at least one file with the .safetensors extension in the same directory.
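The rule described above could be sketched roughly as follows. This is an illustration only, not the code from this PR: the function name `display_name`, the `GENERIC` pattern, and the choice to join the directory name and quant tag with a hyphen are all assumptions.

```python
import re
from pathlib import Path

# Assumption: a file counts as "unnamed" when its stem starts with
# "ggml-model"; whatever follows is treated as the quant tag.
GENERIC = re.compile(r"^ggml-model[-_.]?(?P<quant>.*)$", re.IGNORECASE)


def display_name(model_path: str) -> str:
    """Hypothetical helper: derive a display name for a GGUF file."""
    path = Path(model_path)
    stem = path.stem  # file name without the trailing ".gguf"
    match = GENERIC.match(stem)
    if match is None:
        # The file already has a descriptive name, so keep it.
        return stem

    parent = path.parent
    # Only trust the directory name if it looks like a model directory,
    # i.e. it holds at least one .safetensors file next to the gguf.
    has_safetensors = parent.is_dir() and any(parent.glob("*.safetensors"))
    if not has_safetensors or not parent.name:
        return stem

    quant = match.group("quant")
    return f"{parent.name}-{quant}" if quant else parent.name
```

Under these assumptions, a `ggml-model-Q4_K_M.gguf` sitting next to `.safetensors` files in a `testx-checkpoint1234` directory would be shown as `testx-checkpoint1234-Q4_K_M`, while the same file in a directory without `.safetensors` files would keep its generic name.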

@LostRuins
Owner

Hmm... I'm concerned about cases where the directory name is not the desired model name.

For example, people might simply download https://huggingface.co/nmerkle/Meta-Llama-3-8B-Instruct-ggml-model-Q4_K_M.gguf/tree/main to C:\Users\Bob\Desktop\ggml-model-Q4_K_M.gguf, and then the model name would be ... "Desktop".

It's also very possible that the directory holding the file isn't named after the model. For example: C:\Users\Bob\Desktop\Mistral-Airoboros-7B\ggufquant\ggml-model-Q4_K_M.gguf

Or the directory name could simply be equally unhelpful:

C:\Users\Bob\Desktop\model\ggml-model-Q4_K_M.gguf

This approach also potentially exposes the directory structure of the host in unwanted ways.
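To make the failure cases above concrete, here is a small sketch that takes the last directory component of each example path. It uses `pathlib.PureWindowsPath` so it runs on any OS; the helper name `last_dir_component` is made up for illustration and is not part of koboldcpp.

```python
from pathlib import PureWindowsPath

# The three example paths from the comment above.
paths = [
    r"C:\Users\Bob\Desktop\ggml-model-Q4_K_M.gguf",
    r"C:\Users\Bob\Desktop\Mistral-Airoboros-7B\ggufquant\ggml-model-Q4_K_M.gguf",
    r"C:\Users\Bob\Desktop\model\ggml-model-Q4_K_M.gguf",
]


def last_dir_component(model_path: str) -> str:
    """Hypothetical helper: name of the directory that directly holds the file."""
    return PureWindowsPath(model_path).parent.name


for p in paths:
    print(last_dir_component(p))
# Prints:
# Desktop
# ggufquant
# model
```

None of the three results identifies the model, and each one reveals a directory name from the host's filesystem.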

@kallewoof
Author

kallewoof commented Dec 25, 2024

Good point. Would it be acceptable if it was restricted to the case where there were .safetensors files in the same directory as the gguf file, perhaps?

Added that code and updated OP.

@LostRuins
Owner

I think probably still not a good idea. I myself have copied the files into random directories on my filepath when testing so it would definitely break for me too.

Anyway, this is already configurable as the horde model display name, and it would be best for users to set that themselves. If it's set, it overrides the display name, and if an API key is not set the worker won't start. So it can be used to rename the displayed model name at will.


@kallewoof
Author

kallewoof commented Dec 25, 2024

> I think probably still not a good idea. I myself have copied the files into random directories on my filepath when testing so it would definitely break for me too.

Well, for this to take effect you would have to:

  1. Copy the ggml-model-xxx.gguf file, as is, to some random directory unrelated to the model name (now you have no idea what model it is anymore).
  2. Also copy .safetensors files into the same directory, again, without this directory being related to the model in question (i.e. you actively moved them from e.g. a huggingface transformers model directory into this random dir)
  3. Serve this unnamed unknown model to the public.

I see your concern, but it seems quite unlikely that this would happen very frequently.

What if there was a flag that enabled this behavior?

> Anyway, this is already configurable as the horde model display name, and it would be best for users to set that themselves. If it's set, it overrides the display name, and if an API key is not set the worker won't start. So it can be used to rename the displayed model name at will.

This is mostly for the use case where you are jumping between models, and/or quanting things yourself (I often download the HF model, then quant it myself without moving the resulting quant out of the dir). This is particularly tiresome for model makers, who might be quanting and testing models throughout training. Although e.g. SillyTavern tracks the model name used, all you get to see is ggml-model for all of them, unless you specifically rename the file each time. So I might make a testx-checkpoint1234 dir with a model, quant it, boot it up and test it, and then later on I have no idea what model this was.

I can see how such a niche use case would be low priority, though. It would be cool if other model makers chimed in on this one.

@LostRuins LostRuins added the KIV for now Some issues prevent this from being merged label Dec 28, 2024
@kallewoof
Author

kallewoof commented Dec 28, 2024

Screenshot 2024-12-28 at 16 13 48

This is llama.cpp's llama-server, by the way. It prints the model name exactly as you entered it in the command line. Here I did

$ ./build/bin/llama-server -c 16384 -ngl 90 -m ../llm/Qwen2.5-32B-Instruct-Q8_0.gguf --host 0.0.0.0

A better example (which uses ggml-model) is this, freshly quanted from a model dir:
Screenshot 2024-12-28 at 16 42 47

@kallewoof
Author

Closing for now due to lack of interest. It may be possible to get an upstream PR in that lets GGUF names adopt a path component, which would be more or less equivalent without the downsides.

@kallewoof kallewoof closed this Jan 7, 2025