-
Notifications
You must be signed in to change notification settings - Fork 10.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
convert-*.py: GGUF Naming Convention Refactor and Metadata Override Refactor #7499
Conversation
65dfe23
to
53096ef
Compare
438e88d
to
3189f15
Compare
gguf-py/gguf/utility.py
Outdated
expert_count_chunk = f"{expert_count_int}x" if expert_count_int is not None else "" | ||
parameters = model_parameter_count_rounded_notation(model_params_count) | ||
encodingScheme = encodingScheme.upper() | ||
return f"{name}{version}-{expert_count_chunk}{parameters}-{encodingScheme}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this missing a .gguf
? Unless it's intended to be added at call sites?
77e928c
to
9626f4f
Compare
Setting to ready to review as it looks workable now. Was hoping to also include But if I understand most people are mostly only using |
7bc6948
to
16958a9
Compare
rebased to get on top of the ci fixes in main branch for cleaner CI checks |
@compilade thanks for the review so far. I've had a bit more extra thought and think this is getting closer to the ideal. Btw... I'm having a bit of a second thought about the position of version string... should it be Also what about |
#7430 merged, so will need to refactor |
550a97b
to
38733e4
Compare
38733e4
to
a77eb3d
Compare
@compilade rebased as convert.py is now a legacy script. Also after studying the naming scheme I think mistral had the right idea in their naming arrangement so this is the new naming arrangement I would like to propose.
If this PR passes I'll make the change to https://github.com/ggerganov/ggml/blob/master/docs/gguf.md |
@compilade
Seems a bit convoluted to output json. Anyway I'll just do it as a seperate PR as I personally classify it as a nice to have but slightly out of scope to kv store refactoring.
That makes sense. Well it's partially also because I don't want to spend too much brainpower on it. Anyway I would prefer regex and I hope the new test cases I added will make it's graceful failure behavior more robust (Aka output no breakdown of sub components).
Yeah I see now that it's not super intuitive. Well at least it sounds like it's not a blocker here.
I actually agree with you on that, we could have a different form specifically for vocab out
I agree we want to avoid long term issue hence the refactor, but I'm also a bit annoyed that this PR is getting to the point that I'm constantly fixing conflicts with changes to main branch. So really would like to feature freeze and make sure we get just the most important stuff in (like stuff that you would regret if not adjusted as mentioned above) and merged in as soon as possible. |
It might help with the convert_lora_to_gguf.py script if default values were added here Co-authored-by: compilade <git@compilade.net>
@mofosyne I agree this can be annoying. Something which I've found useful with my long-lived PRs is to use |
@compilade thanks for the tip regarding using merge over rebase. I have also addressed your other point about needing to add in the vocab only default filename. Hopefully if it all looks good, don't forget to press the approve button, so we can get this merged in. Also happy for you to directly add your changes in if you spot anything you want to correct as well. |
@compilade btw while you have a look though, just wondering what you are still looking for? I think I've at least addressed all the major pain points now, but if there is anything I've overlooked do let me know. I've got all the points addressed as far as I know but this PR ticket is still locked because it's marked as 'compilade requested changes'. |
Just so you know, if you believe I'm not responsive and my request for changes has been addressed, you can always "dismiss" a review (from the reviewer list summary at the bottom), but I think the reason given is not editable afterwards. |
Had a look again at all the review requested marked code, looked to all be resolved already
I think in this case it's unavoidable as microsoft is definitely going against convention in the community. So wontfix in this PR at least.
Easy to sort out in future PR if needed
Hmmm... tempted to keep PR open if it's going to uncover an annoying issue later. But again probbly fixable in a later PR
Yeah might be worth testing. Bit busy, but you can slide some changes would be appreciated.
Undecided, but can fix later
True true, let's keep in mind for later PR. Will merge in a day or two if no objection, but be good to see if we can handle |
I adapted my test cases from https://regex101.com/r/NlzLoK/1 to more closely (but not perfectly) fit this PR. Feel free to use and adapt as you see fit for unit tests and matching regex. To clarify: I have not tried the PR. Just posting this here for inspiration.
|
Using more than one regex to annotate the parts of the name, this way, the order doesn't have to be fixed and this should work correctly for more edge cases. Also, the total parameter count of the model is used to figure out if a size label is not actually a size label, but a context size. * convert_lora : fix duplicate model type key
* gguf-py : output the split plan on stdout when using dry_run * convert_hf : unify vocab naming convention with the standard one This also adds a way to name LoRA models.
@mofosyne I've fixed most of the problem with the regex by using more than one regex, but on smaller parts of the name at a time. It was otherwise hard to allow for variations in the order of fields. I'll try to summarize what my recent commits did:
I'm starting to be mostly satisfied of the behavior of
|
Personally reviewed each commit you made. Looks sensible. Interesting that you switched to capital for size label, but not a big deal. Going to continue to merge soon as your points above still sounds like it could be a later PR anyway. Really appreciated the extra test cases. |
@mofosyne This commit caused model conversion to stop working for me. I had to revert back to b3412 to be able to convert glm-4-9b-chat. And since this change I get:
Conversion command: |
see if this PR fixes your issue |
…efactor (ggerganov#7499) Main thing is that the default output filename will take this form {name}{parameters}{finetune}{version}{encoding}{kind} In addition this add and remove some entries in the KV store and adds a metadata class with automatic heuristics capability to derive some values based on model card content * No Change: - Internal GGUF Spec - `general.architecture` - `general.quantization_version` - `general.alignment` - `general.file_type` - General Model Details - `general.name` - `general.author` - `general.version` - `general.description` - Licensing details - `general.license` - Typically represents the converted GGUF repo (Unless made from scratch) - `general.url` - Model Source during conversion - `general.source.url` * Removed: - Model Source during conversion - `general.source.huggingface.repository` * Added: - General Model Details - `general.organization` - `general.finetune` - `general.basename` - `general.quantized_by` - `general.size_label` - Licensing details - `general.license.name` - `general.license.link` - Typically represents the converted GGUF repo (Unless made from scratch) - `general.doi` - `general.uuid` - `general.repo_url` - Model Source during conversion - `general.source.doi` - `general.source.uuid` - `general.source.repo_url` - Base Model Source - `general.base_model.count` - `general.base_model.{id}.name` - `general.base_model.{id}.author` - `general.base_model.{id}.version` - `general.base_model.{id}.organization` - `general.base_model.{id}.url` (Model Website/Paper) - `general.base_model.{id}.doi` - `general.base_model.{id}.uuid` - `general.base_model.{id}.repo_url` (Model Source Repository (git/svn/etc...)) - Array based KV stores - `general.tags` - `general.languages` - `general.datasets` --------- Co-authored-by: compilade <git@compilade.net> Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
In the interest of encouraging users to stick to GGUF Naming Convention I'm trying to move over the changes I made to convert.py to
convert-hf-to-gguf.py
for the default file name generation.I would like to also port over the metadata override to the other converter script as well.
This is a work in progress but is made a draft PR in the interest of potential feedback.
E.g. I haven't got an idea yet on how to calculate the tensor parameter count for convert-llama-ggml-to-gguf.py . Also gotta figure out a smooth way for all of these script to share the same metadata override settings.