# Creating a GGUF model

1. Run `download_snapshot.py` for your desired model (see the sketch after this list).
2. Clone `llama.cpp`. Ideally use the latest version, but if there is no `convert_hf_to_gguf.py` file, you can run `git checkout 19d8762`.
3. `pip install -r llama.cpp/requirements.txt`
4. Check that your model has a `tokenizer.model` file. If not, you'll need to get it from the base model. E.g. for Phi-3-mini-4k-instruct-graph there was no such file, so I downloaded it from the original Phi-3-mini repo and put it in the downloaded model's directory (also covered in the sketch below). **Please note:** if the tokenizer/vocab was modified between the base model and your desired fine-tuned model, this approach will likely cause issues.
5. Run:

```sh
python llama.cpp/convert_hf_to_gguf.py create-gguf/Phi-3-mini-4k-instruct-graph \
  --outfile create-gguf/Phi-3-mini-4k-instruct-graph.Q8_0.gguf \
  --outtype q8_0
```
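
For reference, steps 1 and 4 amount to roughly the following `huggingface_hub` calls. This is a sketch only: the repo IDs and local paths are illustrative assumptions, and `download_snapshot.py` in this repo may differ.

```python
# Sketch: fetch a model snapshot and, if needed, the base model's
# tokenizer.model via huggingface_hub. Repo IDs and paths below are
# examples, not fixed by this repo.
from huggingface_hub import snapshot_download, hf_hub_download

# Step 1: download the fine-tuned model into create-gguf/
snapshot_download(
    repo_id="EmergentMethods/Phi-3-mini-4k-instruct-graph",  # example repo
    local_dir="create-gguf/Phi-3-mini-4k-instruct-graph",
)

# Step 4: if tokenizer.model is missing, pull it from the base model's repo
# (assumes the base repo actually ships a tokenizer.model file)
hf_hub_download(
    repo_id="microsoft/Phi-3-mini-4k-instruct",  # example base repo
    filename="tokenizer.model",
    local_dir="create-gguf/Phi-3-mini-4k-instruct-graph",
)
```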

## Adding a model to Hugging Face

To add the converted model to a Hugging Face model repo, follow these steps:
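
For example, a minimal upload sketch using `huggingface_hub` (the repo ID is a placeholder for your own repo, and this assumes you've already authenticated with `huggingface-cli login`):

```python
# Sketch: upload the converted GGUF file to an existing Hugging Face
# model repo. The repo ID below is a placeholder, not a real target.
from huggingface_hub import HfApi

api = HfApi()  # picks up the token saved by `huggingface-cli login`
api.upload_file(
    path_or_fileobj="create-gguf/Phi-3-mini-4k-instruct-graph.Q8_0.gguf",
    path_in_repo="Phi-3-mini-4k-instruct-graph.Q8_0.gguf",
    repo_id="your-username/Phi-3-mini-4k-instruct-graph",  # placeholder
)
```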

> [!TIP]
> To clear the Hugging Face cache:
>
> ```sh
> pip install -U "huggingface_hub[cli]"
> huggingface-cli delete-cache
> ```