[Feature Request]: Need LoRA model in `.gguf` format #243

bioinformatist · 2024-09-20T15:21:29Z

Feature request

Similar to #231, but useful

Hey my dear bros, we're building a RAG application (specifically for one of our products) using MiniCPM3. Below is our stack:

| Type | Component |
| --- | --- |
| LLM | MiniCPM3 |
| Web server | Shuttle \| Axum |
| OpenAI-compatible API server | llama.cpp |
| Vector database | qdrant |

It's almost done.

As MiniCPM3 comes with a RAG suite, we'd like to use its LoRA adapter for better performance, like this:

# Suppose we already have downloaded MiniCPM3-4B and MiniCPM3-RAG-LoRA-GGUF models in current directory
docker run --rm -it -p 8080:8080 -v $PWD/MiniCPM3-4B-GGUF:/models -v $PWD/MiniCPM3-RAG-LoRA-GGUF:/lora --gpus all ghcr.io/ggerganov/llama.cpp:server-cuda -m models/minicpm3-4b-q4_k_m.gguf --host 0.0.0.0 --port 8080 --n-gpu-layers 99 -v -ub 1024 -b 4096 --lora lora/lora-adapter-fp16.gguf

However, the LoRA model cannot currently be converted to `.gguf` format, because ggerganov/llama.cpp#9396 hasn't been merged yet:

# Same assumption as above
docker run -it --rm --entrypoint /app/convert_lora_to_gguf.py -v $PWD/MiniCPM3-4B:/models -v $PWD/MiniCPM3-RAG-LoRA:/lora ghcr.io/ggerganov/llama.cpp:full --outtype q8_0 --base /models /lora

It printed:

The repository for /models contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//models.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Alternatively, could you give us some tips for converting it ourselves? Thanks a lot!

MiniCPM3 is, de facto, an ideal edge-side LLM for small companies.

LDLINGLINGLING · 2024-09-22T13:19:59Z

Hello, I think the best solution at present is to merge the LoRA weights into the original MiniCPM3 weights first, and then run your conversion process on the merged model.
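
For reference, the merge suggested above could look roughly like the sketch below. It is not an official recipe: it assumes the MiniCPM3-RAG-LoRA directory is a PEFT-format adapter (with an `adapter_config.json`), that `torch`, `transformers`, and `peft` are installed, and the function name and output directory are made up for illustration. `trust_remote_code=True` is passed because the base model ships custom modeling code, which is what the prompt in the error output is asking about.

```python
from pathlib import Path


def merge_lora_into_base(base_dir: str, lora_dir: str, out_dir: str) -> Path:
    """Fold a PEFT LoRA adapter into its base model and save a plain HF checkpoint.

    Hypothetical helper for illustration; the directory names are assumptions.
    """
    # Heavy dependencies are imported lazily so this module imports without them.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # The base model uses custom code, so remote code must be trusted explicitly.
    base = AutoModelForCausalLM.from_pretrained(
        base_dir, torch_dtype=torch.float16, trust_remote_code=True
    )

    # Attach the adapter, then merge the low-rank update into the base weights
    # and drop the PEFT wrappers so a regular model object remains.
    merged = PeftModel.from_pretrained(base, lora_dir).merge_and_unload()

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    merged.save_pretrained(out, safe_serialization=True)

    # Save the tokenizer alongside the weights so the checkpoint is complete.
    AutoTokenizer.from_pretrained(base_dir, trust_remote_code=True).save_pretrained(out)
    return out


# Example call (assumed local directory names):
# merge_lora_into_base("MiniCPM3-4B", "MiniCPM3-RAG-LoRA", "MiniCPM3-4B-RAG-merged")
```

The merged directory can then be converted and quantized like any ordinary MiniCPM3 checkpoint, and the `--lora` flag is no longer needed when starting the server.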

bioinformatist · 2024-09-23T03:15:13Z

Got it. Let me give it a try.

bioinformatist added the feature New features label Sep 20, 2024
