docs : add Moondream2 pre-quantized link #13745
Conversation
@ngxson for visibility. It might be good to move the model GGUFs from a private repo to the official ggml-org repo.
Can you also share the steps and commands you used to generate the mmproj GGUF? It would be nice if we could add llava support to convert_hf_to_gguf, but I don't have time yet. A guide specifically for moondream could be a temporary solution.
Hello @ngxson, I didn't create the mmproj. The author updated the files on Hugging Face a few days ago. However, the text model didn't have a chat template in it, so I just edited the GGUF to add that field. There is a create_gguf.py script in one of the branches of the moondream repo; I expect the mmproj came from there: https://github.com/vikhyat/moondream/blob/moondream-ggml/create_gguf.py
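For reference, a rough sketch of how an edit like that can be done with the gguf-py tooling that ships with llama.cpp. File names are hypothetical, and the script's options here are from memory, so verify with `--help` first:

```python
# Rough sketch of the chat-template edit described above. File names are
# hypothetical, and the script options are from memory -- check
# `python gguf-py/scripts/gguf_new_metadata.py --help` before relying on this.
# gguf_new_metadata.py (shipped with llama.cpp's gguf-py) copies a GGUF while
# adding or overriding metadata such as tokenizer.chat_template.
import subprocess

subprocess.run(
    [
        "python", "gguf-py/scripts/gguf_new_metadata.py",
        "moondream2-text-model-f16.gguf",       # original, missing the template
        "moondream2-text-model-f16-chat.gguf",  # output with the template added
        "--chat-template-config", "tokenizer_config.json",
    ],
    check=True,
)
```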
I saw this model on /r/locallama the other day and the benchmarks looked impressive, so I ran it through a few tests with Gemini 2.5 as judge: https://gist.github.com/kth8/195bfe61e8c3b2ef8cce4bf263808e2d
Hello, is it possible to use it with the detect or point methods in llama.cpp?
This is cool, mind sharing the prompt you used for this? Also just to clarify, the .gguf files in the HF repository are a year old. We've made a bunch of architectural changes since then, so it's no longer possible to run the latest versions (the ones that include the detect, point, etc. capabilities) using llama.cpp.
For the test model, just a simple prompt; the prompt for Gemini was:

prompt = "You are a sophisticated, advanced multimodal language model. Your primary function in this task is to act as an expert evaluator. You will be provided with an image and corresponding description of that image generated by a smaller, potentially less capable, vision-language model {model}. Your task is to first generate an expert analysis of the provided image, then conduct a thorough and critical analysis of the smaller {model} VLM's generated description of the same image for inaccuracies and hallucinations. Tally up all the inaccuracies at the end and provide an overall conclusion.".format(model=test_model)
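The judging harness was roughly this shape (a sketch only, not the exact gist code; the Gemini model id, API key handling, and file paths are assumptions):

```python
# Sketch of the judging flow described above -- not the exact gist code.
# The Gemini model id, API key handling, and file paths are assumptions.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")          # placeholder
judge = genai.GenerativeModel("gemini-2.5-pro")  # assumed model id

test_model = "moondream2"
prompt = "...".format(model=test_model)  # the evaluator prompt shown above

image = Image.open("test.jpg")                     # image given to both models
description = open("moondream_output.txt").read()  # the small VLM's description

# generate_content accepts a mixed list of text parts and PIL images
response = judge.generate_content(
    [prompt, image, "Description to evaluate:\n" + description]
)
print(response.text)
```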
The Moondream2 GGUF at https://huggingface.co/vikhyatk/moondream2 has been updated to the latest version, and it works with llama.cpp. However, the model vikhyatk published does not include a default chat template. The version at https://huggingface.co/Hahasb/moondream2-20250414-GGUF has been updated with tokenizer.chat_template set to the vicuna template, which seems to work OK, though I'm not sure it's the optimal setup.
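For anyone who wants to try the pair of GGUFs from Python, a rough sketch using the llama-cpp-python bindings (not part of this PR; file names and paths are placeholders):

```python
# Illustrative only: running the text + mmproj GGUF pair through the
# llama-cpp-python bindings, which include a Moondream chat handler.
# File names and paths are placeholders.
from llama_cpp import Llama
from llama_cpp.llama_chat_format import MoondreamChatHandler

chat_handler = MoondreamChatHandler(clip_model_path="mmproj-moondream2-f16.gguf")
llm = Llama(
    model_path="moondream2-text-model-f16.gguf",
    chat_handler=chat_handler,
    n_ctx=2048,  # leave room for the image embeddings plus the reply
)

response = llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "file:///path/to/image.jpg"}},
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ]
)
print(response["choices"][0]["message"]["content"])
```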
Fixes #13332
Fixes vikhyat/moondream#96
Moondream2 is a crazy good model for its tiny size. After this is merged, I'll start experimenting with quantizations, but even the fp16 version is small (less than 3 GB for the text model, less than 1 GB for the mmproj).
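For reference, a sketch of the usual llama.cpp quantization step (the binary location and file names are assumptions; the mmproj is normally left at f16):

```python
# Sketch of the usual llama.cpp quantization step (binary and file paths are
# assumptions): quantize the text model to Q4_K_M; the mmproj stays at f16.
import subprocess

subprocess.run(
    [
        "./build/bin/llama-quantize",
        "moondream2-text-model-f16.gguf",
        "moondream2-text-model-Q4_K_M.gguf",
        "Q4_K_M",
    ],
    check=True,
)
```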