Convert to Gguf format to work with Llama.cpp? #32

Closed
chigkim opened this issue Jan 12, 2024 · 10 comments

Comments

@chigkim

chigkim commented Jan 12, 2024

Llava has various quantized models in gguf format, so it can be used with Llama.cpp.
ggerganov/llama.cpp#3436
Is this possible?

@czczup
Member

czczup commented Jan 16, 2024

Hi, thank you for your suggestion. I will add compatibility with community tools to my to-do list.

@leeaction

The GGUF format is good for Ollama users. Any update?

@GHOST1834

It would be nice to have this model available in GGUF format for Ollama.

@nischalj10

Any updates on this? The 4B Intern model is killer for its size! Would love to see it supported in llama.cpp.

@KOG-Nisse

Would love internvl-chat-v1-5 in a gguf format!
https://internvl.opengvlab.com/

@thomas-rooty

I second this

@ghost

ghost commented Jul 11, 2024

@ErfeiCui why did you close this as completed?

@chigkim
Author

chigkim commented Aug 11, 2024

Any update on this? InternVL2-Llama3-76B on Ollama/llama.cpp would be amazing!

@kim-gtek

If someone gives me a tutorial, I will write my own code to transform this from PyTorch to GGUF for llama.cpp myself.

@chigkim
Author

chigkim commented Aug 12, 2024

It's more involved than that. You have to implement the model architecture and the image preprocessing logic in llama.cpp, which is written in C++.
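
To illustrate why a conversion script alone is not enough: a GGUF file is essentially a container of metadata and tensors that begins with a small fixed header. The sketch below packs and parses that header, assuming the little-endian layout used by GGUF version 2 and later (4-byte magic, 32-bit version, 64-bit tensor count, 64-bit metadata key/value count). The function names are hypothetical and are not part of llama.cpp or its Python tooling.

```python
import struct

# Assumed header layout (little-endian, GGUF v2+):
#   4-byte magic "GGUF", uint32 version, uint64 tensor count, uint64 metadata KV count
GGUF_MAGIC = b"GGUF"
HEADER = struct.Struct("<4sIQQ")


def pack_header(version, tensor_count, kv_count):
    """Build the fixed-size header bytes (hypothetical helper for illustration)."""
    return HEADER.pack(GGUF_MAGIC, version, tensor_count, kv_count)


def parse_header(data):
    """Parse the fixed-size header from the first bytes of a file."""
    if len(data) < HEADER.size:
        raise ValueError("data is too short to contain a GGUF header")
    magic, version, tensor_count, kv_count = HEADER.unpack_from(data)
    if magic != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


if __name__ == "__main__":
    # Round-trip an in-memory header instead of reading a real model file.
    print(parse_header(pack_header(3, 2, 5)))
```

Writing the weights into this container is the easy part. The file only names an architecture and stores tensors; llama.cpp still needs C++ code that knows how to build the compute graph for that architecture and how to preprocess images for the vision encoder.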


9 participants