Request for llama.cpp Integration to Support Low-End Hardware Compatibility
Description
I'm currently trying to integrate llama.cpp with Meditron for running models on lower-end hardware. Meditron is based on Llama, so in theory, this should be possible. However, I'm encountering issues when attempting to convert the Meditron model using llama.cpp.
Steps to Reproduce
Either run:
python3 convert-hf-to-gguf.py ../meditron-7b/
Or launch directly with llama.cpp using:
./build/bin/main --rope-freq-scale 8.0 -m ../meditron-7b/pytorch_model-00008-of-00008.bin -p "I have pain in my leg from toes to hip"
Output:
Log start
...
error loading model: llama_model_loader: failed to load model from ../meditron-7b/pytorch_model-00008-of-00008.bin
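Note that the direct launch probably fails regardless of the conversion issue: as far as I understand, main only loads GGUF files, and pytorch_model-00008-of-00008.bin is a single shard of a sharded PyTorch checkpoint, so the loader error above is expected. The sketch below is the invocation I would expect to work once conversion succeeds; the output filename is an assumption on my part, not a file that exists in my setup yet.
# Sketch, assuming conversion to GGUF has already succeeded and produced a
# file like ggml-model-f16.gguf (assumed default name) in the model directory.
./build/bin/main -m ../meditron-7b/ggml-model-f16.gguf -p "I have pain in my leg from toes to hip"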
Expected Behavior
Successful integration of llama.cpp with Meditron, allowing the model to run on lower-end hardware.
Actual Behavior
Converting the model raises a NotImplementedError for the architecture "LlamaForCausalLM", and launching directly with llama.cpp fails with the model-loading error shown above.
Possible Solution
Adjustments in llama.cpp to support the "LlamaForCausalLM" architecture used by Meditron. This could involve modifying the model conversion script or the model loading mechanism in llama.cpp.
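One avenue worth checking (unverified on my side): convert-hf-to-gguf.py may intentionally not cover Llama-family checkpoints, with the separate convert.py script handling "LlamaForCausalLM" instead. If so, something along these lines might sidestep the NotImplementedError; the default output filename and the q4_0 quantization choice are assumptions on my part:
# Unverified sketch: convert the HF checkpoint with convert.py instead,
# then quantize the result so the model fits low-end hardware.
python3 convert.py ../meditron-7b/ --outtype f16
./build/bin/quantize ../meditron-7b/ggml-model-f16.gguf ../meditron-7b/ggml-model-q4_0.gguf q4_0
If convert.py fails as well, that would point at something Meditron-specific in the checkpoint rather than a missing architecture mapping.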
Additional Context
Link to llama.cpp
Request
I kindly request the team to consider adding support for llama.cpp integration with Meditron, or to give advice on how to implement it. This would be a significant enhancement, enabling the use of Meditron models on more diverse hardware setups, especially those at the lower end.