
Unable to load models with LlamaCpp #686

Open
aniketmaurya opened this issue Mar 10, 2024 · 6 comments

Comments

@aniketmaurya

aniketmaurya commented Mar 10, 2024

The bug
models.LlamaCpp fails to load a GGUF model: constructing the model raises a TypeError from llama_cpp.llama_batch_init (full traceback below).

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/teamspace/studios/this_studio/eval.ipynb Cell 3 line 4
      1 from guidance import models, gen, select
      3 path = "mistral-7b-instruct-v0.2.Q8_0.gguf"
----> 4 llm = models.LlamaCpp(model=path, n_ctx=4096, n_gpu_layers=-1)
      6 from llama_cpp import Llama
      7 llm = Llama(
      8     model_path=path,
      9     n_gpu_layers=-1,
     10 )

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/guidance/models/llama_cpp/_llama_cpp.py:74, in LlamaCpp.__init__(self, model, tokenizer, echo, compute_log_probs, caching, temperature, **kwargs)
     71 else:
     72     raise TypeError("model must be None, a file path string, or a llama_cpp.Llama object.")
---> 74 self._context = _LlamaBatchContext(self.model_obj.n_batch, self.model_obj.n_ctx())
     76 if tokenizer is None:
     77     tokenizer = llama_cpp.LlamaTokenizer(self.model_obj)

File /home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/guidance/models/llama_cpp/_llama_cpp.py:23, in _LlamaBatchContext.__init__(self, n_batch, n_ctx)
     21 def __init__(self, n_batch, n_ctx):
     22     self._llama_batch_free = llama_cpp.llama_batch_free
---> 23     self.batch = llama_cpp.llama_batch_init(n_tokens=n_batch, embd=0, n_seq_max=n_ctx)
     24     if self.batch is None:
     25         raise Exception("call to llama_cpp.llama_batch_init returned NULL.")

TypeError: this function takes at least 3 arguments (0 given)
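
The failing call is llama_cpp.llama_batch_init(n_tokens=n_batch, embd=0, n_seq_max=n_ctx). The "takes at least 3 arguments (0 given)" message is what ctypes reports when a foreign function receives only keyword arguments, so my guess is that recent llama-cpp-python releases expose llama_batch_init as a plain ctypes binding that no longer accepts keywords. A minimal sketch to check that hypothesis (sizes are illustrative, not taken from guidance):

import llama_cpp

# Assumption: llama_batch_init is a ctypes binding that wants positional args.
n_batch, n_seq_max = 512, 4096  # illustrative sizes

# Same call guidance makes at _llama_cpp.py:23, but positional: (n_tokens, embd, n_seq_max)
batch = llama_cpp.llama_batch_init(n_batch, 0, n_seq_max)
try:
    print("llama_batch_init succeeded:", batch is not None)
finally:
    llama_cpp.llama_batch_free(batch)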

To Reproduce
Give a full working code snippet that can be pasted into a notebook cell or python file. Make sure to include the LLM load step so we know which model you are using.

from guidance import models, gen, select

path = "mistral-7b-instruct-v0.2.Q8_0.gguf"
llm = models.LlamaCpp(model=path, n_ctx=4096, n_gpu_layers=-1)

System info (please complete the following information):

  • OS (e.g. Ubuntu, Windows 11, Mac OS, etc.):
  • Guidance Version (guidance.__version__):
@Warlord-K

I was facing the same error; install llama-cpp-python==0.2.26 and it should work!
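
In case it helps, a minimal sketch of that workaround, reusing the model path and parameters from the report above (assumes the pinned version installs cleanly in your environment):

# Workaround sketch: pin llama-cpp-python before loading the model, e.g.
#   pip install "llama-cpp-python==0.2.26"
from guidance import models

path = "mistral-7b-instruct-v0.2.Q8_0.gguf"
llm = models.LlamaCpp(model=path, n_ctx=4096, n_gpu_layers=-1)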

@alexandreteles

alexandreteles commented Mar 11, 2024

Using an older llama-cpp-python version works but limits support for some newer models. For example, you can't load stabilityai/stablelm-2-zephyr-1_6b on 0.2.26. We need the team to bump compatibility, ideally to the latest version :)

@paulbkoch
Collaborator

I think this issue is resolved with PR #665, but there hasn't been a release since then.

@alexandreteles

alexandreteles commented Mar 12, 2024

but there hasn't been a release since then.

Should we expect a release any time soon or should I simply cherry pick the change into a local fork?

EDIT: @paulbkoch maybe consider offering a nightly version on pypi that reflects the latest state of development without requiring us to pip install from github?

@michael-conrad

Any hope for a release to PyPI soon with this fix?

#692

@michael-conrad

but there hasn't been a release since then.

Should we expect a release any time soon or should I simply cherry pick the change into a local fork?

EDIT: @paulbkoch maybe consider offering a nightly version on pypi that reflects the latest state of development without requiring us to pip install from github?

Has that pull request been merged, so that installing directly from GitHub picks it up? Or is there a way to install directly from GitHub and pull in the pull request at the same time?
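
For reference, a sketch of what I'd try in the meantime (assuming the repository is guidance-ai/guidance; if PR #665 is already merged, installing from the default branch should include it):

# Hypothetical install-from-source commands, run from a shell:
#   pip install "git+https://github.com/guidance-ai/guidance.git"                     # current default branch
#   pip install "git+https://github.com/guidance-ai/guidance.git@refs/pull/665/head"  # just that PR's head, if unmerged
# Then confirm which version is active:
import guidance
print(guidance.__version__)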
