Allow modification of context length for Ollama - [Roadmap] #495
Comments
@kunchenwork this is working already, right?
Hi @enricoros. Thanks for the prompt response. To clarify, I am referring to the context length, which is set to 4096 by default for every model even though some of them can handle more. Take Mistral as an example: Mistral v0.2 can handle a 32k context, but the model configuration page only shows a 4k limit.
Hi @kunchenwork, I've researched this and here is the conclusion. This issue came up before: #309. I posted a detailed comment on the Ollama GitHub a couple of months ago, requesting that Ollama declare the context size of its models: ollama/ollama#1473 (comment). Analysis results:
As per the analysis, in this example for Mistral-7B (the one you used), Ollama does not report the context window.
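To illustrate the point, here is a minimal sketch of checking whether Ollama declares a context size for a model. It assumes a local Ollama server on the default port and the /api/show endpoint's name/parameters fields as documented at the time; the 'mistral' model name is just an example.

```typescript
// Sketch: ask Ollama's /api/show for a model's Modelfile parameters and look
// for a declared num_ctx. Assumes a local server at the default port.
async function getDeclaredContextSize(model: string): Promise<number | null> {
  const res = await fetch('http://localhost:11434/api/show', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ name: model }),
  });
  const info = await res.json();
  // 'parameters' is a plain-text list of Modelfile parameters; num_ctx only
  // appears if the model author explicitly set it, which most do not.
  const match = (info.parameters ?? '').match(/num_ctx\s+(\d+)/);
  return match ? parseInt(match[1], 10) : null; // null → UI falls back to 4096
}

getDeclaredContextSize('mistral').then((size) =>
  console.log(size ?? 'no context size declared by Ollama'));
```

When this returns null, a client has no reliable way to know the model's real window, which is why the default stays at 4096.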
Closing until Ollama fixes this. Very detailed analysis provided.
@enricoros why not let users configure the context length manually (like we do for the output tokens) until Ollama provides the information via an API? Fixing it at 4096 is worse. One of the reasons folks use local models is to economize on cost (by running larger context sizes).
Here's why: Ollama is a large, funded project with lots of contributors. If the project reported the context length of its models, every UI would just work. I could add an option to override it, but adding workarounds is not high on my priority list, so I personally cannot do it. That said, I welcome and will merge any contribution to Big-AGI that allows a manual override of the context length (it's not hard, but I have 100 bugs in line before this). My advice is to ask Ollama to correctly report the number of tokens per model, which is the right solution and scales across the ecosystem.
I totally understand that there may be other pressing issues. Would you be open to adding …
Note @arsaboo, @kunchenwork: if Ollama does not specify the num_ctx parameter (as it should) and the model list does not contain token counts (as I've requested), there is the option to change the list of models manually (no UI), as per #518.
@enricoros This works for now. Given that there won't be a TON of models, we can keep it updated manually if we have to.
Why
Context length is a key trait in utilising an LLM. If we implement this, users would be able to change the context length for models capable of handling more, making big-AGI much more powerful at handling long text.
Description
Ollama allows API calls to specify the context length (see the Ollama FAQ).
Can we add a context length setting for Ollama local models?
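For reference, here is a minimal sketch of the per-request override described in the Ollama FAQ: pass options.num_ctx with a /api/generate call. The 32768 value assumes a model (e.g. Mistral v0.2) that actually supports a 32k window; the model name and prompt are illustrative.

```typescript
// Sketch: set the context window for a single Ollama request via options.num_ctx.
async function generateWithContext(model: string, prompt: string, numCtx: number) {
  const res = await fetch('http://localhost:11434/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model,
      prompt,
      stream: false,
      options: { num_ctx: numCtx }, // context window for this request only
    }),
  });
  return (await res.json()).response;
}

generateWithContext('mistral', 'Summarize this long document...', 32768)
  .then(console.log);
```

A UI setting could simply forward the user-chosen value into this options object instead of leaving Ollama's 4096 default in place.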
Requirements