Allow modification of context length for Ollama - [Roadmap] #495
Comments
@kunchenwork this is working already, right?
Hi @enricoros. Thanks for the prompt response. To clarify, I am referring to the context length, which is set to 4096 by default for every model even though some of them can handle more. Take Mistral as an example: Mistral v0.2 can handle a 32k context, but the model configuration page only shows a 4k limit.
Hi @kunchenwork, I've researched this and here is the conclusion. This issue came up before: #309. I posted a detailed comment on the Ollama GitHub a couple of months ago, requesting that Ollama declare the context size of its models: ollama/ollama#1473 (comment). Analysis results:
As per the analysis, in this example for Mistral-7B (the one you used), Ollama does not report the context window.
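To illustrate the point, here is a minimal sketch of checking whether Ollama declares a context size for a model. It assumes a local Ollama server on the default port and the /api/show endpoint's name/parameters fields as documented at the time; the 'mistral' model name is just an example.

```typescript
// Sketch: ask Ollama's /api/show for a model's Modelfile parameters and look
// for a declared num_ctx. Assumes a local server at the default port.
async function getDeclaredContextSize(model: string): Promise<number | null> {
  const res = await fetch('http://localhost:11434/api/show', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ name: model }),
  });
  const info = await res.json();
  // 'parameters' is a plain-text list of Modelfile parameters; num_ctx only
  // appears if the model author explicitly set it, which most do not.
  const match = (info.parameters ?? '').match(/num_ctx\s+(\d+)/);
  return match ? parseInt(match[1], 10) : null; // null → UI falls back to 4096
}

getDeclaredContextSize('mistral').then((size) =>
  console.log(size ?? 'no context size declared by Ollama'));
```

When this returns null, a client has no reliable way to know the model's real window, which is why the default stays at 4096.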
Closing until Ollama fixes this. Very detailed analysis provided.
@enricoros why not let users configure the context length manually (like we do for the output tokens) until Ollama provides the information via an API? Fixing it at 4096 is worse. One of the reasons folks use local models is to economize on cost (by running larger context sizes).
Here's why: Ollama is a large, funded project with lots of contributors. If the project reported the context length of its models, every UI would just work. I could add an option to override it, but adding workarounds is not high on my priority list, so I personally cannot do it. That said, I welcome and will merge any contribution to Big-AGI that allows a manual override of the context length (it's not hard, but I have 100 bugs in line before this). My advice is to ask Ollama to correctly report the number of tokens per model, which is the right solution and scales across the ecosystem.
I totally understand that there may be other pressing issues. Would you be open to adding …
Note @arsaboo, @kunchenwork: if Ollama does not specify the num_ctx parameter (as it should) and the model list does not contain token counts (as I've requested), there is the option to change the list of models manually (no UI), as per #518.
@enricoros This works for now. Given that there won't be a TON of models, we can keep it updated manually if we have to.
Why
Context length is a key trait in utilising an LLM. If we implement this, users would be able to change the context length for models capable of handling more, making big-AGI much more powerful at handling long text.
Description
Ollama allows API calls to specify the context length (see the Ollama FAQ).
Can we add a context length setting for Ollama local models?
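For reference, here is a minimal sketch of the per-request override described in the Ollama FAQ: pass options.num_ctx with a /api/generate call. The 32768 value assumes a model (e.g. Mistral v0.2) that actually supports a 32k window; the model name and prompt are illustrative.

```typescript
// Sketch: set the context window for a single Ollama request via options.num_ctx.
async function generateWithContext(model: string, prompt: string, numCtx: number) {
  const res = await fetch('http://localhost:11434/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model,
      prompt,
      stream: false,
      options: { num_ctx: numCtx }, // context window for this request only
    }),
  });
  return (await res.json()).response;
}

generateWithContext('mistral', 'Summarize this long document...', 32768)
  .then(console.log);
```

A UI setting could simply forward the user-chosen value into this options object instead of leaving Ollama's 4096 default in place.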
Requirements