ollama functions should not have a default value for keepAlive #7295

Open
5 tasks done
rick-github opened this issue Nov 29, 2024 · 2 comments
Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments

@rick-github

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

import { ChatOllama } from "@langchain/community/chat_models/ollama";
import { HumanMessage } from "@langchain/core/messages";

async function main() {
  // Optional keep-alive duration in seconds, taken from the first CLI argument.
  const duration = process.argv[2] ? parseInt(process.argv[2], 10) : undefined;

  const config: any = {
    baseUrl: "http://localhost:11434",
    model: "llama3.2",
  };

  // Only set keepAlive when a duration was passed on the command line.
  if (duration !== undefined) {
    config.keepAlive = duration;
  }

  const chat = new ChatOllama(config);

  try {
    const response = await chat.invoke([
      new HumanMessage("2+2=?"),
    ]);
    console.log("Response:", response.content);
  } catch (error) {
    console.error("Error:", error);
  }
}

main();

Error Message and Stack Trace (if applicable)

No response

Description

If a client using ChatOllama does not set keepAlive, langchainjs adds a default value to the request. This overrides the keep-alive previously set by other clients or configured on the server. When the ChatOllama client doesn't explicitly set keepAlive, the library should omit the field rather than supply its own default.
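A minimal sketch of the requested behavior, not the actual langchainjs internals (the helper and option names below are hypothetical): the request body sent to the Ollama API would only include keep_alive when the caller supplied one, leaving the server's own setting in effect otherwise.

interface OllamaCallOptions {
  model: string;
  // e.g. "30m" or 1800 (seconds); undefined means "don't send the field at all"
  keepAlive?: string | number;
}

// Hypothetical request builder illustrating the desired behavior: keep_alive
// is only added to the payload when the caller explicitly set it, so the
// Ollama server's configured default (e.g. "never unload") is not overridden.
function buildChatRequest(opts: OllamaCallOptions, messages: object[]) {
  const body: Record<string, unknown> = {
    model: opts.model,
    messages,
  };
  if (opts.keepAlive !== undefined) {
    body.keep_alive = opts.keepAlive;
  }
  return body;
}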

For example, by default my ollama server is set to never unload a model:

$ curl -s localhost:11434/api/generate -d '{"model":"llama3.2","prompt":"2+2=?","stream":false}' | jq .response
"2 + 2 = 4"
$ ollama ps
NAME               ID              SIZE      PROCESSOR    UNTIL   
llama3.2:latest    a80c4f17acd5    3.1 GB    100% GPU     Forever    

If I use the langchainjs library and explicitly set a timeout, it works as expected:

$ 2>&- node --loader ts-node/esm ollama.ts 1800
Response: 2 + 2 = 4
$ ollama ps
NAME               ID              SIZE      PROCESSOR    UNTIL               
llama3.2:latest    a80c4f17acd5    3.1 GB    100% GPU     29 minutes from now    

If the langchainjs client doesn't set a keepAlive value, the expectation is that the value previously set for the model remains unchanged, but that's not the case:

$ 2>&- node --loader ts-node/esm ollama.ts 
Response: 2 + 2 = 4.
$ ollama ps
NAME               ID              SIZE      PROCESSOR    UNTIL              
llama3.2:latest    a80c4f17acd5    3.1 GB    100% GPU     4 minutes from now    

System Info

└─ langchain@npm:0.3.6
   ├─ Instances: 1
   ├─ Version: 0.3.6
   │
   └─ Dependencies
      ├─ @langchain/openai@npm:>=0.1.0 <0.4.0 → npm:0.3.14
      ├─ @langchain/textsplitters@npm:>=0.0.0 <0.2.0 → npm:0.1.0
      ├─ js-tiktoken@npm:^1.0.12 → npm:1.0.15
      ├─ js-yaml@npm:^4.1.0 → npm:4.1.0
      ├─ jsonpointer@npm:^5.0.1 → npm:5.0.1
      ├─ langsmith@npm:^0.2.0 → npm:0.2.8
      ├─ openapi-types@npm:^12.1.3 → npm:12.1.3
      ├─ p-retry@npm:4 → npm:4.6.2
      ├─ uuid@npm:^10.0.0 → npm:10.0.0
      ├─ yaml@npm:^2.2.1 → npm:2.6.1
      ├─ zod-to-json-schema@npm:^3.22.3 → npm:3.23.5
      └─ zod@npm:^3.22.4 → npm:3.23.8

platform: linux

$ node --version
v20.9.0
$ yarn --version
3.5.1

dosubot bot commented Nov 29, 2024

Hey @rick-github! I'm here to help you with any bugs, questions, or if you'd like to become a contributor. Let's tackle this issue together!

I found a similar discussion regarding the keepAlive setting in LangChain.js. The issue was that the library sets a default keepAlive value for ChatOllama to "5m" when the client does not explicitly set it. This behavior can override the server's settings, which might not be desirable in all cases [1].

In the discussion, it was noted that upgrading to Node.js version 20 resolved a related issue with the UND_ERR_HEADERS_TIMEOUT error, which might be relevant if you're experiencing similar timeout issues [2].

If you want to avoid the default keepAlive setting, you might consider explicitly setting it to undefined or another value that aligns with your server's configuration. However, as of now, there doesn't seem to be a built-in option in LangChain.js to disable the default keepAlive behavior entirely.
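As an illustration, a minimal workaround sketch (assuming the Ollama server treats a negative keep_alive duration as "keep the model loaded indefinitely", per the Ollama docs, and that ChatOllama forwards keepAlive as given):

import { ChatOllama } from "@langchain/community/chat_models/ollama";

// Workaround sketch: always pass keepAlive explicitly so the library's "5m"
// default never applies. "-1m" asks the Ollama server to keep the model
// loaded indefinitely; adjust to match your server's configuration.
const chat = new ChatOllama({
  baseUrl: "http://localhost:11434",
  model: "llama3.2",
  keepAlive: "-1m",
});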

To continue talking to Dosu, mention @dosu.

dosubot added the auto:bug label Nov 29, 2024

rick-github commented Nov 29, 2024

For comparison, the ollama python client library (maintained by the ollama project) performs as expected:

$ ollama run llama3.2 '2+2=?'
2 + 2 = 4
$ ollama ps
NAME               ID              SIZE      PROCESSOR    UNTIL   
llama3.2:latest    a80c4f17acd5    3.1 GB    100% GPU     Forever    
$ python3 -c 'import ollama;print(ollama.Client().generate(model="llama3.2",prompt="2+2?",keep_alive=1800)["response"])'
The answer to 2 + 2 is 4.
$ ollama ps
NAME               ID              SIZE      PROCESSOR    UNTIL               
llama3.2:latest    a80c4f17acd5    3.1 GB    100% GPU     29 minutes from now    
$ python3 -c 'import ollama;print(ollama.Client().generate(model="llama3.2",prompt="2+2?")["response"])'
2 + 2 = 4
$ ollama ps
NAME               ID              SIZE      PROCESSOR    UNTIL               
llama3.2:latest    a80c4f17acd5    3.1 GB    100% GPU     29 minutes from now   
$ python3 -c 'import ollama;print(ollama.Client().chat(keep_alive=1800,model="llama3.2",messages=[{"role":"user","content":"2+2=?"}])["message"]["content"])'
2 + 2 = 4
$ ollama ps
NAME               ID              SIZE      PROCESSOR    UNTIL               
llama3.2:latest    a80c4f17acd5    3.1 GB    100% GPU     29 minutes from now    
$ python3 -c 'import ollama;print(ollama.Client().chat(model="llama3.2",messages=[{"role":"user","content":"2+2=?"}])["message"]["content"])'
2 + 2 = 4.
$ ollama ps
NAME               ID              SIZE      PROCESSOR    UNTIL               
llama3.2:latest    a80c4f17acd5    3.1 GB    100% GPU     29 minutes from now  
