OLLAMA models not working at all with LiteLLM #2833

@ArindamRoy23

Description

I am trying to run the basic tutorial. The only change I have made is calling the Ollama models through LiteLLM instead of the standard API calls. The same model works fine through LiteLLM when called via the Groq API, but when I call it through Ollama I hit a variety of errors (the sketch after the following list shows how each model was wired in). I have tried:

  • Llama 3.1 8B
  • Phi-4 mini
  • GPT-OSS 20B
  • Llama 3 8B
  • Gemma 270M

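For reference, each model was swapped in only by changing the LiteLlm model string. A minimal sketch of that wiring, assuming the ollama_chat/ provider prefix and the same api_base as the commented-out line in the code further down (the exact Ollama tags are assumptions based on how these models are usually named in the Ollama library):

from google.adk.models.lite_llm import LiteLlm

# Sketch only: the model tags are assumed; the api_base mirrors the
# commented-out ollama_chat line in the agent code below.
OLLAMA_BASE = "http://host.docker.internal:11434"

llm = LiteLlm(model="ollama_chat/llama3.1:8b", api_base=OLLAMA_BASE)
# llm = LiteLlm(model="ollama_chat/gpt-oss:20b", api_base=OLLAMA_BASE)
# llm = LiteLlm(model="ollama_chat/phi4-mini", api_base=OLLAMA_BASE)
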
Errors I have faced:

- No response:

adk run _agents/
Log setup complete: /tmp/agents_log/agent.20250903_110925.log
To access latest log: tail -F /tmp/agents_log/agent.latest.log
/home/vscode/.cache/pypoetry/virtualenvs/tidyworld-NI6vCWsz-py3.10/lib/python3.10/site-packages/google/adk/cli/cli.py:143: UserWarning: [EXPERIMENTAL] InMemoryCredentialService: This feature is experimental and may change or be removed in future versions without notice. It may introduce breaking changes at any time.
  credential_service = InMemoryCredentialService()
/home/vscode/.cache/pypoetry/virtualenvs/tidyworld-NI6vCWsz-py3.10/lib/python3.10/site-packages/google/adk/auth/credential_service/in_memory_credential_service.py:33: UserWarning: [EXPERIMENTAL] BaseCredentialService: This feature is experimental and may change or be removed in future versions without notice. It may introduce breaking changes at any time.
  super().__init__()
Running agent weather_time_agent, type exit to exit.
[user]: Whats the weather in NY> ?
11:09:34 - LiteLLM:INFO: utils.py:3085 - 
LiteLLM completion() model= gpt-oss:20b; provider = ollama_chat
11:10:59 - LiteLLM:INFO: cost_calculator.py:636 - selected model name for cost calculation: ollama_chat/gpt-oss:20b
11:10:59 - LiteLLM:INFO: cost_calculator.py:636 - selected model name for cost calculation: ollama_chat/gpt-oss:20b
[user]: 

- Tool-call metadata returned as the response

[user]: What's the weather in NY?
07:20:02 - LiteLLM:INFO: utils.py:3085 - 
LiteLLM completion() model= llama3.1:8b; provider = ollama_chat
07:20:12 - LiteLLM:INFO: cost_calculator.py:636 - selected model name for cost calculation: ollama_chat/llama3.1:8b
07:20:12 - LiteLLM:INFO: cost_calculator.py:636 - selected model name for cost calculation: ollama_chat/llama3.1:8b
[weather_time_agent]: {"name": "get_weather", "arguments":{"city": "NY"}}

- Hallucinated structured output (Phi model)
The model returns a random structured response without ever calling the tool.

Code

import datetime

from google.adk.agents import Agent

# from _llm._default import DefaultLLMService
from google.adk.models.lite_llm import LiteLlm
from zoneinfo import ZoneInfo


def get_weather(city: str) -> dict:
    """Retrieves the current weather report for a specified city.

    Args:
        city (str): The name of the city for which to retrieve the weather report.

    Returns:
        dict: status and result or error msg.
    """
    print(f"*************Getting weather for {city}")
    if city.lower() == "new york":
        return {
            "status": "success",
            "report": (
                "The weather in New York is sunny with a temperature of 25 degrees" " Celsius (77 degrees Fahrenheit)."
            ),
        }
    else:
        return {
            "status": "error",
            "error_message": f"Weather information for '{city}' is not available.",
        }


def get_current_time(city: str) -> dict:
    """Returns the current time in a specified city.

    Args:
        city (str): The name of the city for which to retrieve the current time.

    Returns:
        dict: status and result or error msg.
    """
    if city.lower() == "new york":
        tz_identifier = "America/New_York"
    else:
        return {
            "status": "error",
            "error_message": (f"Sorry, I don't have timezone information for {city}."),
        }

    tz = ZoneInfo(tz_identifier)
    now = datetime.datetime.now(tz)
    report = f'The current time in {city} is {now.strftime("%Y-%m-%d %H:%M:%S %Z%z")}'
    return {"status": "success", "report": report}


# llm = DefaultLLMService(
#     model="groq/openai/gpt-oss-20b"  # , config={"api_base": "http://host.docker.internal:11434"}
# ).get_model()

llm = LiteLlm(model="groq/openai/gpt-oss-20b")
# llm = LiteLlm(model="ollama_chat/gpt-oss:20b", api_base="http://host.docker.internal:11434")
root_agent = Agent(
    name="weather_time_agent",
    model=llm,
    description=("Agent to answer questions about the time and weather in a city."),
    instruction=("You are a helpful agent who can answer user questions about the time and weather in a city."),
    tools=[get_weather, get_current_time],
)
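
To help isolate whether the failure sits in ADK or in LiteLLM itself, here is a minimal reproduction that calls litellm.completion directly with an equivalent hand-written tool schema (a sketch, assuming the same api_base; the schema below is written out by hand for illustration, not generated by ADK):

import litellm

# Sketch: bypass ADK and call LiteLLM directly with the same model and a
# hand-written tool schema. If the tool-call JSON still comes back as plain
# text here, the problem is in LiteLLM/Ollama tool calling rather than in
# ADK's agent loop.
response = litellm.completion(
    model="ollama_chat/llama3.1:8b",
    api_base="http://host.docker.internal:11434",
    messages=[{"role": "user", "content": "What's the weather in NY?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Retrieves the current weather report for a specified city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
)
print(response.choices[0].message)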

Config

  • Ubuntu Docker dev container (see the reachability check after this list)
  • google-adk: 1.13.0
  • LiteLLM: 1.65.8
  • Ollama: 0.11.8
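
Since everything runs inside a Docker dev container, it is worth ruling out basic reachability first; a quick check, assuming the same host.docker.internal address and Ollama's /api/tags listing endpoint:

import json
import urllib.request

# Sketch: confirm the Ollama server is reachable from inside the dev container
# and that the expected model tags have actually been pulled.
with urllib.request.urlopen("http://host.docker.internal:11434/api/tags") as resp:
    models = json.load(resp)["models"]
print([m["name"] for m in models])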

Labels

  • models [Component]: Issues related to model support
  • needs review [Status]: The PR/issue is awaiting review from the maintainer