
Verify new tools api code with GoogleAI #107

Closed
brainlid opened this issue Apr 27, 2024 · 20 comments · Fixed by #152

@brainlid
Owner

The PR #105 was merged to main, and the tests pass, but I didn't run it against a GoogleAI LLM. I would appreciate help validating that.

@jadengis, this impacts your code the most. Just a heads-up.

@jadengis
Contributor

@brainlid Sure, I should be able to run main against my setup and see if it breaks anything. Are there any breaking changes to the API I'd need to integrate?

@brainlid
Owner Author

brainlid commented Apr 27, 2024 via email

@nileshtrivedi

Looking forward to this as the Gemini Pro APIs have become publicly available.
This will also allow me to use this project as an abstraction for LLMs in my Elixir port of Autogen, a multi-agent framework from Microsoft.

Here are the Function Calling docs for Gemini models: https://ai.google.dev/gemini-api/docs/function-calling
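To make the linked docs concrete, here is a sketch of the request body Gemini expects when declaring a tool, written as an Elixir map. The field names ("tools", "functionDeclarations", etc.) are taken from Google's docs linked above, not from this library, so treat the shape as an assumption:

```elixir
# Sketch of a Gemini generateContent request body declaring one tool.
# Field names follow the Gemini function-calling docs; this is
# illustrative, not LangChain code.
request_body = %{
  "contents" => [
    %{"role" => "user", "parts" => [%{"text" => "What is 100 + 300 - 200?"}]}
  ],
  "tools" => [
    %{
      "functionDeclarations" => [
        %{
          "name" => "calculator",
          "description" => "Perform basic math calculations or expressions",
          "parameters" => %{
            "type" => "object",
            "properties" => %{"expression" => %{"type" => "string"}},
            "required" => ["expression"]
          }
        }
      ]
    }
  ]
}
```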

@jadengis
Contributor

jadengis commented Jun 6, 2024

@nileshtrivedi if you are interested, I would appreciate some help testing out the above change. I tried it out in my project but ran into some errors with the new code and couldn't get it working. I haven't had time to go back and fix the issues, however.

@nileshtrivedi

nileshtrivedi commented Jun 6, 2024

@jadengis I thought of modifying the existing tools/calculator_test for Google Gemini models, but even the existing test for OpenAI seems to fail for me at this line because message.content is nil. This is the message object when the assertion fails:

%LangChain.Message{content: nil, processed_content: nil, index: 0, status: :complete, role: :assistant, name: nil, tool_calls: [%LangChain.Message.ToolCall{status: :complete, type: :function, call_id: "call_9Fq4ZN3U4Ln4D52Kwg1DXj0G", name: "calculator", arguments: %{"expression" => "100 + 300 - 200"}, index: nil}], tool_results: nil}

I don't know whether it's a bug in the test code itself or something else. Unable to test further. 🫤
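As an aside, the nil content in the struct above is expected when the assistant answers with a tool call: the expression lives in tool_calls, not content. A quick sketch with plain maps standing in for the LangChain structs:

```elixir
# Plain-map stand-in for the assistant message shown above (illustrative;
# the real structs are %LangChain.Message{} and %LangChain.Message.ToolCall{}).
message = %{
  content: nil,
  role: :assistant,
  tool_calls: [
    %{name: "calculator", arguments: %{"expression" => "100 + 300 - 200"}}
  ]
}

# When content is nil, the useful data is in tool_calls, so assertions
# belong there rather than on content.
[%{name: tool_name, arguments: args}] = message.tool_calls
IO.puts("#{tool_name}: #{args["expression"]}")
# => calculator: 100 + 300 - 200
```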

For my own work, even Autogen itself currently seems broken for Gemini models. I ended up using Python example code using Google's own SDK.

@brainlid
Owner Author

brainlid commented Jun 6, 2024

@nileshtrivedi I updated and fixed the Calculator tool and tests. Thanks for pointing that out!

#132

@brainlid
Owner Author

brainlid commented Jun 6, 2024

@jadengis please let me know what errors you're getting! We can hopefully get it ironed out quickly.

@brainlid
Owner Author

brainlid commented Jun 6, 2024

For migrating, I tried to document what would be needed in the CHANGELOG. Let me know if you're finding gaps!

@nileshtrivedi

I submitted #135 as a failing test. While mix test test/chat_models/chat_google_ai_test.exs --include live_call passes, mix test test/tools/calculator_gemini_test.exs --include live_call fails with "Unexpected response" from the LLM.

@nileshtrivedi

nileshtrivedi commented Jun 14, 2024

I think there are multiple errors in calling Gemini model APIs properly:

There may be more issues.

EDITED: Noticed this open PR for fixing the endpoint: #118. It seems there are subtle differences between the Gemini API and the Vertex AI API which are causing these.

@brainlid
Owner Author

@nileshtrivedi We split ChatVertexAI out from ChatGoogleAI because the differences were subtle but pervasive. In the published RC, the callbacks have been updated as well. Thank you for looking into it earlier. How does it look now?

@nileshtrivedi

nileshtrivedi commented Jun 22, 2024

Gemini API still seems to fail when testing with

I tested after making this change in test/tools/calculator_test.exs:

--- a/test/tools/calculator_test.exs
+++ b/test/tools/calculator_test.exs
@@ -5,7 +5,7 @@ defmodule LangChain.Tools.CalculatorTest do
   doctest LangChain.Tools.Calculator
   alias LangChain.Tools.Calculator
   alias LangChain.Function
-  alias LangChain.ChatModels.ChatOpenAI
+  alias LangChain.ChatModels.ChatGoogleAI
   alias LangChain.Message.ToolCall
   alias LangChain.Message.ToolResult
 
@@ -80,7 +80,7 @@ defmodule LangChain.Tools.CalculatorTest do
         end
       }
 
-      model = ChatOpenAI.new!(%{seed: 0, temperature: 0, stream: false, callbacks: [llm_handler]})
+      model = ChatGoogleAI.new!(%{model: "gemini-1.5-pro", api_key: System.fetch_env!("GEMINI_API_KEY"), seed: 0, temperature: 0, stream: false, callbacks: [llm_handler]})

This is the test failure I get:

% mix test test/tools/calculator_test.exs --include live_call
Compiling 1 file (.ex)
Including tags: [:live_call]

.

  1) test live test performs repeated calls until complete with a live LLM (LangChain.Tools.CalculatorTest)
     test/tools/calculator_test.exs:68
     ** (MatchError) no match of right hand side value: {:error, %LangChain.Chains.LLMChain{llm: %LangChain.ChatModels.ChatGoogleAI{endpoint: "https://generativelanguage.googleapis.com/v1beta", api_version: "v1beta", model: "gemini-1.5-pro", api_key: "REMOVED_FOR_SAFETY", temperature: 0.0, top_p: 1.0, top_k: 1.0, receive_timeout: 60000, stream: false, callbacks: [%{on_llm_new_message: #Function<1.19352243/2 in LangChain.Tools.CalculatorTest."test live test performs repeated calls until complete with a live LLM"/1>}]}, verbose: false, verbose_deltas: false, tools: [%LangChain.Function{name: "calculator", description: "Perform basic math calculations or expressions", display_text: nil, function: #Function<0.76765322/2 in LangChain.Tools.Calculator.execute>, async: true, parameters_schema: %{type: "object", required: ["expression"], properties: %{expression: %{type: "string", description: "A simple mathematical expression"}}}, parameters: []}], _tool_map: %{"calculator" => %LangChain.Function{name: "calculator", description: "Perform basic math calculations or expressions", display_text: nil, function: #Function<0.76765322/2 in LangChain.Tools.Calculator.execute>, async: true, parameters_schema: %{type: "object", required: ["expression"], properties: %{expression: %{type: "string", description: "A simple mathematical expression"}}}, parameters: []}}, messages: [%LangChain.Message{content: "Answer the following math question: What is 100 + 300 - 200?", processed_content: nil, index: nil, status: :complete, role: :user, name: nil, tool_calls: [], tool_results: nil}], custom_context: nil, message_processors: [], max_retry_count: 3, current_failure_count: 0, delta: nil, last_message: %LangChain.Message{content: "Answer the following math question: What is 100 + 300 - 200?", processed_content: nil, index: nil, status: :complete, role: :user, name: nil, tool_calls: [], tool_results: nil}, needs_response: true, callbacks: [%{on_tool_response_created: #Function<2.19352243/2 
in LangChain.Tools.CalculatorTest."test live test performs repeated calls until complete with a live LLM"/1>}]}, "Unexpected response"}
     code: {:ok, updated_chain, %Message{} = message} =
     stacktrace:
       test/tools/calculator_test.exs:85: (test)

     The following output was logged:
     
     14:01:18.910 [error] Trying to process an unexpected response. ""
     
     14:01:18.910 [error] Error during chat call. Reason: "Unexpected response"
     
......
Finished in 1.8 seconds (0.00s async, 1.8s sync)
8 tests, 1 failure

Randomized with seed 98474

I confirmed that my actual api_key was printed where it says REMOVED_FOR_SAFETY. I also tried model: "gemini-1.5-flash" but got the same error.

I think it might be easier if you can signup on https://ai.google.dev/ to get an API key to help with testing?

@ljgago

ljgago commented Jun 28, 2024

Hello @nileshtrivedi,

If you change these lines:

@default_base_url "https://generativelanguage.googleapis.com"
@default_api_version "v1beta"
@default_endpoint "#{@default_base_url}/#{@default_api_version}"

to:

  @default_endpoint "https://generativelanguage.googleapis.com"
  @default_api_version "v1beta"

do the tests pass?

@nileshtrivedi

nileshtrivedi commented Jun 28, 2024

@ljgago No, it fails but with a different error:

% mix test test/tools/calculator_test.exs --include live_call              
Compiling 39 files (.ex)
Generated langchain app
Including tags: [:live_call]

....

  1) test live test performs repeated calls until complete with a live LLM (LangChain.Tools.CalculatorTest)
     test/tools/calculator_test.exs:68
     ** (LangChain.LangChainError) content: is invalid for role tool
     code: |> LLMChain.run(mode: :while_needs_response)
     stacktrace:
       (langchain 0.3.0-rc.0) lib/message.ex:408: LangChain.Message.new_tool_result!/1
       (langchain 0.3.0-rc.0) lib/chains/llm_chain.ex:692: LangChain.Chains.LLMChain.execute_tool_calls/2
       (langchain 0.3.0-rc.0) lib/chains/llm_chain.ex:321: LangChain.Chains.LLMChain.run_while_needs_response/1
       test/tools/calculator_test.exs:95: (test)

     The following output was logged:
     
     06:58:59.287 [debug] Executing function "calculator"
     
...
Finished in 2.6 seconds (0.00s async, 2.6s sync)
8 tests, 1 failure

Randomized with seed 604620

I am happy to get on a call with any devs to work this out.

@brainlid
Owner Author

brainlid commented Jul 5, 2024

@nileshtrivedi I set up a GoogleAI account and got an API key. There are multiple issues with the ChatGoogleAI implementation.

I've fixed a couple locally (not pushed yet), but there's an issue with the ToolResult handling that I'm still trying to figure out.

A big issue is that Google's API docs are really poor. Their API also does things I haven't seen any other API do (i.e., odd behaviors). All in all, I don't like Google's service! 😬

Still, your issue is valid, and I hope to have a resolution sometime soon. Thanks!

@brainlid
Owner Author

brainlid commented Jul 6, 2024

@nileshtrivedi This is hopefully fixed now! 🤞

Just merged PR #152 to main. If you test using main it should be working now. Please let me know!

@nileshtrivedi

Thanks @brainlid! This definitely seems to have improved: ChatGoogleAI responses are now being collected. To the user's message Answer the following math question: What is 100 + 300 - 200?, ChatGoogleAI returns The answer is 200..

However, there seem to be differences between ChatOpenAI and ChatGoogleAI return values.
Specifically, I get a test failure on this line:

1) test live test performs repeated calls until complete with a live LLM (LangChain.Tools.CalculatorTest)
     test/tools/calculator_test.exs:68
     ** (FunctionClauseError) no function clause matching in Kernel.=~/2

     The following arguments were given to Kernel.=~/2:

         # 1
         [%LangChain.Message.ContentPart{type: :text, content: "The answer is 200.", options: nil}]

         # 2
         "100 + 300 - 200"

     Attempted function clauses (showing 3 out of 3):

         def =~(left, "") when is_binary(left)
         def =~(left, right) when is_binary(left) and is_binary(right)
         def =~(left, right) when is_binary(left)

     code: assert message.content =~ "100 + 300 - 200"

With ChatGoogleAI, instead of message.content being a string, it is actually [%LangChain.Message.ContentPart{type: :text, content: "The answer is 200.", options: nil}].

Additionally, Gemini's response is simply The answer is 200., so this assertion would fail anyway. Also, message.tool_calls seems to be empty.

@nileshtrivedi

I think it may require setting the function calling mode to ANY as per Gemini docs.
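For reference, the Gemini REST API exposes that setting through toolConfig.functionCallingConfig. Here is a sketch of the request fragment as an Elixir map; the field names follow Google's function-calling docs and are an assumption, not something LangChain sends today:

```elixir
# Request-body fragment forcing Gemini to emit a function call.
# "ANY" tells the model it must call one of the declared functions;
# "allowedFunctionNames" optionally narrows the choice.
tool_config = %{
  "toolConfig" => %{
    "functionCallingConfig" => %{
      "mode" => "ANY",
      "allowedFunctionNames" => ["calculator"]
    }
  }
}
```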

@nileshtrivedi

My bad, I didn't notice the new tests added (including the one for tool calling). mix test test/chat_models/chat_google_ai_test.exs --include live_google_ai is passing all the tests.

Kudos @brainlid for following up and fixing this! 👏

@brainlid
Owner Author

brainlid commented Jul 6, 2024

@nileshtrivedi I'm pretty annoyed by GoogleAI, honestly. It's such an oddball compared to the others. The API docs are sparse and difficult to use too. Ugh.

One odd thing you noticed: the assistant returns the content in parts every time. That's just a decision they made. We could pattern match on that, and if there is only a single text part, flatten it to be content: "the text".

I haven't used it enough to see it return anything else though. Have you? If it makes sense, then doing that would make it easier to swap out the backend AI without impacting an application.

I updated LangChain.Utils.ChainResult.to_string to match on that and make it easier to get the answer out.
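The flattening idea could look something like this (a hypothetical helper, not current LangChain code; plain maps stand in for %LangChain.Message.ContentPart{}):

```elixir
defmodule FlattenSketch do
  # Hypothetical helper: when the content is a list holding exactly one
  # text part, collapse it to a bare string so GoogleAI responses look
  # like the other backends; leave every other shape untouched.
  def flatten_content([%{type: :text, content: text}]) when is_binary(text),
    do: text

  def flatten_content(other), do: other
end

FlattenSketch.flatten_content([%{type: :text, content: "The answer is 200."}])
# => "The answer is 200."
```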
