Image classification example fails when using Google's Gemini #209

Closed
vkryukov opened this issue Dec 9, 2024 · 0 comments · Fixed by #212

vkryukov commented Dec 9, 2024

When I run the following modified code from the image description notebook:

gemini =
  LangChain.ChatModels.ChatGoogleAI.new!(%{
    model: "gemini-exp-1206",
    api_key: System.get_env("GOOGLE_API_KEY")
  })

alias LangChain.Chains.LLMChain
alias LangChain.MessageProcessors.JsonProcessor

# This data comes from an external data source per image.
# When we `apply_prompt_templates` below, the data is rendered into the template.
image_data_from_other_system = "image of urban art mural on underpass at 507 King St E"

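# `messages` is not defined in this snippet; it is assumed to come from the notebook's earlier cells.
{:ok, updated_chain} =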
  %{llm: gemini, verbose: true}
  |> LLMChain.new!()
  |> LLMChain.apply_prompt_templates(messages, %{extra_image_info: image_data_from_other_system})
  |> LLMChain.message_processors([JsonProcessor.new!()])
  |> LLMChain.run(mode: :until_success)

updated_chain.last_message.processed_content

I'm getting the following error message:

** (Protocol.UndefinedError) protocol Jason.Encoder not implemented for %LangChain.Message.ContentPart{type: :text, content: "Provide the descriptions for the image. Incorporate relevant information from the following additional details if applicable:\n\nimage of urban art mural on underpass at 507 King St E\n\nOutput in the following JSON format:\n\n{\n  \"alt\": \"generated alt text\",\n  \"caption\": \"generation caption text\"\n}\n", options: nil} of type LangChain.Message.ContentPart (a struct), Jason.Encoder protocol must always be explicitly implemented.

If you own the struct, you can derive the implementation specifying which fields should be encoded to JSON:

    @derive {Jason.Encoder, only: [....]}
    defstruct ...

It is also possible to encode all fields, although this should be used carefully to avoid accidentally leaking private information when new fields are added:

    @derive Jason.Encoder
    defstruct ...

Finally, if you don't own the struct you want to encode to JSON, you may use Protocol.derive/3 placed outside of any module:

    Protocol.derive(Jason.Encoder, NameOfTheStruct, only: [...])
    Protocol.derive(Jason.Encoder, NameOfTheStruct)
. This protocol is implemented for the following type(s): Any, Atom, BitString, Date, DateTime, Decimal, Ecto.Association.NotLoaded, Ecto.Schema.Metadata, Float, Integer, Jason.Fragment, Jason.OrderedObject, List, Map, NaiveDateTime, Time
    (jason 1.4.4) lib/jason.ex:213: Jason.encode_to_iodata!/2
    (req 0.5.8) lib/req/steps.ex:442: Req.Steps.encode_body/1
    (req 0.5.8) lib/req/request.ex:1090: Req.Request.run_request/1
    (req 0.5.8) lib/req/request.ex:1050: Req.Request.run/1
    (langchain 0.3.0-rc.0) lib/chat_models/chat_google_ai.ex:335: LangChain.ChatModels.ChatGoogleAI.do_api_request/3
    (langchain 0.3.0-rc.0) lib/chat_models/chat_google_ai.ex:307: LangChain.ChatModels.ChatGoogleAI.call/3
    (langchain 0.3.0-rc.0) lib/chains/llm_chain.ex:521: LangChain.Chains.LLMChain.do_run/1

I believe this is caused by the different message format for Google vs. OpenAI/Anthropic.
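
To illustrate the suspected cause: the stack trace shows a `ContentPart` struct reaching the JSON encoding step in `Req` without first being turned into plain data. Below is a minimal sketch of the general remedy, which is to map each part to a plain map before the request body is encoded. Everything under `Sketch` is a hypothetical stand-in, not LangChain's actual code: `Sketch.ContentPart` only mirrors the fields visible in the error (`type`, `content`, `options`), and `to_parts/1` is an illustrative helper. The `"text"` key follows the shape Gemini's API uses for text parts.

```elixir
defmodule Sketch.ContentPart do
  # Hypothetical stand-in mirroring the fields shown in the error message.
  defstruct [:type, :content, :options]
end

defmodule Sketch.GooglePayload do
  alias Sketch.ContentPart

  # A plain string becomes a single text part.
  def to_parts(content) when is_binary(content), do: [%{"text" => content}]

  # A list of structs is converted element by element into plain maps,
  # which any JSON encoder can serialize without a protocol implementation.
  def to_parts(parts) when is_list(parts), do: Enum.map(parts, &to_plain/1)

  # Only text parts are handled in this sketch; other types would raise
  # a FunctionClauseError.
  defp to_plain(%ContentPart{type: :text, content: text}), do: %{"text" => text}
end

# struct!/2 builds the struct at runtime, so the demo does not depend on
# compile order inside a single script file.
part =
  struct!(Sketch.ContentPart,
    type: :text,
    content: "Provide the descriptions for the image."
  )

IO.inspect(Sketch.GooglePayload.to_parts([part]))
```

The alternative suggested by the error text itself, deriving `Jason.Encoder` for the struct, would serialize the struct's own field names, which is unlikely to match what the Google endpoint expects. That is why this sketch converts explicitly.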

vkryukov added a commit to vkryukov/langchain that referenced this issue Dec 11, 2024
vkryukov added a commit to vkryukov/langchain that referenced this issue Dec 11, 2024
brainlid pushed a commit that referenced this issue Dec 12, 2024
* Make JsonProcessor process ContentPart properly

* Explicitly remove ```json ```

* Add a failing test for #209

* Pass the tests for #209

* Fix JsonProcessor content processing when regex is present

* Add live google ai call tests for messages with image parts
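
The step about explicitly removing the Markdown `json` code fence can be pictured as stripping the fence from the model's reply before the text is handed to a JSON decoder. The following is a rough sketch using only the standard library. `Sketch.FenceStripper` and `strip_json_fence/1` are hypothetical names, and the logic is an assumption about what such a reply looks like, not the code merged in the fix.

```elixir
defmodule Sketch.FenceStripper do
  # The three-backtick marker is built programmatically so that this
  # example does not itself contain a literal fence.
  @fence String.duplicate("`", 3)

  # Removes a leading fence (with or without the "json" tag) and a
  # trailing fence, then trims surrounding whitespace. Text without a
  # fence is returned trimmed but otherwise unchanged.
  def strip_json_fence(text) when is_binary(text) do
    text
    |> String.trim()
    |> String.replace_prefix(@fence <> "json", "")
    |> String.replace_prefix(@fence, "")
    |> String.replace_suffix(@fence, "")
    |> String.trim()
  end
end

fence = String.duplicate("`", 3)
reply = fence <> "json\n{\"alt\": \"generated alt text\"}\n" <> fence

IO.puts(Sketch.FenceStripper.strip_json_fence(reply))
```

The cleaned string could then be passed to a JSON decoder. This sketch only handles a fence that wraps the whole reply, not fences embedded in surrounding prose.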