
Add support for bulk embedding in Ollama #735

Merged: 8 commits merged into patterns-ai-core:main on Aug 28, 2024

Conversation

@bricolage (Contributor) commented Aug 22, 2024

Description

As of v0.3.4, Ollama supports bulk embedding. This PR adds support for bulk embeddings via a new input: keyword param.

Changes made

  • Added the input: keyword param to the #embed method
  • Made the text: keyword param optional (caveat: this will no longer throw an error if you omit it)
  • Updated related method #embeddings
  • Updated Ollama API endpoint for embeddings
  • Updated specs

Why this change is needed

  • The API for bulk embedding requires the input: keyword param with an array of strings
  • #embed currently requires the text: keyword param, which is unused when doing bulk embedding

Open topics

  • You can now call #embed but pass neither text: nor input: and it won't throw an error. Is there a preferred way to validate inputs when not relying on keyword argument defaults?
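One common way to handle the open validation question is an explicit guard at the top of the method instead of relying on keyword-argument defaults. This is a hypothetical sketch (the method name matches the PR, but the body and signature here are illustrative, not the PR's actual code):

```ruby
# Hypothetical sketch: require at least one of text: or input:
# when neither keyword has a non-nil default.
def embed(text: nil, input: nil, model: nil)
  if text.nil? && input.nil?
    raise ArgumentError, "either text: or input: must be provided"
  end

  # Bulk input wins if given; otherwise wrap the single text in an array.
  {
    model: model,
    input: input || Array(text)
  }
end
```

Calling `embed` with neither keyword then raises `ArgumentError` rather than silently sending an empty request.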

bricolage and others added 3 commits August 22, 2024 22:43
The example request/response looks like this:
```
curl http://localhost:11434/api/embed -d '{
  "model": "all-minilm",
  "input": ["Why is the sky blue?", "Why is the grass green?"]
}'

{
  "model": "all-minilm",
  "embeddings": [[
    0.010071029, -0.0017594862, 0.05007221, 0.04692972, 0.054916814,
    0.008599704, 0.105441414, -0.025878139, 0.12958129, 0.031952348
  ],[
    -0.0098027075, 0.06042469, 0.025257962, -0.006364387, 0.07272725,
    0.017194884, 0.09032035, -0.051705178, 0.09951512, 0.09072481
  ]]
}
```
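The same request can be built with Ruby's standard library. This is an illustrative sketch based on the curl example above (endpoint and payload taken from it); the request is only constructed here, not sent, since sending it requires a running Ollama server:

```ruby
require "json"
require "net/http"
require "uri"

uri = URI("http://localhost:11434/api/embed")

# Build the POST request matching the curl example above.
req = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
req.body = JSON.generate(
  model: "all-minilm",
  input: ["Why is the sky blue?", "Why is the grass green?"]
)

# To actually send it against a running Ollama instance:
# res = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(req) }
# embeddings = JSON.parse(res.body).fetch("embeddings")
```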
```ruby
model: defaults[:embeddings_model_name],
input: [],
```
@andreibondarev (Collaborator) commented Aug 27, 2024

@bricolage What if we keep the same method interface but instead accept both text: "string" and text: [...], similar to how the OpenAI integration takes either a string or an array of strings?

You could also just always wrap the text below and pass it as an array:

```ruby
parameters = {
  model: model,
  input: Array(text)
}
```
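Ruby's `Array()` conversion makes this wrapping safe for the cases at hand: a single string is wrapped, an array passes through unchanged, and nil becomes an empty array:

```ruby
# Array() coercion behavior for the three relevant inputs:
Array("Why is the sky blue?")      # => ["Why is the sky blue?"]
Array(["one doc", "another doc"])  # => ["one doc", "another doc"]
Array(nil)                         # => []
```

Note that `Array()` converts a Hash into key-value pairs, so this pattern is only safe when the input is a string, an array, or nil.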

@bricolage (Contributor, Author) replied:

Oh that's nicer. Will do!

```diff
@@ -219,7 +219,8 @@ def embed(
 )
 parameters = {
   prompt: text,
```
@andreibondarev (Collaborator) commented Aug 28, 2024

@chalmagean I think we can just remove this prompt: parameter, right?

@andreibondarev (Collaborator):

@chalmagean Mind adding "Add support for bulk embedding in Ollama" to CHANGELOG.md? Thank you!

@andreibondarev andreibondarev merged commit 31ef26b into patterns-ai-core:main Aug 28, 2024
5 checks passed