Add embeddings for LocalAI #8134

mudler · 2023-07-22T17:35:19Z

Description:

This PR adds embeddings for LocalAI ( https://github.com/go-skynet/LocalAI ), a self-hosted, drop-in replacement for OpenAI. Because LocalAI can reuse OpenAI clients, the implementation mostly follows the OpenAI embeddings. The one difference is that, when embedding documents, it sends plain strings instead of tokens: token support is best-effort and depends on the model being used in LocalAI, and token IDs can mismatch with the model, so it is safer to send strings in this case.
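
As a rough illustration of the string-based approach described above, here is a minimal sketch of an OpenAI-style embeddings request body that carries raw strings rather than token IDs. The helper name `build_embedding_payload` and the model name `my-local-model` are hypothetical and do not come from this PR; only the `model` and `input` fields of the OpenAI-compatible request format are assumed.

```python
from typing import Dict, List, Union


def build_embedding_payload(
    texts: List[str], model: str
) -> Dict[str, Union[str, List[str]]]:
    """Build an OpenAI-style embeddings request body that carries raw strings."""
    # Hypothetical helper: the strings are passed through untouched, so the
    # server-side model applies its own tokenizer and no client-side token
    # IDs can mismatch with it.
    return {"model": model, "input": list(texts)}


payload = build_embedding_payload(
    ["hello world", "second document"], model="my-local-model"
)
```

Sending strings keeps the client independent of whichever tokenizer the model behind LocalAI happens to use.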

Partly related to: #5256

Dependencies: No new dependencies

Twitter: @mudler_it

Maintainers: @rlancemartin, @eyurtsev, @hwchase17

Signed-off-by: mudler <mudler@localai.io>

hwchase17

let's add an example notebook for this
let's add this to langchain/embeddings/__init__.py

mudler · 2023-07-23T10:01:36Z

@hwchase17 done! I'll follow up along with #5256, and once we have complete integration with LocalAI I'll also update the documentation page accordingly.

Taking this opportunity to ask: is there any interest in adding e.g. voice capabilities? LocalAI supports TTS and audio-to-text as well.

baskaryan · 2023-07-24T19:16:23Z

looks awesome, thanks @mudler!

there definitely is interest in voice but adding other modalities is a big change that we want to be super thoughtful about, and we haven't had the time to think it through just yet. very open to suggestions on the interface if you're eager to see it in langchain

mkhludnev · 2024-12-16T21:01:40Z

libs/langchain/langchain/embeddings/localai.py

+
+# https://stackoverflow.com/questions/76469415/getting-embeddings-of-length-1-from-langchain-openaiembeddings
+def _check_response(response: dict) -> dict:
+    if any(len(d["embedding"]) == 1 for d in response["data"]):


Pardon me for commenting on old code, but the SO thread discusses the OpenAI service, so it hardly applies to LocalAI, does it?
This is why I want to remove this retry condition in the new integration package ( https://github.com/mkhludnev/langchain-localai ). WDYT?
This package was spun off from the discussion in #22399 (comment)
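
For readers without the full diff, the check under discussion can be sketched as follows. This is a minimal sketch, not the merged code: the excerpt above is truncated, so the exception type `EmptyEmbeddingError` and the error message are hypothetical stand-ins for whatever the real function raises.

```python
from typing import Any, Dict


class EmptyEmbeddingError(Exception):
    """Hypothetical error type standing in for whatever the real code raises."""


def check_response(response: Dict[str, Any]) -> Dict[str, Any]:
    # Treat any embedding of length 1 as a malformed result, which is the
    # symptom described in the linked Stack Overflow thread.
    if any(len(d["embedding"]) == 1 for d in response["data"]):
        raise EmptyEmbeddingError("received an embedding of length 1")
    return response
```

If the raised error is one that a retry wrapper catches, this check effectively turns a length-1 embedding into a retry, which is the behavior being questioned for LocalAI.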

mkhludnev · 2024-12-17T12:14:23Z

libs/langchain/langchain/embeddings/localai.py

+        """Call out to LocalAI's embedding endpoint."""
+        # handle large input text
+        if self.model.endswith("001"):
+            # See: https://github.com/openai/openai-python/issues/418#issuecomment-1525939500


Should we bother with OpenAI specifics when integrating with LocalAI? I don't think so. I'm going to remove it in https://github.com/mkhludnev/langchain-localai
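
The quoted excerpt stops before the body of the branch, so what it does is not shown here. Assuming it replaces newlines with spaces for models whose name ends in "001" (an assumption based on the topic of the linked openai-python issue, not on the visible diff), the OpenAI-specific step would look roughly like this sketch. The function name `prepare_text` and the model names in the usage lines are hypothetical.

```python
def prepare_text(text: str, model: str) -> str:
    """Apply the model-specific preprocessing step before embedding."""
    # Assumption: only models whose name ends in "001" get the newline
    # workaround; every other model receives the text unchanged.
    if model.endswith("001"):
        return text.replace("\n", " ")
    return text


flattened = prepare_text("first line\nsecond line", model="example-model-001")
unchanged = prepare_text("first line\nsecond line", model="example-model")
```

Dropping the branch, as proposed, would mean every model takes the second path and receives the text as written.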

dosubot bot added the Ɑ: embeddings and 🤖:enhancement labels Jul 22, 2023

mudler force-pushed the localai_embeddings branch from 99131ec to 0addb58 Compare July 22, 2023 17:36

Add embeddings for LocalAI

76ebaf2

Signed-off-by: mudler <mudler@localai.io>

mudler force-pushed the localai_embeddings branch from 0addb58 to 76ebaf2 Compare July 22, 2023 17:38

hwchase17 reviewed Jul 22, 2023

hwchase17 added the needs documentation label Jul 22, 2023

mudler added 2 commits July 23, 2023 11:59

Add LocalAI to embeddings __init__

d5ad48e

Signed-off-by: mudler <mudler@localai.io>

Add LocalAI embedding notebook

309abe2

Signed-off-by: mudler <mudler@localai.io>

fmt

d26e2c2

baskaryan removed the needs documentation label Jul 24, 2023

baskaryan added 2 commits July 24, 2023 11:30

fmt

64c14cf

merge

da0117f

vercel bot deployed to Preview – langchain July 24, 2023 18:39 View deployment

baskaryan merged commit ae28568 into langchain-ai:master Jul 24, 2023

mudler deleted the localai_embeddings branch July 25, 2023 18:31

mkhludnev reviewed Dec 16, 2024

mkhludnev mentioned this pull request Dec 17, 2024

[test] check retry on singleton embedding mkhludnev/langchain-localai#1

Closed

mkhludnev reviewed Dec 17, 2024
