[FR] Add Ollama as Embedding Provider #269

Closed
MiracleXYZ opened this issue Jan 28, 2024 · 11 comments

@MiracleXYZ

Unlike with chat, I can only use online providers for QA embeddings.

Any plans to support local embedding providers, e.g. Ollama? They do have an embedding endpoint...
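
For context, here's a minimal sketch of what a call to Ollama's embedding endpoint looks like, assuming the /api/embeddings route and default port from Ollama's API docs (the model name and prompt are placeholders):

# Hypothetical request against a locally running Ollama server (default port 11434).
curl http://localhost:11434/api/embeddings -d '{
  "model": "llama2",
  "prompt": "Here is a note about llamas..."
}'
# The response is JSON with a single vector, e.g. {"embedding": [0.57, -0.009, ...]}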

@logancyang
Owner

@MiracleXYZ thanks for the suggestion. I looked into it briefly before and asked in Ollama's Discord; from what I gathered, Ollama's embeddings aren't really the kind of embeddings needed for retrieval. For example, what does this embedding endpoint example even mean when the model is llama2?

If you get an answer please let me know. If they can serve embeddings for retrieval efficiently it will be the best option.

@MiracleXYZ
Author

@logancyang You're right. After some further research, I found that Ollama generates embeddings based on a chat model/base model. They are relatively slow and not that suitable for retrieval purposes.

(If you're using a base model rather than a chat model, the vectors will make more sense, but the process is still slow.)

What we usually use for retrieval are encoder-decoder or encoder-only embedding models. These models are fast and their results make much more sense, but they are currently unsupported by Ollama.

So, maybe we could add Ollama as an option and state that the results could be slow and suboptimal. Or maybe we could just keep it simple and not add it at all.

@logancyang
Owner

@MiracleXYZ I'll probably wait for Ollama to add the right solution. For now I'm still searching for the best way to serve retrieval embeddings locally and efficiently.

There was an attempt to use Hugging Face Transformers / Transformers.js for local embeddings in #245, but it hit a roadblock with Obsidian environment issues. The current best option may still be LocalAI, but its setup is not very user-friendly for non-technical folks. I really wish LM Studio or Ollama would ship this...

@MiracleXYZ
Author

@logancyang Yeah. Maybe Ollama could ship this feature once llama.cpp finishes adding BERT support...

@MiracleXYZ
Author

MiracleXYZ commented Feb 21, 2024

Ollama just pre-released v0.1.26, which added support for embedding models such as nomic-embed-text.

Update: it's been released now.
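
For anyone trying this out, a rough sketch of using one of the new embedding models, assuming the same /api/embeddings route and default port as above:

# Pull the dedicated embedding model (assumes Ollama v0.1.26 or later).
ollama pull nomic-embed-text

# Request an embedding for a piece of text.
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "The sky is blue because of Rayleigh scattering"
}'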

@swoh816

swoh816 commented Feb 25, 2024

On a note related to Ollama's version updates: I recently updated to Ollama v0.1.25, and I can no longer run OLLAMA_ORIGINS=app://obsidian.md* ollama serve. The error message says:

time=2024-02-25T02:49:44.816Z level=INFO source=images.go:706 msg="total blobs: 10"
time=2024-02-25T02:49:44.818Z level=INFO source=images.go:713 msg="total unused blobs removed: 0"
time=2024-02-25T02:49:44.818Z level=INFO source=routes.go:1014 msg="Listening on 127.0.0.1:11434 (version 0.1.25)"
time=2024-02-25T02:49:44.819Z level=INFO source=payload_common.go:107 msg="Extracting dynamic libraries..."
Error: unable to initialize llm library Radeon card detected, but permissions not set up properly.  Either run ollama as root, or add you user account to the render group.

As the error message suggests, I tried running the command with root permissions (using sudo), but then I got the following error message:

zsh: no matches found: OLLAMA_ORIGINS=app://obsidian.md*

I'm having difficulty mainly because I'm not familiar with the syntax of the command, but it worked with no problem at all before I updated to Ollama v0.1.25 and the latest Obsidian Copilot. Since I love Obsidian Copilot so much, I really want to get it back to work!

P.S. I didn't create a separate issue for this because it's probably an easy fix, and since this issue mentions the new Ollama version v0.1.26, I thought I could tag my issue along!

@MiracleXYZ
Author

@swoh816 Hi, I've tried the command on my Mac and it works fine. I'm on Ollama 0.1.27.

This looks like it might be an Ollama issue rather than a Copilot one. Maybe you could try creating a new issue in their repo; you'll likely get more relevant responses there.

@swoh816

swoh816 commented Feb 25, 2024

@MiracleXYZ Thanks so much for pointing me to the right place! :)

@HyperUpscale

> Ollama just pre-released v0.1.26, which added support for embedding models such as nomic-embed-text.
>
> Update: it's been released now.

I can't make it work... even more with embedding :]

Could you please update the settings menu so we can easily select Ollama, now that it has full compatibility with the OpenAI API spec, and also let us type or select an embedding model like "nomic-embed-text"?
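
For reference, a sketch of what Ollama's OpenAI-compatible endpoint looks like, assuming the /v1/chat/completions route added in recent Ollama releases (the model name is just an example); as far as I can tell, embeddings at this point still go through the native /api/embeddings route rather than an OpenAI-style /v1/embeddings one:

# Chat via the OpenAI-compatible API of a locally running Ollama server.
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
# Embeddings (e.g. nomic-embed-text) would still use the native endpoint shown earlier.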

@logancyang
Owner

> I can't make it work... even more with embedding :]

Your issue looks system-specific. Are you on Linux? Does simply running ollama serve work without error?

Also, try

OLLAMA_ORIGINS='app://obsidian.md*'

or

OLLAMA_ORIGINS=app://obsidian.md\*

or

export OLLAMA_ORIGINS='app://obsidian.md*' first and then ollama serve

TBH just ask GPT4 with your error message until it works.

@swoh816

swoh816 commented Feb 28, 2024

@logancyang Just a note, but somehow (either since I upgraded Ollama or since I upgraded obsidian-copilot) I now need to run the command OLLAMA_ORIGINS=app://obsidian.md* ollama serve with root permissions:

sudo OLLAMA_ORIGINS='app://obsidian.md*' ollama serve
(Note the sudo and the single quotes around 'app://obsidian.md*'.)

I didn't need root permissions prior to the upgrades, but I don't know whether it's the change in Ollama or the change in Obsidian Copilot that now requires them. I'm noting this because it's probably worth adding to the setup guide: https://github.com/logancyang/obsidian-copilot/blob/master/local_copilot.md

I'm running on Ubuntu 22.04.
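
On Ubuntu, where Ollama typically runs as a systemd service, an alternative to sudo in an interactive shell is to set the origin in the service environment. This is a sketch based on Ollama's documented Linux setup; the unit name assumes the default install:

# Open an override file for the Ollama service.
sudo systemctl edit ollama.service

# In the override, add under [Service]:
#   Environment="OLLAMA_ORIGINS=app://obsidian.md*"

# Reload units and restart the service.
sudo systemctl daemon-reload
sudo systemctl restart ollama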
