[FR] Add Ollama as Embedding Provider #269
@MiracleXYZ thanks for the suggestion. I looked into it briefly before and asked in Ollama's Discord; from what I gathered, the Ollama embedding isn't really the kind of embedding needed for retrieval. For example, what does this embedding endpoint example even mean when they have …? If you get an answer, please let me know. If they can serve embeddings for retrieval efficiently, it will be the best option.
@logancyang You're right. After some further research, I found that Ollama generates embeddings with a chat model/base model. They are relatively slow and not that suitable for retrieval purposes. (If you use a base model rather than a chat model, the vectors make more sense, but the process is still slow.) What we usually use are encoder-decoder or encoder-only embedding models: they are fast and their results make much more sense, but they are currently unsupported by Ollama. So maybe we could add Ollama as an option and note that the results could be slow and suboptimal. Or maybe we could just keep it simple and not add it at all.
@MiracleXYZ I'll probably wait for Ollama to add the right solution. For now I'm still searching for the best way to serve embeddings for retrieval locally and efficiently. #245 was an attempt to use Hugging Face Transformers / Transformers.js for local embeddings, but it hit a roadblock on Obsidian environment issues. The current best option may still be LocalAI, but its setup is not very user-friendly for non-technical folks. I really wish LM Studio or Ollama would ship this...
@logancyang Yeah. Maybe Ollama could ship this feature once llama.cpp finishes adding BERT support... |
Ollama just pre-released v0.1.26, which added support for embedding models such as nomic-embed-text.
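For anyone who wants to try it, here is a minimal TypeScript sketch of calling the new endpoint, assuming a local Ollama v0.1.26+ with the model already pulled (`ollama pull nomic-embed-text`); the `/api/embeddings` route and its `model`/`prompt` fields follow Ollama's API docs at that version, so adjust if yours differs:

```typescript
// Minimal sketch: request a retrieval embedding from a local Ollama instance.
// Assumes Ollama v0.1.26+ running on the default port with nomic-embed-text
// pulled; the response shape is { embedding: number[] }.
async function embed(text: string): Promise<number[]> {
  const res = await fetch("http://localhost:11434/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "nomic-embed-text", prompt: text }),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const { embedding } = await res.json();
  return embedding;
}

// Usage: const vec = await embed("note chunk to index");
// nomic-embed-text produces a 768-dimensional vector.
```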
On a note related to Ollama's version update: I recently updated to Ollama v0.1.25, and I can no longer run …
As the error message suggests, I tried running the command with root permission (using …).
I'm having difficulty mainly because I'm not familiar with the command's syntax, but it worked with no problem at all before I updated both Ollama (to v0.1.25) and Obsidian Copilot. Since I love Obsidian Copilot so much, I really want to get it working again! P.S. I didn't create a separate issue for this because it's probably an easy fix, and I found this issue mentioning the new Ollama version v0.1.26, so I thought I could tag my issue along!
@MiracleXYZ Thanks so much for pointing me to the right place! :)
I can't make it work either, especially the embeddings. Could you please update the settings menu so we can easily select Ollama (now that it has full compatibility with the OpenAI API spec), and also easily type or select an embedding model like "nomic-embed-text"?
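Since Ollama's OpenAI compatibility came up: here's a rough sketch of pointing the official `openai` client at a local instance; the `baseURL` and dummy-key pattern follow Ollama's OpenAI-compatibility docs, and the model name is just an example of something you'd have pulled locally:

```typescript
// Sketch: talk to a local Ollama through its OpenAI-compatible API.
// Assumes a recent Ollama (v0.1.24+) serving /v1 on the default port.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:11434/v1",
  apiKey: "ollama", // required by the client, ignored by Ollama
});

const chat = await client.chat.completions.create({
  model: "llama2", // example; any locally pulled model
  messages: [{ role: "user", content: "Hello from Obsidian Copilot!" }],
});
console.log(chat.choices[0].message.content);
```

Note that whether the `/v1/embeddings` route is also available may depend on your Ollama version, which is part of why wiring embeddings into the settings menu isn't trivial yet.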
Your issue looks system-specific. Are you on Linux? Does simply … work? Also, try …, or …, or …. TBH, just ask GPT-4 with your error message until it works.
@logancyang Just a note, but somehow (either since I upgraded Ollama or upgraded obsidian-copilot) I now need to run the command …
I didn't need root permission prior to the upgrades, but I don't know whether it's the change in Ollama or the change in Obsidian Copilot that requires me to run with root. I'm noting this because it's probably worth adding to the setup guide: https://github.com/logancyang/obsidian-copilot/blob/master/local_copilot.md. I'm running Ubuntu 22.04.
Unlike chatting, I can only use online providers for QA embedding.
Any plans to support local embedding providers, e.g. Ollama? They do have an embedding endpoint...