
Enhanced Docs: LLM Embedding Examples #844

Merged: 5 commits, Jan 2, 2025

Conversation

igorlima
Contributor

This PR enriches the documentation with a set of embedding examples showing how to generate embeddings with well-known hosted LLMs through the txtai library.

This enhancement was sparked by an issue seeking guidance on integrating Gemini with txtai. The hope is that these examples make someone else's journey smoother and more enjoyable!
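For context, the examples in the PR take roughly this shape. This is a sketch under assumptions: the model path is an illustrative LiteLLM-style path, not one taken from the PR, and the txtai calls are shown commented out since they require an installed library and a provider API key.

```python
# Sketch only: builds the kind of configuration the documented examples use.
# The model path below is an illustrative assumption.
config = {
    "path": "gemini/text-embedding-004",  # hosted provider/model path
    "content": True,                      # keep the original text for search
}

# With txtai installed and an API key configured, usage would look like:
# from txtai import Embeddings
# embeddings = Embeddings(**config)
# embeddings.index(["first document", "second document"])
# print(embeddings.search("query", 1))
```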

@igorlima
Contributor Author

Since it's my first PR for this library, I'm open to suggestions or improvements.

If everything looks good as it is, that's awesome! Feel free to share thoughts on organizing or listing the examples on the Examples page.

@igorlima igorlima marked this pull request as ready for review December 27, 2024 04:01
@davidmezzetti
Member

davidmezzetti commented Dec 27, 2024

Hello. Thank you for wanting to improve the documentation here! This is great!

What if this was a notebook instead? Then we could also publish a corresponding blog article like the others, which would give it more visibility. It could be titled something like "Getting started with LLM APIs", with a subtitle of "Generate embeddings and run LLMs with Gemini, VertexAI, Mistral, Cohere and AWS Bedrock".

Would it make sense to add OpenAI and Claude given those are the two most popular?

In a notebook, I believe the common code could also be consolidated into two functions, one for the LLM and one for the embeddings. Then there could be a section per provider that passes the model path and initializes any keys / other calls.

Additionally, I don't believe any of the method='litellm' calls are required. There is autodetection logic that should know it's a litellm model.
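A minimal sketch of the consolidation suggested above. The helper names and model paths are illustrative assumptions, and, per the autodetection note, no explicit `method='litellm'` is passed:

```python
# Hypothetical consolidation of the per-provider boilerplate.
# Function names and model paths are illustrative, not from the PR.

def llm_config(model_path):
    """Shared config for an LLM backed by a hosted API."""
    # No explicit method="litellm": a "provider/model" path
    # is assumed to be autodetected as a LiteLLM model.
    return {"path": model_path}

def embeddings_config(model_path):
    """Shared config for a hosted embeddings model."""
    return {"path": model_path, "content": True}

# Each provider section then only supplies its model path:
providers = {
    "gemini": embeddings_config("gemini/text-embedding-004"),
    "mistral": embeddings_config("mistral/mistral-embed"),
}
```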

@igorlima
Contributor Author

I've already moved it to a notebook; you can check it out here:

While I plan to add other models like OpenAI, Claude, and Groq later, here's a sneak peek at the current Table of Contents for the Notebook.

  • ToC (screenshot: notebook Table of Contents)

@davidmezzetti
Member

Looking great! Did you want to wait to merge until the other models were added?

@davidmezzetti davidmezzetti added this to the v8.2.0 milestone Dec 27, 2024
@davidmezzetti
Member

One other thing with Groq, does it have an embeddings API? I see it's using a local embeddings model.

@igorlima
Contributor Author

igorlima commented Dec 28, 2024

I've got another version of the Notebook. Feel free to check it out here:

Interestingly, Groq and Claude don't have their own embedding models. Instead, they refer to other providers:

  • Groq's blog highlights the jinaai/jina-embeddings-v2-base-en, which seems to be a locally hosted model. Read more here: Groq Blog
  • Claude's documentation points to the Voyage AI embedding model. More details are available here: Claude Anthropic Docs

In the config, I've defaulted both options to False. Do you think I should keep them there?

Interestingly, the final outputs show how much embedding search results can differ across models. The notebook provides a handy side-by-side comparison, helping us understand which embedding model best suits a given context.
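The kind of cross-model difference described above can be illustrated without any API calls. In this self-contained sketch (the vectors are fabricated for illustration, not real model outputs), two "models" embed the same documents differently, so the nearest neighbor to a query changes with the model choice:

```python
# Illustrative only: fabricated 2-d vectors stand in for two embedding
# models, showing that ranking depends on the model used.
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

docs = ["doc-a", "doc-b"]
vectors = {
    "model-1": {"doc-a": [1.0, 0.1], "doc-b": [0.2, 1.0]},
    "model-2": {"doc-a": [0.1, 1.0], "doc-b": [1.0, 0.2]},
}
query = [1.0, 0.0]

# Best match per model differs, mirroring the notebook's comparison
best = {
    model: max(docs, key=lambda d: cosine(vecs[d], query))
    for model, vecs in vectors.items()
}
```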

@davidmezzetti
Member

What if for the two that don't have embeddings APIs, you just use Voyage? Or there can be a note saying there is no embeddings API for this provider but they suggest "...".

The other notebooks all install txtai from GitHub rather than PyPI. I see the note about using a specific version, but the flip side is that pinning to a specific version misses important security updates. I've always taken the approach of upgrading and fixing problems as they arise (usually they're caught during the GitHub Actions build).

Once again this is really cool and I really appreciate it. Once this PR is merged, I'll get a corresponding article up with a note crediting you. Would you prefer me to link to your GitHub profile or LinkedIn profile in those articles?

@igorlima
Contributor Author

igorlima commented Dec 28, 2024

Here is the updated version of Notebook:

A couple of notes:

  • I left a note for Groq, Claude, and Voyage mentioning their minor limitations in this specific Notebook context.
  • I've also updated the txtai installation to point directly to the GitHub repository instead of PyPI.
  • If you want to give credit, I'd be thrilled if you could link to my GitHub profile. It's where all my geek stuff and coding come to flourish! Thank you so much for considering this. 😊

@davidmezzetti
Member

davidmezzetti commented Dec 30, 2024

All sounds great and thanks again!

I'll merge and create an article crediting you (linking to your GitHub profile). I may make a few minor edits but otherwise everything is looking good.

I'll follow up here once everything is ready. It will take me a few days before I get to this though.

@davidmezzetti davidmezzetti merged commit 991ea6a into neuml:master Jan 2, 2025
3 checks passed
@igorlima igorlima deleted the enhance-doc branch January 2, 2025 17:09
@davidmezzetti
Member

@igorlima Thank you for this contribution! This PR has been merged and the following articles published.

https://dev.to/neuml/getting-started-with-llm-apis-2j89
https://neuml.hashnode.dev/getting-started-with-llm-apis

@igorlima
Contributor Author

igorlima commented Jan 5, 2025

I saw this post on LinkedIn, too, and I must say that I liked it! Thanks for sharing.

@davidmezzetti
Member

Thank you for putting this together. I'm sure it will gain more and more traction as time goes on. It's a solid resource!
