
High quality resources & applications for LLMs, multi-modal models and VectorDBs


statlib/vectordb-recipes

 
 


VectorDB-recipes


Dive into building GenAI applications! This repository contains examples, applications, starter code, & tutorials to help you kickstart your GenAI projects.
  • These are built using LanceDB, a free, open-source, serverless vectorDB that requires no setup.
  • It integrates with the Python data ecosystem, so you can start using these in your existing data pipelines with pandas, Arrow, Pydantic, etc.
  • LanceDB has a native TypeScript SDK that lets you run vector search in serverless functions!


Join our community for support - Discord, Twitter

This repository is divided into 3 sections:

  • Examples - Get right into the code with minimal introduction, aimed at getting you from an idea to PoC within minutes!
  • Applications - Ready-to-use Python and web apps using applied LLMs, VectorDB and GenAI tools
  • Tutorials - A curated list of tutorials, blogs, Colabs and courses to get you started with GenAI in greater depth.

Examples

Applied examples that get right into the code with minimal introduction, aimed at getting you from an idea to PoC within minutes! Examples are available as:

  • Colab notebooks - build the application in stages, letting you inspect the results at every intermediate stage.
  • Python scripts - for cases where you'd like to use the file or snippets directly in your application.
  • JS/TS scripts - Some examples are written using LanceDB's native JS library! These scripts/snippets can also be integrated directly into your web applications.

If you're looking for in-depth, tutorial-like examples, check out the Tutorials section!

| Example | Interactive Envs | Scripts |
| --- | --- | --- |
| YouTube transcript search bot | Open In Colab | Python, JavaScript |
| Langchain: Code Docs QA bot | Open In Colab | Python, JavaScript |
| AI Agents: Reducing Hallucination | Open In Colab | Python, JavaScript |
| Multimodal CLIP: DiffusionDB | Open In Colab | Python |
| Multimodal CLIP: YouTube videos | Open In Colab | Python |
| TransformersJS Embedding example | | JavaScript |
| Movie Recommender | Open In Colab | Python |
| Product Recommender | Open In Colab | Python |
| Audio Search | Open In Colab | Python |
| Multimodal Image + Text Search | Open In Colab | Python |
| Evaluating Prompts with Prompttools | Open In Colab | |
| Arxiv paper recommender | Open In Colab | Python |
| Multi-lingual search | Open In Colab | Python |

Projects & Applications

These are ready-to-use applications built on the LanceDB serverless vector database. You can explore these open-source projects, use parts of them in your own projects, or build your applications on top of them.

| Project Name | Description | Screenshot |
| --- | --- | --- |
| YOLOExplorer | Iterate on your YOLO / CV datasets using SQL, vector semantic search, and more within seconds | (image) |
| Website Chatbot (Deployable Vercel Template) | Create a chatbot from the sitemap of any website/docs of your choice. Built using the vectorDB serverless native JavaScript package. | (image) |
| Multi-Modal Search Engine | Create a multi-modal search engine app to search images using either images or text | (image) |
| Chat with multiple URL/website | Conversational AI for any website with Mistral, BGE embeddings & LanceDB | (image) |
| HR chatbot | HR chatbot - ask your personal queries using a zero-shot ReAct agent & tools | (image) |
| Talk with YouTube Video using GPT4 Vision API | Talk with YouTube Video using GPT4 Vision API and Langchain | (image) |

Tutorials

Looking to get started with LLMs, vectorDBs, and the world of Generative AI? These in-depth tutorials and courses cover these concepts with practical follow-along Colabs where possible.

| Tutorial | Interactive Environment | Blog Link |
| --- | --- | --- |
| LLMs, RAG, & the missing storage layer for AI | | Read the Blog |
| Fine-Tuning LLM using PEFT & QLoRA | Open In Colab | Read the Blog |
| Context-Aware Chatbot using Llama 2 & LanceDB | Open In Colab | Read the Blog |
| A Primer on Text Chunking and its Types | Open In Colab | Read the Blog |
| NER-powered Semantic Search using LanceDB | Open In Colab | Read the Blog |

🌟 New! 🌟 Applied GenAI and VectorDB course on Udacity: learn about GenAI and vectorDBs using LanceDB in the recently launched Udacity course.

Contributing Examples

Create a new folder with either a main.py or index.js file. If you are writing solely in Python, be sure to also include a main.ipynb file that walks through your example. Additionally, please include a test.py file that contains pytest unit tests for your functions (or main). Take a look at some of the other examples, and please mock your API calls using pytest. If you are writing API calls in JavaScript, add your file to the files ignored within the compile_testing.js file in the root directory.
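
As a rough illustration of the mocking advice above, here is a minimal sketch of what a test.py could contain. The function names fetch_embedding and embed_texts and the endpoint URL are hypothetical and are not taken from any example in this repository. The mock comes from the standard library's unittest.mock, which pytest picks up without extra plugins.

```python
import json
import urllib.request
from unittest import mock

# Hypothetical endpoint, used only for illustration.
API_URL = "https://api.example.com/embed"


def fetch_embedding(text):
    """Call a (hypothetical) remote embedding API and return a list of floats."""
    payload = json.dumps({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        API_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["embedding"]


def embed_texts(texts):
    """Embed each text in turn; this is the function under test."""
    return [fetch_embedding(t) for t in texts]


def test_embed_texts_without_network():
    # Canned response body that the fake API "returns".
    body = json.dumps({"embedding": [0.1, 0.2]}).encode("utf-8")
    # Patch urlopen so the test never touches the network.
    with mock.patch("urllib.request.urlopen") as fake_urlopen:
        fake_urlopen.return_value.__enter__.return_value.read.return_value = body
        result = embed_texts(["hello", "world"])
    assert result == [[0.1, 0.2], [0.1, 0.2]]
    assert fake_urlopen.call_count == 2
```

Running pytest on a file like this should collect test_embed_texts_without_network and exercise embed_texts without any real API call.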

If you require a dataset to be downloaded before you can run either file, please run the download command from within your test.py file, like this:

import subprocess
subprocess.Popen("wget dataset.zip", shell=True).wait()
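
One possible refinement of the download line above, sketched under assumptions: skip the download when the file is already present, and raise an error if wget exits with a non-zero code. The helper name ensure_dataset, its downloader parameter, and the example URL are hypothetical. The default path also assumes wget is installed on the machine running the tests.

```python
import subprocess
from pathlib import Path


def ensure_dataset(url, dest, downloader=None):
    """Download `url` to `dest` unless the file already exists.

    `downloader` is a hypothetical injection point (a callable taking
    url and destination path) so the helper can be tested offline.
    Returns True if a download was triggered, False if the file was reused.
    """
    dest = Path(dest)
    if dest.exists():
        return False  # reuse the file from a previous run

    if downloader is None:
        def downloader(u, d):
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(["wget", "-O", str(d), u], check=True)

    dest.parent.mkdir(parents=True, exist_ok=True)
    downloader(url, dest)
    return True
```

A test.py could then call something like ensure_dataset("https://example.com/dataset.zip", "data/dataset.zip") before its tests run, with the placeholder URL replaced by the real dataset location.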

Note: If you're not sure about the steps, please simply open a PR with your example and we'll be happy to help you out!
