A minimal Node.js RESTful server for a Retrieval-Augmented Generation (RAG) system using ChromaDB and Ollama!
- Clone this repo:
  ```bash
  git clone https://github.com/vteam27/RagXOllama
  cd RagXOllama
  ```
- Add your documents to the `Docs/` folder.
- Simply build the Docker containers using:
  ```bash
  docker compose up
  ```
- Wait for the data to be ingested into the DB and the LLM model to be downloaded.
- Go to `http://localhost:3000`.
- Ollama: to serve open-source LLMs locally (one-time setup)
  ```bash
  ollama serve
  ollama run llama3
  ```
- ChromaDB backend: spin up the ChromaDB core (a quick connectivity check is sketched after this list)
  ```bash
  docker pull chromadb/chroma
  docker run -p 8000:8000 chromadb/chroma
  ```
- Add your PDF documents to `Docs/`.
- Install dependencies and run!
  ```bash
  git clone https://github.com/vteam27/RagXOllama
  cd RagXOllama
  npm install
  node app.js
  ```
- Go to `http://localhost:3000` to chat using the web demo.
- Alternatively, run `node pipeline.js` to interact from the terminal for easy development and testing.
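Before moving on, it can help to confirm both services are actually reachable. The snippet below is a small sketch (not part of this repo) that assumes the default ports (8000 for ChromaDB, 11434 for Ollama), Chroma's v1 REST API, and Node 18+ for the built-in `fetch`:

```js
// check-services.js: hypothetical helper, not part of this repo.
// Pings ChromaDB and Ollama on their default local ports.

async function checkServices() {
  // ChromaDB exposes a heartbeat endpoint for health checks.
  const chroma = await fetch('http://localhost:8000/api/v1/heartbeat');
  console.log('ChromaDB heartbeat:', await chroma.json());

  // Ollama lists the locally pulled models at /api/tags.
  const ollama = await fetch('http://localhost:11434/api/tags');
  const { models } = await ollama.json();
  console.log('Ollama models:', models.map((m) => m.name));
}

checkServices().catch((err) => {
  console.error('Service check failed:', err.message);
  process.exit(1);
});
```

If both calls succeed, the server and the terminal pipeline should be able to reach their backends.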
- Chat with private documents securely!
- Search and query your data using natural language alone!
- Keep all your LLMs up to date with the latest data.
- Use any open source LLM of your choice. Browse LLMs.
- This repo can be used as a template to easily integrate LLMs with RAG functionality into your MERN (or any other Node.js) stack apps.
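Since the repo is meant to be dropped into an existing Node.js app, here is a minimal sketch of what that integration point could look like: a single Express route that delegates to a RAG pipeline function. The `/chat` path, request shape, and `ragPipeline` import are illustrative assumptions, not the repo's actual API.

```js
// server-sketch.js: illustrative only; route path and helper names are assumptions.
const express = require('express');
const { ragPipeline } = require('./pipeline'); // hypothetical export

const app = express();
app.use(express.json());

// POST /chat { "query": "..." } -> { "answer": "..." }
app.post('/chat', async (req, res) => {
  try {
    const { query } = req.body;
    if (!query) return res.status(400).json({ error: 'query is required' });
    const answer = await ragPipeline(query); // retrieve -> augment -> generate
    res.json({ answer });
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});

app.listen(3000, () => console.log('RAG server on http://localhost:3000'));
```

Any Express (or MERN) app can mount a route like this alongside its existing ones.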
- Ingestion: Feed relevant information into the ChromaDB vector store collection.
- Retrieve: We retrieve the most relevant (top-n) chunks of factual information stored in ChromaDB, using an algorithm that scores each chunk's similarity to the user query.
- Augment: We append this retrieved context to our prompt.
- Generate: We feed the augmented prompt to an `8B llama3 4-bit quantized` model served by Ollama to generate the desired response (see the sketch below).
View the `logs.txt` file for the full output of my code.
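For concreteness, the whole ingest → retrieve → augment → generate loop fits in a few lines of Node.js. The sketch below uses the `chromadb` npm client and Ollama's local HTTP endpoint; the collection name, sample document, prompt template, and top-n value are illustrative assumptions, not necessarily what this repo does:

```js
// pipeline-sketch.js: minimal illustration of the four steps above.
const { ChromaClient } = require('chromadb');

async function answer(query) {
  // 1. Ingestion (normally done once by the data loader).
  // Assumes the client's default embedding function is available;
  // otherwise pass your own via the embeddingFunction option.
  const chroma = new ChromaClient({ path: 'http://localhost:8000' });
  const collection = await chroma.getOrCreateCollection({ name: 'docs' }); // assumed name
  await collection.add({
    ids: ['doc1'],
    documents: ['RagXOllama is a Node.js RAG server using ChromaDB and Ollama.'],
  });

  // 2. Retrieve: top-n chunks ranked by embedding similarity to the query.
  const results = await collection.query({ queryTexts: [query], nResults: 3 });
  const context = results.documents[0].join('\n');

  // 3. Augment: prepend the retrieved context to the prompt.
  const prompt = `Answer using only this context:\n${context}\n\nQuestion: ${query}`;

  // 4. Generate: call the llama3 model through Ollama's local HTTP API.
  const res = await fetch('http://localhost:11434/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'llama3', prompt, stream: false }),
  });
  const { response } = await res.json();
  return response;
}

answer('What is RagXOllama?').then(console.log).catch(console.error);
```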
With the help of Retrieval-Augmented Generation (RAG) we can achieve:
- Improved Factual Accuracy: By relying on retrieved information, RAG systems can ensure answers are grounded in real-world data.
- Domain Specificity: RAG allows you to integrate domain-specific knowledge bases, making the system more knowledgeable in a particular area.
- Adaptability to New Information: By using external knowledge sources, RAG systems can stay up-to-date with the latest information, even if the LLM itself wasn't specifically trained on it.
- Set up ChromaDB and Ollama
- Build a basic RAG pipeline
- Build a data loader to chunk and ingest data into ChromaDB (see the loader sketch after this list)
- Add support for text, PDF, and DOCX files
- Implement a REST API architecture
- Build a demo UI for easy interaction
- Containerize and publish the image to Docker Hub
- All done for now :)
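For reference, the data-loader item above could be implemented along these lines. The `pdf-parse` and `mammoth` packages, the chunk size, and the overlap are assumptions about one workable approach rather than this repo's exact implementation:

```js
// loader-sketch.js: illustrative only; library choices and chunk size are assumptions.
const fs = require('fs/promises');
const path = require('path');
const pdf = require('pdf-parse');   // extracts text from PDFs
const mammoth = require('mammoth'); // extracts text from DOCX files

// Read one file from Docs/ and return its plain text.
async function extractText(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  if (ext === '.pdf') {
    const data = await pdf(await fs.readFile(filePath));
    return data.text;
  }
  if (ext === '.docx') {
    const result = await mammoth.extractRawText({ path: filePath });
    return result.value;
  }
  return fs.readFile(filePath, 'utf8'); // .txt and friends
}

// Naive fixed-size chunking with a small overlap so context isn't cut mid-idea.
function chunk(text, size = 1000, overlap = 100) {
  const chunks = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
  }
  return chunks;
}

module.exports = { extractText, chunk };
```

Each chunk can then be passed to `collection.add()` as shown in the pipeline sketch above.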