A production-ready RAG (Retrieval-Augmented Generation) system built with:
- 💡 OpenAI for embeddings and completions
- 📦 ChromaDB as the local vector store
- ⚡ FastAPI for serving the pipeline as an HTTP API
- 🧱 Modular code structure with utilities, configs, and logging
## Project Structure

```
rag-gemini-app/
├── app/                 # Core logic: ingestion, retrieval, generation
│   ├── ingest.py
│   ├── retriever.py
│   ├── generator.py
│   ├── rag_pipeline.py
│   ├── utils.py
│   └── __init__.py
├── api/                 # FastAPI server
│   ├── main.py
│   ├── routes.py
│   └── schemas.py
├── data/                # Your input text files
│   └── documents/
├── vector_store/        # Chroma persistence
├── .env                 # API keys and environment variables
├── .gitignore
├── requirements.txt
├── run.py               # CLI interface for RAG
└── README.md
```
## Setup

```bash
git clone https://github.com/yourusername/rag-gemini-app.git
cd rag-gemini-app
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -r requirements.txt
```
Create a `.env` file in the project root with your API key:

```env
OPENAI_API_KEY=your-openai-key-here
```
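The app presumably reads this key at startup. A minimal sketch of how that might look with `python-dotenv` (an assumption; check `requirements.txt` and the code under `app/` for the actual loading logic):

```python
# Hypothetical startup snippet: load OPENAI_API_KEY from .env.
# Assumes python-dotenv is installed; the project's real loading
# code may differ.
import os

from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into the environment

api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise RuntimeError("OPENAI_API_KEY is not set; check your .env file")
```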
Place your `.txt` files into `data/documents/`, then run:
```bash
python app/ingest.py
```

Query from the command line:

```bash
python run.py
```

Or start the API server:

```bash
uvicorn api.main:app --reload
```
Then open: http://localhost:8000/docs
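The actual routes are defined in `api/routes.py` and `api/schemas.py`, which are not shown here. Purely as an illustration, a client call against a hypothetical `POST /query` endpoint might look like this, using the `requests` library (not necessarily a project dependency):

```python
# Illustrative client call. The /query path and the request/response
# fields are assumptions -- check http://localhost:8000/docs for the
# real schema generated from api/schemas.py.
import requests

resp = requests.post(
    "http://localhost:8000/query",
    json={"question": "What does the ingestion step do?"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```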
## How It Works

- **Ingestion**:
  - Text files → chunked → embedded using OpenAI
  - Stored in ChromaDB with metadata
- **Retrieval**:
  - Query embedded → similarity search via Chroma
- **Generation**:
  - Top-k chunks + query → prompt sent to OpenAI ChatCompletion (sketched below)
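The pipeline above condenses to a few SDK calls. Here is a minimal end-to-end sketch, assuming the `openai` 1.x SDK and a `chromadb` persistent client; the model names, chunk size, and collection name are illustrative placeholders, and the real implementations live in `app/ingest.py`, `app/retriever.py`, and `app/generator.py`:

```python
# Minimal RAG pipeline sketch: chunk -> embed -> store -> retrieve -> generate.
# Model names and chunking strategy are assumptions, not the project's exact choices.
import chromadb
from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
chroma = chromadb.PersistentClient(path="vector_store")
collection = chroma.get_or_create_collection("documents")

def embed(texts: list[str]) -> list[list[float]]:
    resp = openai_client.embeddings.create(
        model="text-embedding-3-small", input=texts
    )
    return [item.embedding for item in resp.data]

# Ingestion: naive fixed-size chunking, then embed and store with metadata.
def ingest(doc_id: str, text: str, chunk_size: int = 500) -> None:
    chunks = [text[i : i + chunk_size] for i in range(0, len(text), chunk_size)]
    collection.add(
        ids=[f"{doc_id}-{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=embed(chunks),
        metadatas=[{"source": doc_id, "chunk": i} for i in range(len(chunks))],
    )

# Retrieval + generation: embed the query, pull top-k chunks, build a prompt.
def answer(query: str, k: int = 3) -> str:
    hits = collection.query(query_embeddings=embed([query]), n_results=k)
    context = "\n\n".join(hits["documents"][0])
    resp = openai_client.chat.completions.create(  # modern form of ChatCompletion
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the given context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
        ],
    )
    return resp.choices[0].message.content
```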
## Requirements

- Python 3.9+
- OpenAI API key
- Internet access for embedding and LLM calls
## Roadmap

- Improve chunking with NLP tooling (spaCy, LangChain)
- Add document upload via the API
- Add `/ingest` and `/health` endpoints
- Optional: swap ChromaDB for FAISS or Qdrant in production (see the sketch below)
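On the last item: FAISS stores raw vectors only, so a swap would also need a side mapping from vector ids back to chunk text and metadata. A rough, purely illustrative sketch of the FAISS side (the repository contains no FAISS code today):

```python
# Illustrative only: what the Chroma -> FAISS swap might look like.
# Requires the faiss-cpu (or faiss-gpu) package.
import faiss
import numpy as np

dim = 1536  # must match the embedding model's output size (assumption)
index = faiss.IndexFlatL2(dim)

vectors = np.random.rand(10, dim).astype("float32")  # stand-in embeddings
index.add(vectors)

query = np.random.rand(1, dim).astype("float32")
distances, ids = index.search(query, 3)  # top-3 nearest chunks by L2 distance
```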
## License

MIT
Made with ❤️ by Jenish Thapa