A full-stack Retrieval-Augmented Generation (RAG) chatbot built with FastAPI (backend) and Streamlit (frontend). It allows users to upload documents, ask questions about them, and receive AI-generated answers based on document content.
Backend:
- FastAPI: RESTful API with efficient request handling
- LangChain: RAG pipeline implementation
- OpenAI API: LLM for question answering
- Sentence Transformers: Alternative document embeddings
Frontend:
- Streamlit: Interactive web interface
Database:
- SQLite: Chat message history, document metadata, and session state
- ChromaDB / Pinecone: Vector storage for document embeddings
Debugging/Tracing:
- LangSmith: Debug and trace LangChain components
Features:
- Interactive Chat Interface: Chat with streaming responses
- Document Management: Upload, view, and delete PDFs, DOCX, and HTML files
- Multi-Session Support: Create and manage multiple chat sessions
- Model Selection: Choose between GPT-4o and GPT-4o-mini
- Vector Database Options: Use ChromaDB (local) or Pinecone (cloud)
- Embedding Model Options: HuggingFace models (local) or OpenAI (cloud); a wiring sketch follows this list
- Source Attribution: View source documents used for responses
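The vector database and embedding model choices map to interchangeable LangChain components. Below is a minimal sketch of how that selection could be wired, assuming the langchain-openai, langchain-huggingface, langchain-chroma, and langchain-pinecone integration packages; the function names, model names, and paths are illustrative, not the repo's actual code.

```python
# Sketch: wiring the selectable embedding model and vector store.
# Model names, paths, and function names are illustrative, not the repo's code.
from langchain_openai import OpenAIEmbeddings
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_chroma import Chroma
from langchain_pinecone import PineconeVectorStore

def get_embeddings(provider: str):
    # "huggingface" runs locally via Sentence Transformers; "openai" calls the API.
    if provider == "huggingface":
        return HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    return OpenAIEmbeddings(model="text-embedding-3-small")

def get_vectorstore(backend: str, embeddings):
    # "chroma" persists locally; "pinecone" uses the index configured in .env
    # (PINECONE_API_KEY must be set in the environment).
    if backend == "chroma":
        return Chroma(persist_directory="./chroma_db", embedding_function=embeddings)
    return PineconeVectorStore(index_name="your-pinecone-index-name", embedding=embeddings)
```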
Prerequisites:
- Conda or Python 3.10+
Clone the Repository:
```bash
git clone https://github.com/jmayank23/rag-application.git
cd rag-application
```
Create and Activate a Conda Environment:
```bash
conda create --name rag-application python=3.10
conda activate rag-application
```
Install Dependencies:
```bash
pip install -r backend/requirements.txt
```
Set Up Environment Variables:
Create a `.env` file in the project directory with:
```
OPENAI_API_KEY="your-openai-api-key"
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY="your-langchain-api-key"
LANGCHAIN_PROJECT="rag-application"

# Optional, for Pinecone
PINECONE_API_KEY="your-pinecone-api-key"
PINECONE_ENVIRONMENT="your-pinecone-environment"
PINECONE_INDEX_NAME="your-pinecone-index-name"
```
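The backend reads these values at startup. A minimal sketch of how they can be loaded with python-dotenv (the repo's actual startup code may load them differently):

```python
# Sketch: loading the .env values with python-dotenv. The repo's actual
# startup code may load them differently.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

openai_api_key = os.getenv("OPENAI_API_KEY")
pinecone_api_key = os.getenv("PINECONE_API_KEY")  # only needed when using Pinecone

if not openai_api_key:
    raise RuntimeError("OPENAI_API_KEY is not set; check your .env file")
```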
Run the Backend Server:
```bash
cd backend
uvicorn main:app --reload
```
Run the Frontend Application:
In a new terminal:
```bash
cd frontend
streamlit run app.py
```
Access the Application:
Open `http://localhost:8501` in your browser.
Usage:
- Upload Documents: Use the sidebar to upload PDFs, DOCX, and HTML files
- Chat: Enter queries in the chat input and receive AI responses
- Manage Documents: View and delete documents from the sidebar
- Session Management: Create, switch between, and delete chat sessions
- Model Selection: Choose your preferred LLM, vector database, and embedding model
- LangSmith Trace: View detailed pipeline traces in LangSmith
- API Documentation: Access Swagger UI at `http://localhost:8000/docs`
The application follows a client-server architecture with:
- Frontend (Streamlit)
  - User interface for chat, document management, and settings
  - Communicates with the backend via REST API calls (see the sketch after this list)
  - Manages UI state and user interactions
- Backend (FastAPI)
  - REST API endpoints for chat, document management, and file serving
  - RAG pipeline implementation using LangChain
  - Document processing and indexing
  - Database management
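For illustration, a minimal sketch of the frontend-to-backend call pattern; the `/chat` endpoint and the payload fields are hypothetical placeholders, not the repo's actual API.

```python
# Sketch of the frontend-to-backend call pattern. The /chat endpoint and the
# payload fields are hypothetical placeholders, not the repo's real API.
import requests
import streamlit as st

BACKEND_URL = "http://localhost:8000"

question = st.chat_input("Ask a question about your documents")
if question:
    response = requests.post(
        f"{BACKEND_URL}/chat",  # hypothetical endpoint name
        json={"question": question, "session_id": st.session_state.get("session_id")},
        timeout=60,
    )
    response.raise_for_status()
    st.write(response.json().get("answer"))
```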
Document Processing:
- Uploaded documents are split into chunks with appropriate overlap
- Chunks are embedded using the selected embedding model
- Embeddings are stored in the vector database with metadata (see the sketch after this list)
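A minimal sketch of this chunk-embed-store flow using LangChain building blocks; the loader, chunk size, and overlap values are illustrative assumptions, not necessarily the repo's settings.

```python
# Sketch of the chunk -> embed -> store flow. The loader, chunk size, and
# overlap are illustrative assumptions, not necessarily the repo's settings.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

docs = PyPDFLoader("example.pdf").load()  # one Document per page, with metadata

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)   # chunks keep the source metadata

vectorstore = Chroma(
    persist_directory="./chroma_db",
    embedding_function=OpenAIEmbeddings(model="text-embedding-3-small"),
)
vectorstore.add_documents(chunks)         # embeds and stores chunks + metadata
```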
Query Processing:
- User questions are contextualized using chat history
- Relevant document chunks are retrieved from the vector database
- The LLM generates responses using the question, retrieved context, and chat history (see the sketch after this list)
- Responses are streamed to the frontend with source attribution
- Conversations are stored in the database
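A minimal sketch of this query path using LangChain's history-aware retrieval helpers; the prompts, model, and store settings are simplified assumptions, and streaming is omitted for brevity.

```python
# Sketch of the query path: contextualize with history -> retrieve -> answer.
# Prompts, model, and store settings are simplified assumptions.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

llm = ChatOpenAI(model="gpt-4o-mini")
retriever = Chroma(
    persist_directory="./chroma_db",
    embedding_function=OpenAIEmbeddings(model="text-embedding-3-small"),
).as_retriever()

# Rewrite the user's question into a standalone question using chat history.
contextualize_prompt = ChatPromptTemplate.from_messages([
    ("system", "Rewrite the question as a standalone question given the chat history."),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])
# Answer from the retrieved chunks, which are injected as {context}.
answer_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer using only the provided context:\n\n{context}"),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}"),
])

rag_chain = create_retrieval_chain(
    create_history_aware_retriever(llm, retriever, contextualize_prompt),
    create_stuff_documents_chain(llm, answer_prompt),
)

result = rag_chain.invoke({"input": "What does the uploaded report conclude?", "chat_history": []})
print(result["answer"])   # generated response
print(result["context"])  # retrieved chunks, usable for source attribution
```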
The application uses SQLite with three primary tables:
- application_logs: Chat message history
- document_store: Document metadata
- sessions: Chat session configuration and state (plausible schemas are sketched below)
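The README does not list the table columns; the sketch below shows plausible minimal schemas for these three tables. All column names and the database filename are assumptions, not the repo's actual schema.

```python
# Sketch: plausible minimal schemas for the three tables. Column names and
# the database filename are assumptions; the repo's actual schema may differ.
import sqlite3

conn = sqlite3.connect("rag_app.db")  # hypothetical filename
conn.executescript("""
CREATE TABLE IF NOT EXISTS application_logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT,
    user_query TEXT,
    ai_response TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS document_store (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    filename TEXT,
    upload_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS sessions (
    session_id TEXT PRIMARY KEY,
    model TEXT,
    vector_db TEXT,
    embedding_model TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
""")
conn.commit()
```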
- Application logs are stored in `backend/app.log`
- LangSmith provides detailed tracing of the RAG pipeline execution
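A minimal sketch of file logging that would produce `backend/app.log` with the standard library; the format and level are assumptions, not the repo's actual configuration.

```python
# Sketch: writing logs to backend/app.log with the standard library logger.
# The filename matches the README; format and level are assumptions.
import logging

logging.basicConfig(
    filename="app.log",  # resolves to backend/app.log when run from backend/
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s - %(message)s",
)
logger = logging.getLogger(__name__)
logger.info("Chat request received")
```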
Future Work:
- Make documents accessible per chat session
- Deploy to AWS