A Python-based RAG system that processes PDF and DOCX files, creates embeddings for document chunks, uses FAISS for efficient similarity search, and generates responses based on retrieved contexts.
**Document Processing**
- Support for PDF and DOCX files
- Configurable text chunking with overlap
- Automatic handling of document boundaries
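The overlapping-chunk behaviour above can be sketched as follows. `chunk_text` is an illustrative helper, not the library's actual API; the real `DocumentProcessor` may split by tokens or characters rather than whitespace words:

```python
def chunk_text(text, chunk_size=256, chunk_overlap=64):
    """Split text into fixed-size word chunks with overlap.

    Illustrative sketch only: each chunk starts (chunk_size - chunk_overlap)
    words after the previous one, so consecutive chunks share
    chunk_overlap words of context.
    """
    words = text.split()
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # final chunk reached; avoid tiny trailing fragments
    return chunks

chunks = chunk_text("word " * 600, chunk_size=256, chunk_overlap=64)
print(len(chunks))  # 3 chunks: starts at word 0, 192, and 384
```

The overlap ensures a sentence cut at a chunk boundary still appears whole in the neighbouring chunk.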
**Embedding Creation**
- Uses SentenceTransformer models
- Efficient caching system for embeddings
- Support for different embedding models
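The caching idea can be sketched with a file-backed store keyed by a hash of the text. Everything here is an assumption for illustration: `embed` is a stand-in for a real model call (such as `SentenceTransformer('all-mpnet-base-v2').encode`), and the class name is hypothetical:

```python
import hashlib
import json
import os

def embed(texts):
    # Stand-in for a real embedding model; returns toy 2-d vectors
    # so the caching logic can be demonstrated without model weights.
    return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]

class EmbeddingCache:
    """Sketch of a file-backed embedding cache keyed by SHA-256 of the text."""
    def __init__(self, cache_dir="cache"):
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return os.path.join(self.cache_dir, key + ".json")

    def get_embeddings(self, texts):
        # Embed only the texts we have not seen before...
        misses = [t for t in texts if not os.path.exists(self._path(t))]
        for t, vec in zip(misses, embed(misses)):
            with open(self._path(t), "w") as f:
                json.dump(vec, f)
        # ...then serve everything from the cache.
        results = []
        for t in texts:
            with open(self._path(t)) as f:
                results.append(json.load(f))
        return results

cache = EmbeddingCache("cache_demo")
vecs = cache.get_embeddings(["hello world", "hello world"])
print(vecs[0] == vecs[1])  # True: identical text yields identical embedding
```

Repeated queries and re-indexing runs then skip the expensive model call entirely.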
**Similarity Search**
- FAISS-based vector search
- Multiple index types (Flat, IVF, HNSW)
- Optimizable search parameters
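A flat index performs exact nearest-neighbour search: it scores the query against every stored vector and keeps the top k. The NumPy sketch below shows that behaviour (it is a stand-in for what FAISS does internally, not FAISS itself):

```python
import numpy as np

def flat_search(index_vectors, query, k=3):
    """Exact top-k search, as a flat index performs it.

    Uses inner product on L2-normalised vectors, which equals
    cosine similarity.
    """
    index_norm = index_vectors / np.linalg.norm(index_vectors, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = index_norm @ q                # similarity to every stored vector
    top = np.argsort(-scores)[:k]          # indices of the k best matches
    return top, scores[top]

rng = np.random.default_rng(0)
vectors = rng.normal(size=(100, 8)).astype("float32")
ids, scores = flat_search(vectors, vectors[42], k=3)
print(int(ids[0]))  # 42: the query vector is its own nearest neighbour
```

IVF and HNSW trade a little of this exactness for much lower search cost by visiting only a subset of the stored vectors.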
**Response Generation**
- Context-based response generation
- Multiple context combination strategies
- Configurable context length
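The two combination strategies can be sketched as below; the function name, signature, and truncation behaviour are illustrative assumptions, not the library's actual internals:

```python
def combine_contexts(contexts, scores, strategy="concatenate", max_length=1000):
    """Sketch of the two context-combination strategies.

    contexts/scores are the retrieved chunks and their similarity scores.
    """
    if strategy == "best_match":
        # Keep only the single highest-scoring context.
        best = max(range(len(scores)), key=scores.__getitem__)
        combined = contexts[best]
    else:  # "concatenate"
        # Join all retrieved contexts into one passage.
        combined = "\n\n".join(contexts)
    return combined[:max_length]  # enforce the configurable context length

ctxs = ["alpha context", "beta context", "gamma context"]
print(combine_contexts(ctxs, [0.2, 0.9, 0.5], strategy="best_match"))
# best_match keeps only "beta context"
```

`concatenate` gives the generator more evidence at the cost of a longer prompt; `best_match` keeps the prompt short when one chunk clearly dominates.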
**Optimization Features**
- Hyperparameter tuning
- Index parameter optimization
- Performance metrics tracking
- Caching system
- Batch processing support
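The hyperparameter tuning listed above can be sketched as an exhaustive grid search. This is a toy illustration of the idea, with a made-up objective; it is not the system's actual `optimize_retrieval` implementation:

```python
from itertools import product

def grid_search(evaluate, param_grid):
    """Try every parameter combination and keep the best-scoring one.

    `evaluate` is an assumed scoring callback (e.g. retrieval quality
    on a set of held-out queries).
    """
    best_score, best_params = float("-inf"), None
    keys = list(param_grid)
    for values in product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = evaluate(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

# Toy objective: prefer mid-sized chunks with moderate overlap.
objective = lambda p: -abs(p["chunk_size"] - 256) - abs(p["chunk_overlap"] - 64)
params, score = grid_search(objective, {
    "chunk_size": [128, 256, 512],
    "chunk_overlap": [0, 64, 128],
})
print(params)  # {'chunk_size': 256, 'chunk_overlap': 64}
```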
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd rag-system
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
```python
from rag_system import RAGSystem

# Initialize the system
rag = RAGSystem(
    embedding_model='all-mpnet-base-v2',
    chunk_size=256,
    chunk_overlap=64,
    cache_dir='cache'
)

# Add documents
rag.add_documents('path/to/documents')

# Query the system
response = rag.query("What is the main topic of the documents?")
print(response)
```
The system can process both PDF and DOCX files:
```python
# Process all documents in a directory
rag.add_documents('path/to/documents')

# Process a single document
rag.add_document('path/to/document.pdf')
```
```python
# Simple query
response = rag.query("What are the key findings?")

# Query with metadata
response = rag.query(
    "What methodology was used?",
    k=5,  # Number of contexts to retrieve
    include_metadata=True
)
```
The system supports different FAISS index types:
```python
# Flat index (exact search)
rag = RAGSystem(index_type='flat')

# IVF index (approximate search with clustering)
rag = RAGSystem(index_type='ivf')

# HNSW index (graph-based approximate search)
rag = RAGSystem(index_type='hnsw')
```
Choose how to combine multiple relevant contexts:
```python
# Concatenate all relevant contexts
rag = RAGSystem(context_strategy='concatenate')

# Use only the best matching context
rag = RAGSystem(context_strategy='best_match')
```
```python
# Optimize retrieval parameters
best_params = rag.optimize_retrieval([
    "What are the key findings?",
    "What methodology was used?",
    "What are the main conclusions?"
])
```
```python
# Save the index
rag.save_index('path/to/index')

# Load the index
rag.load_index('path/to/index')
```
- `DocumentProcessor`: Handles document loading and text chunking
- `EmbeddingEngine`: Creates and manages embeddings
- `RetrievalEngine`: Handles the FAISS index and similarity search
- `ResponseGenerator`: Generates responses from retrieved contexts
- `RAGSystem`: Main class that orchestrates all components
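The components above might be wired together as in this minimal sketch. The internal method names (`process`, `embed`, `add`, `search`, `generate`) are illustrative assumptions, not the library's actual API:

```python
class RAGSystem:
    """Illustrative orchestration skeleton for the components above."""
    def __init__(self, processor, embedder, retriever, generator):
        self.processor = processor    # DocumentProcessor
        self.embedder = embedder      # EmbeddingEngine
        self.retriever = retriever    # RetrievalEngine
        self.generator = generator    # ResponseGenerator

    def add_documents(self, path):
        chunks = self.processor.process(path)      # load + chunk
        vectors = self.embedder.embed(chunks)      # chunk -> vector
        self.retriever.add(vectors, chunks)        # index for search

    def query(self, question, k=5):
        qvec = self.embedder.embed([question])[0]
        contexts = self.retriever.search(qvec, k)  # top-k chunks
        return self.generator.generate(question, contexts)

# Tiny stand-in components to show the data flow end to end:
class Stub:
    def __init__(self, **fns): self.__dict__.update(fns)

rag = RAGSystem(
    processor=Stub(process=lambda path: ["chunk one", "chunk two"]),
    embedder=Stub(embed=lambda texts: [[float(len(t))] for t in texts]),
    retriever=Stub(add=lambda v, c: None, search=lambda q, k: ["chunk one"]),
    generator=Stub(generate=lambda q, ctx: f"Answer based on: {ctx[0]}"),
)
rag.add_documents("docs/")
print(rag.query("What is this?"))  # Answer based on: chunk one
```

Each component hides one concern, so an index type or embedding model can be swapped without touching the rest of the pipeline.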
**Caching**
- Enable caching by providing a cache directory
- Embeddings and indices are cached for reuse
- Reduces computation time for repeated operations
**Chunking Strategy**
- Adjust chunk size based on your documents
- Use appropriate overlap for context continuity
- Consider semantic chunking for better results
**Index Selection**
- Flat index: Best for small datasets (<100K documents)
- IVF index: Good for medium datasets
- HNSW index: Best for large datasets
**Parameter Tuning**
- Use optimize_retrieval() for index parameters
- Adjust batch size for large document collections
- Configure context length based on your needs
A comprehensive Jupyter notebook (`rag_system.ipynb`) is included that demonstrates:
- System setup and configuration
- Document processing
- Querying and response generation
- System optimization
- Advanced usage examples
The system includes a Gradio-based web interface for easy interaction. To launch the interface:
```bash
python gradio_interface.py
```
The interface provides three main tabs:
**Upload Documents**
- Upload individual PDF/DOCX files
- Process entire directories of documents
- View processing status
**Configuration**
- Adjust chunk size and overlap
- Select index type (Flat, IVF, HNSW)
- Choose context strategy
- Set maximum context length
**Query**
- Enter questions
- Adjust number of contexts to retrieve
- View answers with source documents
- Toggle metadata display
The interface automatically manages document processing, embedding creation, and query handling through an intuitive UI.
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.