A flexible Retrieval-Augmented Generation (RAG) system built with Ollama, designed for efficient document processing and contextual query responses.
## Features

- 📄 Multi-format document support (PDF, DOCX, Markdown, TXT)
- 🔍 Advanced vector similarity search
- 🤖 Integration with Ollama's language models
- 📊 Built-in benchmarking capabilities
- 🎯 High-precision context retrieval
## Prerequisites

- Python 3.8+
- Ollama installed and running
- Required Python packages (specified in `requirements.txt`)
## Installation

```bash
# Clone the repository
git clone https://github.com/yourusername/custom-rag-ollama.git

# Navigate to the project directory
cd custom-rag-ollama

# Install dependencies
pip install -r requirements.txt
```
## Usage

### Document Upload

- Access the Document Upload section in the UI
- Supported file formats:
  - PDF
  - DOCX
  - Markdown
  - Plain text
- Documents are automatically processed and chunked for optimal retrieval, as sketched below
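The chunking strategy itself lives in the project code; as a rough illustration (the function name, chunk size, and overlap below are hypothetical, not the repository's actual implementation), a fixed-size splitter with overlap looks like this:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping chunks so context isn't lost at boundaries."""
    chunks = []
    step = chunk_size - overlap  # slide forward, keeping `overlap` chars of shared context
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():  # skip whitespace-only tails
            chunks.append(chunk)
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.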
### Querying

- Navigate to the Query tab
- Input your question
- Configure parameters:
  - Number of results to retrieve
  - Ollama model selection
- View results with:
  - Generated response
  - Source document metadata
  - Similarity scores
## Benchmarking

Run performance tests to evaluate:

- Embedding generation speed
- Search efficiency
- Model response quality
- Overall system performance
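The repository's benchmark suite is the authoritative tool; as a minimal sketch of the idea, any stage can be timed with the standard library (here `embed` and `search` are hypothetical stand-ins for the project's functions):

```python
import time

def timed(fn, *args, repeats=10):
    """Run `fn` several times and return its result plus mean seconds per call."""
    start = time.perf_counter()
    for _ in range(repeats):
        result = fn(*args)
    return result, (time.perf_counter() - start) / repeats

# Example usage (hypothetical functions):
# _, embed_secs = timed(embed, "sample query")
# _, search_secs = timed(search, query_vector)
# print(f"embedding: {embed_secs * 1000:.1f} ms, search: {search_secs * 1000:.1f} ms")
```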
## Architecture

- **Document Processing Pipeline**
  - Text extraction from multiple formats
  - Intelligent document chunking
  - Metadata preservation
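A minimal sketch of format-dispatched extraction, assuming the `pypdf` and `python-docx` packages (the repository may use different parsing libraries):

```python
from pathlib import Path

from docx import Document  # python-docx, assumed dependency
from pypdf import PdfReader  # assumed dependency

def extract_text(path):
    """Return plain text from a PDF, DOCX, Markdown, or TXT file."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        reader = PdfReader(path)
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    if suffix == ".docx":
        return "\n".join(p.text for p in Document(path).paragraphs)
    if suffix in {".md", ".txt"}:
        return Path(path).read_text(encoding="utf-8")
    raise ValueError(f"Unsupported format: {suffix}")
```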
- **Embedding Engine**
  - Vector embedding generation
  - Optimization for search performance
  - Model-agnostic design
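Because the design is model-agnostic, any Ollama embedding model can back the engine. A minimal sketch against Ollama's `/api/embeddings` endpoint (the model name is only an example; the project may call Ollama differently):

```python
import requests

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address

def embed(text, model="nomic-embed-text"):
    """Request a vector embedding for `text` from a local Ollama instance."""
    resp = requests.post(
        f"{OLLAMA_URL}/api/embeddings",
        json={"model": model, "prompt": text},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["embedding"]  # list of floats
```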
- **Vector Search System**
  - High-performance similarity matching
  - Configurable search parameters
  - Result ranking optimization
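Similarity matching here reduces to cosine similarity between the query vector and the stored chunk vectors. A brute-force NumPy sketch of top-k retrieval (a real deployment might swap in an optimized vector index):

```python
import numpy as np

def top_k(query_vec, doc_vecs, k=5):
    """Return (index, score) pairs for the k most similar document vectors."""
    query = np.asarray(query_vec, dtype=float)
    docs = np.asarray(doc_vecs, dtype=float)
    # Cosine similarity: dot products scaled by L2 norms (epsilon avoids division by zero).
    scores = docs @ query / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query) + 1e-10)
    best = np.argsort(scores)[::-1][:k]
    return [(int(i), float(scores[i])) for i in best]
```

The returned scores are what the UI surfaces as similarity scores, and `k` corresponds to the "number of results to retrieve" parameter.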
- **Response Generation**
  - Integration with Ollama LLMs
  - Context-aware response synthesis
  - Source attribution
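Tying the pieces together, generation places the retrieved chunks into the prompt and calls Ollama's `/api/generate` endpoint. A rough sketch (the prompt wording and model name are illustrative, not the project's exact prompt):

```python
import requests

def generate_answer(question, chunks, model="llama3"):
    """Ask an Ollama model to answer using only the retrieved context."""
    context = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(chunks, 1))
    prompt = (
        "Answer the question using only the context below, "
        "and cite sources by their [number].\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```

Numbering the chunks in the prompt is what makes source attribution possible in the generated answer.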
## Contributing

We welcome contributions! Please feel free to submit a Pull Request.
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- The Ollama team for their excellent LLM framework
- Contributors and community members
## Support

For questions and support, please open an issue in the GitHub repository.