Skip to content

A Chainlit-powered multimodal PDF chatbot utilizing the ColQwen2-v1.0 visual retriever and Qwen2-VL-2B-Instruct model for efficient document retrieval and question answering.

Notifications You must be signed in to change notification settings

ahsannawazch/Multimodal-RAG

Repository files navigation

📚 Multimodal RAG App

Overview

This app leverages the power of Multimodal Retrieval-Augmented Generation (RAG) to help you find and understand information from PDFs and documents that contain images, charts, tables, and graphs. Using the ColQwen retriever from the Byaldi library, this app can efficiently index and search through your documents.

Features

  • 📄 PDF Upload: Upload your PDF documents directly to the app.
  • 🔍 Efficient Search: Perform searches within the document using advanced RAG models.
  • 🖼️ Image Handling: Extract and display images, charts, tables, and graphs from your documents.
  • 🤖 AI-Powered: Utilize state-of-the-art models for conditional generation and retrieval.

Requirements

Before you begin, ensure you have the following installed:

  • Python 3.10 or higher
  • Poppler (used for PDF processing)

Installing Poppler

For Linux (Ubuntu)

sudo apt-get install -y poppler-utils

For macOS

brew install poppler

Installation

  1. Clone the repository:
git clone https://github.com/yourusername/your-repo-name.git
cd your-repo-name
  1. Install the required Python packages:
pip install -r requirements.txt

Usage

  1. Run the app:
chainlit run app.py
  1. Upload a PDF: When prompted, upload your PDF file to begin indexing and searching.

  2. Ask Questions: Once the PDF is uploaded and indexed, you can ask questions about the content, and the app will retrieve and display relevant information, including images and text.

Future Enhancements

  • 📸 Screenshots: We will add screenshots of the app in action soon!

Enjoy using the Multimodal RAG App! 🚀

About

A Chainlit-powered multimodal PDF chatbot utilizing the ColQwen2-v1.0 visual retriever and Qwen2-VL-2B-Instruct model for efficient document retrieval and question answering.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published