RAG-ready

RAG-ready is a pipeline that handles PDF preprocessing, chunking, vectorisation, and storage in Pinecone. Go from documents to a RAG-ready Pinecone index in one go. Whether you are building search applications, chatbots, or recommendation systems, RAG-ready simplifies the process.

Requirements

  • Python 3.6+
  • .env file with the following keys:
    • JINA_API_KEY
    • PINECONE_API_KEY
    • PINECONE_INDEX_NAME

Installation

  1. Clone the repository:

    git clone https://github.com/FazlOmar9/RAG-ready.git
    cd RAG-ready
  2. Create a virtual environment:

    python -m venv .venv
  3. Activate the virtual environment:

    • On Windows:

      .venv\Scripts\activate
    • On macOS/Linux:

      source .venv/bin/activate
  4. Install the required packages:

    pip install -r requirements.txt
  5. Create a .env file in the root directory with the following content:

    JINA_API_KEY=your_jina_api_key
    PINECONE_API_KEY=your_pinecone_api_key
    PINECONE_INDEX_NAME=your_pinecone_index_name
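
Before running the pipeline, it can help to confirm that the .env file actually contains all three keys. The following is a minimal sketch using only the standard library; the helper names `parse_env_file` and `missing_keys` are hypothetical and are not part of this repository, and the project itself may load the file differently (for example with a dotenv library).

```python
from pathlib import Path

# The three keys listed under Requirements.
REQUIRED_KEYS = ("JINA_API_KEY", "PINECONE_API_KEY", "PINECONE_INDEX_NAME")


def parse_env_file(path):
    """Parse simple KEY=value lines, skipping blank lines and # comments.

    Hypothetical helper: handles only the plain format shown above,
    not multi-line values or variable expansion.
    """
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty, in listed order."""
    return [key for key in required if not values.get(key)]
```

Calling `missing_keys(parse_env_file(".env"))` from the repository root returns an empty list when every key is set, and otherwise the names of the keys that still need a value.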

Usage

  1. Place your PDF file in the documents directory.

  2. Run the main script with the path to your PDF file as an argument:

    python main.py documents/your_pdf_file.pdf

This will extract text from the PDF, segment it, generate embeddings, and upload the vectors to Pinecone.
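
To make the segment and upload stages more concrete, here is an illustrative sketch of what happens between text extraction and the vector upload. This README does not describe how main.py segments text or shapes its records, so everything here is an assumption: a plain fixed-size character chunker stands in for the real segmentation step, and `chunk_text`, `batched`, `build_records`, and the record layout are hypothetical names chosen for the example. The embedding and Pinecone calls are deliberately left out.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character windows.

    Stand-in for the segmentation step; the real pipeline may
    segment by sentences, tokens, or an external service instead.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():  # skip windows that are only whitespace
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


def build_records(chunks, source):
    """Give each chunk a deterministic ID plus metadata.

    Hypothetical record shape; embedding values would be attached
    to each record before any upload.
    """
    return [
        {"id": f"{source}-{i}", "metadata": {"source": source, "text": chunk}}
        for i, chunk in enumerate(chunks)
    ]
```

Overlapping windows keep some shared context between neighbouring chunks, and deterministic IDs mean that re-running the pipeline on the same file would overwrite existing vectors rather than duplicate them, assuming the upload uses upsert semantics.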
