LocalRQA is a toolkit for building and running your own private "ChatGPT" that is an expert on your specific documents. It uses the Retrieval-Augmented Generation (RAG) technique to provide answers that are factually grounded in a knowledge base you provide, ensuring all processing is handled locally to maintain data privacy.
- Retrieval-Augmented Generation (RAG): Ensures answers are based on provided documents, reducing factual errors and hallucinations.
- 100% Local & Private: Your documents, questions, and the AI model's processing never leave your machine.
- GPU Accelerated: Designed to run on local NVIDIA GPUs for high-performance inference.
- Interactive UI: Comes with a Gradio-based web interface for easy interaction and demonstration.
The original installation process is not compatible with modern Windows environments. The following is a definitive guide that includes all necessary fixes.
Before you begin, ensure you have installed the following:
- Git for Windows: Download here
- Python 3.10 (64-bit): Download here.
- CRITICAL: During installation, you MUST check the box "Add Python 3.10 to PATH".
- Visual Studio Build Tools: Download here.
- CRITICAL: During installation, you MUST select the "Desktop development with C++" workload.
Open your terminal (Windows PowerShell or Command Prompt).
# Clone the project from GitHub
git clone https://github.com/jasonyux/LocalRQA.git
# Navigate into the project folder
cd LocalRQA
# Create a Python 3.10 virtual environment
py -3.10 -m venv rtx_env
# Activate the environment
.\rtx_env\Scripts\activate

You must make these changes before installing dependencies.
A. Edit setup.py:
- Open the setup.py file.
- Inside the install_requires=[ ... ] list, find and delete the entire lines for 'deepspeed', 'faiss-gpu', and 'flash_attn'.
- Save the file.
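For orientation only, the trimmed list might look roughly like the sketch below. Everything here apart from the three deleted names is a placeholder; the real setup.py lists the project's full dependency set and metadata.

```python
# Hypothetical sketch of setup.py after the edit -- the real file contains the
# project's full dependency list; only the three problematic packages are removed.
from setuptools import setup, find_packages

setup(
    name="local_rqa",                # placeholder metadata
    packages=find_packages(),
    install_requires=[
        "transformers",              # illustrative remaining entries
        "gradio",
        "langchain",
        # "deepspeed",   <- line deleted
        # "faiss-gpu",   <- line deleted
        # "flash_attn",  <- line deleted
    ],
)
```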
B. Check huggingface.py:
- Open the file local_rqa\qa_llms\huggingface.py.
- Ensure the line model = model.cuda() (around line 58) is active (it should NOT have a # in front of it).
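Because model = model.cuda() moves the model onto the GPU unconditionally, it will fail at launch time if PyTorch cannot see your card. Once PyTorch is installed in the step below, a quick sanity check like this (a minimal sketch, run inside the activated rtx_env) confirms the GPU is visible:

```python
import torch

# .cuda() in huggingface.py assumes a CUDA-capable device is visible to PyTorch.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```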
C. Create Runner Scripts:
- In the main LocalRQA folder, create the following three new files:

  run_controller.py:

      import subprocess
      import sys
      subprocess.run([sys.executable, 'local_rqa/serve/controller.py'] + sys.argv[1:])

  run_worker.py:

      import subprocess
      import sys
      subprocess.run([sys.executable, 'local_rqa/serve/model_worker.py'] + sys.argv[1:])

  run_web.py:

      import subprocess
      import sys
      subprocess.run([sys.executable, 'local_rqa/serve/gradio_web_server.py'] + sys.argv[1:])
D. Prepare the Database:
- Run these commands in your activated terminal:

# Create the folder the code expects
mkdir example\databricks\database_fixed
# Copy and rename the database file
copy example\databricks\database\databricks.pkl example\databricks\database_fixed\documents.pkl
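To confirm the copy worked, a quick check like the sketch below can be run from the repo root. It only assumes documents.pkl is an ordinary pickle file; if the pickle stores custom document classes, run it after the installation step below so those classes can be imported, and adjust the summary lines to whatever object type it actually contains.

```python
import pickle
from pathlib import Path

db_path = Path("example/databricks/database_fixed/documents.pkl")

# Load the copied database file and print a rough summary of its contents.
with db_path.open("rb") as f:
    docs = pickle.load(f)

print("Loaded object of type:", type(docs).__name__)
try:
    print("Number of items:", len(docs))
except TypeError:
    pass  # the pickled object may not be a sized container
```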
Run these commands one by one in your activated (rtx_env) terminal.
# 1. Install the GPU version of PyTorch for CUDA 11.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# 2. Install the stable CPU version of FAISS for Windows
pip install faiss-cpu
# 3. Install missing dependencies
pip install -U langchain-community uvicorn
# 4. Install the project itself and all other requirements
pip install -e .

After a successful installation, open three separate terminals. In each one, navigate to your LocalRQA folder and activate the environment (.\rtx_env\Scripts\activate).
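Before launching anything in those terminals, an optional quick check confirms the key packages installed above resolve inside rtx_env (a minimal sketch):

```python
# Post-install sanity check: these imports should all succeed inside the
# activated rtx_env if the steps above completed without errors.
import faiss
import torch
import uvicorn

print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("faiss", getattr(faiss, "__version__", "imported OK"))
print("uvicorn", uvicorn.__version__)
```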
- In Terminal 1 (Launch the Controller):
python run_controller.py
- In Terminal 2 (Launch the Model Worker): Note: The first time you run this, it will download the language model (approx. 2 GB). This command is tailored for GPUs with low VRAM (e.g., RTX 3050).
python run_worker.py --qa_model_name_or_path TinyLlama/TinyLlama-1.1B-Chat-v1.0 --database_path example/databricks/database_fixed --load_8bit
Wait for this to finish loading. It will say "Uvicorn running on..."
- In Terminal 3 (Launch the Web UI): Wait for Terminal 2 to be ready, then run this.
python run_web.py --example "What is LocalRQA?"
Finally, open your web browser and navigate to http://localhost:7860 to use the application.
The system uses a two-step "Open-Book Exam" process:
- Retrieval: When you ask a question, the system first searches its knowledge base (the documents.pkl file) to find the most relevant text chunks.
- Generation: It then gives your question and these retrieved chunks to the language model (TinyLlama), which generates an answer based only on the provided facts.
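As a toy illustration of that two-step flow (not LocalRQA's actual code: the keyword-overlap retriever and the prompt template below are simple stand-ins for the real FAISS vector search and the TinyLlama call):

```python
import re

# Toy illustration of retrieve-then-generate. LocalRQA's real pipeline uses a
# FAISS vector index for retrieval and a language model for generation.
knowledge_base = [
    "LocalRQA is a toolkit for building retrieval-augmented QA systems.",
    "Databricks is a unified analytics platform.",
    "The retriever returns the most relevant text chunks for a question.",
]

def tokenize(text: str) -> set[str]:
    # Lowercase word tokens; good enough for a toy example.
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by naive keyword overlap with the question.
    q_words = tokenize(question)
    return sorted(docs, key=lambda d: len(q_words & tokenize(d)), reverse=True)[:k]

def build_prompt(question: str, context: list[str]) -> str:
    # In the real system, a prompt like this is handed to the language model,
    # which must answer using only the retrieved chunks.
    lines = ["Answer using only the context below.", ""]
    lines += [f"- {chunk}" for chunk in context]
    lines += ["", f"Question: {question}", "Answer:"]
    return "\n".join(lines)

question = "What is LocalRQA?"
chunks = retrieve(question, knowledge_base)
print(build_prompt(question, chunks))
```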