This project provides an interactive research assistant designed to enhance your document comprehension and navigation. The assistant allows users to query multiple PDFs for specific answers, retrieving relevant document sections. It includes an intuitive interface for viewing and navigating to specific pages of PDFs or PowerPoint files, offering targeted previews for improved understanding.
Ensure you have all required dependencies installed. You can do this by running:
pip install -r requirements.txt
- Place the vector_store directory (for storing vectorized document representations) and the nlp_data directory (containing your PDFs) in the project's root directory.
- Ensure all necessary files are available in these directories.
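Before starting the services, it can help to confirm that both directories are in place. The following is a minimal sketch, not part of the project: the directory names come from the step above, while the `missing_dirs` helper and the `REQUIRED_DIRS` constant are hypothetical names introduced for illustration. It assumes the script is run from the project's root directory.

```python
from pathlib import Path

# Directory names are taken from the setup step above.
# REQUIRED_DIRS and missing_dirs are hypothetical, not part of the project.
REQUIRED_DIRS = ("vector_store", "nlp_data")


def missing_dirs(root=".", required=REQUIRED_DIRS):
    """Return the required directory names that do not exist under root."""
    base = Path(root)
    return [name for name in required if not (base / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All required directories are present.")
```

The check only verifies that the directories exist; it does not inspect their contents, since the expected files inside them are not specified here.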
- Start the Flask API:
  python backend.py
  This will launch the API on localhost:5003.
- Run the Streamlit application:
  streamlit run app.py
  The interface will be available at localhost:8501 on your local machine.
- Optionally, explore the demo.py notebook for additional functionality and demonstrations of the project's capabilities.
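Once both processes have been started, a quick way to confirm they are accepting connections is to probe the two ports. This is a sketch under stated assumptions: the ports 5003 and 8501 come from the steps above, the `is_listening` helper and `SERVICES` mapping are hypothetical names, and the check only tests that a TCP connection succeeds. It does not call any endpoint of the API, since none are documented here.

```python
import socket

# Ports are taken from the steps above.
# SERVICES and is_listening are hypothetical, not part of the project.
SERVICES = {"Flask API": 5003, "Streamlit app": 8501}


def is_listening(port, host="localhost", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if is_listening(port) else "not reachable"
        print(f"{name} (localhost:{port}): {state}")
```

If the Flask API is reported as not reachable, the Streamlit interface will likely load but fail when it tries to query the backend, so it is worth starting backend.py first.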
- Create an evaluation dataset using the code provided in prepare_ragas_set.ipynb.
- Run the code in run_ragas.py.
- The evaluation results will be saved in mini_result.txt.