PDF Summarizer

PDF Summarizer is a command-line tool for working with PDF files: summarizing highlighted text, splitting, merging, and converting pages to images. This README explains how to install and run the tool, and describes what each feature does and how it is implemented.

Installation

Clone the repository:

```
git clone https://github.com/Programming-Sai/PDF-Summarizer.git
```

Navigate to the project directory:
```
cd PDF-Summarizer
```

Create a virtual environment and activate it:
```
python -m venv .ospdf-venv
```
On Windows:
```
.ospdf-venv\Scripts\activate
```
On macOS/Linux:
```
source .ospdf-venv/bin/activate
```

Important

If you use VS Code, select the new virtual environment .ospdf-venv as your interpreter. Press Ctrl + Shift + P (Windows/Linux) or Cmd + Shift + P (Mac), then type and select "Python: Select Interpreter". Choose the interpreter marked Recommended or Python 3.x.x ('.ospdf-venv':venv).

Install the required dependencies:
```
pip install -r requirements.txt
```

Run the application:

```
python -u main.py
```

When the application starts, you should see output similar to this:


```
 _____                             _____
( ___ )---------------------------( ___ )
 |   |                             |   |
 |   |                      _  __  |   |
 |   |   ___  ___ _ __   __| |/ _| |   |
 |   |  / _ \/ __| '_ \ / _` | |_  |   |
 |   | | (_) \__ \ |_) | (_| |  _| |   |
 |   |  \___/|___/ .__/ \__,_|_|   |   |
 |   |           |_|               |   |
 |___|                             |___|
(_____)---------------------------(_____)

Welcome to PDF Summarizer!

Version: 0.0.1

PDF Summarizer helps you manage and work with PDFs. Here are some of the things you can do:
- Summarize PDF content based on highlighted text.
- Split a PDF into individual pages or ranges.
- Merge multiple PDFs into one.
- Convert a PDF page into an image.

Tips
------
- Use `init` to set the input file once and avoid specifying it repeatedly.
- Reset your session with `init -r` to start fresh.
- Use `-h` or `--help` when in doubt.
```

Functionalities

1. Summarize Highlighted Text

Description: Extract and summarize highlighted text from a PDF file.

Implementation:
- Parses the PDF for annotations.
- Extracts the highlighted content.
- Optionally includes images from the PDF in the output.

What it does:
- Produces a summary as plain text, a PDF, or a Word document.

Usage:

```
python main.py summarize --input-path <path_to_pdf> --output-path <output_path>
```
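The extraction steps above can be sketched without committing to a particular PDF library. The following is a minimal illustration, not the tool's actual code: it assumes the words on a page and the highlight annotations have already been read as rectangles in the same coordinate system, and it keeps each word that is mostly covered by a highlight. `Rect`, `highlighted_text`, and the `min_cover` threshold are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle from (x0, y0) to (x1, y1) in page coordinates."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def area(self) -> float:
        return (self.x1 - self.x0) * (self.y1 - self.y0)

    def overlap_area(self, other: "Rect") -> float:
        # A negative width or height means the rectangles do not overlap.
        width = min(self.x1, other.x1) - max(self.x0, other.x0)
        height = min(self.y1, other.y1) - max(self.y0, other.y0)
        return max(width, 0.0) * max(height, 0.0)


def highlighted_text(words, highlights, min_cover=0.5):
    """Join the words covered by a highlight for at least `min_cover` of their area.

    `words` is a list of (Rect, str) pairs in reading order (an assumption
    of this sketch); `highlights` is a list of Rect objects.
    """
    picked = []
    for box, text in words:
        if box.area <= 0:
            continue  # skip degenerate boxes to avoid dividing by zero
        if any(box.overlap_area(h) / box.area >= min_cover for h in highlights):
            picked.append(text)
    return " ".join(picked)
```

Using a coverage threshold rather than any overlap avoids picking up neighbouring words that a highlight only grazes.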

2. Split PDF

Description: Extract specific pages or ranges of pages from a PDF.

Implementation:
- Uses a PDF parser to split the document based on page indices.
- Saves the extracted pages as a new PDF.

What it does:
- Enables breaking large PDFs into smaller, more manageable files.

Usage:

```
python main.py split <path_to_pdf> <output_pdf> --start-page <start> --end-page <end>
```
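This README does not say how `--start-page` and `--end-page` are interpreted. Assuming they are 1-based and inclusive, the mapping to the zero-based page indices that PDF libraries typically use might look like the sketch below. `page_indices` is a hypothetical helper, not a function from this repository.

```python
def page_indices(start_page: int, end_page: int, total_pages: int) -> range:
    """Map a 1-based inclusive page range to 0-based page indices.

    Assumes 1-based, inclusive bounds; the tool itself may differ.
    """
    if total_pages < 1:
        raise ValueError("the document has no pages")
    if not 1 <= start_page <= end_page <= total_pages:
        raise ValueError(
            f"invalid range {start_page}-{end_page} "
            f"for a {total_pages}-page document"
        )
    # range() excludes its stop value, so end_page needs no adjustment.
    return range(start_page - 1, end_page)
```

Validating the range before touching the file gives a clear error message instead of a failure partway through writing the output.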

3. Merge PDFs

Description: Combine multiple PDF files into one.

Implementation:
- Reads the input PDFs.
- Concatenates their pages in the specified order.
- Outputs a single, merged PDF.

What it does:
- Consolidates multiple related documents into a single file.

Usage:

```
python main.py merge <output_pdf> <input_pdf_1> <input_pdf_2> ...
```
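The concatenation order described above can be made explicit with a small sketch. It assumes each input is known by its name and page count, and it only computes the order of pages in the merged file; reading and writing the PDFs is left to whichever PDF library is used. `merge_plan` is a hypothetical name used for illustration.

```python
def merge_plan(inputs):
    """Return (source_name, page_index) pairs in the order they are written.

    `inputs` is a list of (name, page_count) pairs, in the same order as
    the input files were given on the command line.
    """
    plan = []
    for name, page_count in inputs:
        # All pages of one input are emitted before the next input starts.
        plan.extend((name, index) for index in range(page_count))
    return plan
```

For example, merging a two-page `a.pdf` followed by a one-page `b.pdf` yields the pages of `a.pdf` first, then the single page of `b.pdf`.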

4. Convert PDF to Image

Description: Convert a single page of a PDF into an image.

Implementation:
- Extracts the specified page from the PDF.
- Renders the page as an image.
- Saves the image in the desired format (e.g., PNG, JPEG).

What it does:
- Enables visual representation of PDF content for use in presentations or web pages.

Usage:

```
python main.py pdf2img <path_to_pdf> <output_image> <page_number>
```
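One plausible way to decide the image format is to infer it from the extension of `<output_image>`. The sketch below assumes that behavior, which this README does not confirm. `FORMATS` and `image_format` are hypothetical names, and the mapping covers only the two formats mentioned above.

```python
from pathlib import Path

# Hypothetical mapping from file extension to image format name.
FORMATS = {".png": "PNG", ".jpg": "JPEG", ".jpeg": "JPEG"}


def image_format(output_image: str) -> str:
    """Infer the image format from the output file's extension."""
    suffix = Path(output_image).suffix.lower()  # case-insensitive match
    if suffix not in FORMATS:
        raise ValueError(f"unsupported image extension: {suffix!r}")
    return FORMATS[suffix]
```

Rejecting unknown extensions up front means the page is never rendered only to fail at the save step.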

Tips

- Initialization: Use the init command to set a default PDF file for your session, so you do not have to specify the file for each operation.
- Help: Add -h or --help to any command for detailed usage instructions.
- Reset: Start fresh by resetting the session with the init -r command.
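To illustrate how a session default like the one set by init could work, here is a minimal sketch that stores the chosen input path in a small JSON file and falls back to it when a command is given no explicit input. How the tool actually persists its session is not documented here, so the file format and the names `init_session` and `resolve_input` are assumptions.

```python
import json
from pathlib import Path
from typing import Optional


def init_session(session_file: Path, input_path: Optional[str] = None,
                 reset: bool = False) -> None:
    """Store a default input path, or clear it when `reset` is true."""
    if reset:
        session_file.unlink(missing_ok=True)  # no error if already absent
        return
    session_file.write_text(json.dumps({"input_path": input_path}))


def resolve_input(session_file: Path, cli_input: Optional[str] = None) -> str:
    """Prefer an explicit path, then fall back to the session default."""
    if cli_input:
        return cli_input
    if session_file.exists():
        saved = json.loads(session_file.read_text()).get("input_path")
        if saved:
            return saved
    raise ValueError("no input file: pass one explicitly or run init first")
```

Letting an explicit argument override the stored default keeps one-off operations on other files possible without resetting the session.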

Troubleshooting

Ensure you have Python 3.10+ installed.

Verify that the dependencies are installed correctly using:
```
pip list
```
If a command fails, check its help menu (-h or --help) for the correct syntax.
