index-pdf

A short Python script to quickly index a HUGE PDF file based on a regex

pip3 install PyPDF2

apt install tk

Put search.py in the folder containing all the PDFs and run it
Enter a search query
Select the PDF files; you can select multiple PDFs
Click on search
The initial build of the index will be slow; once a document is indexed, searching is fast
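
Why the first search is slow and later ones are fast can be sketched roughly as below. This is not the code from search.py, only a minimal illustration under assumptions: the names build_index, load_or_build_index and extract_pages are hypothetical, the JSON on-disk format is a guess, and the default pattern is one possible reading of "First Letter Is Capital". The page texts would in practice come from a PDF text extractor such as PyPDF2; here they are passed in as plain strings so the sketch stays self-contained.

```python
import json
import os
import re

# Assumed default: words whose first letter is a capital (hypothetical pattern).
DEFAULT_PATTERN = r"\b[A-Z][a-z]+\b"


def build_index(pages, pattern=DEFAULT_PATTERN):
    """Map each regex match to the list of 1-based pages it appears on."""
    regex = re.compile(pattern)
    index = {}
    for page_number, text in enumerate(pages, start=1):
        # Collect each distinct match once per page.
        terms = {m.group(0) for m in regex.finditer(text)}
        for term in terms:
            index.setdefault(term, []).append(page_number)
    return index


def load_or_build_index(pdf_path, extract_pages, pattern=DEFAULT_PATTERN):
    """Reuse <pdf_path>.index if it exists; otherwise build it and save it.

    extract_pages is a caller-supplied function (hypothetical) that takes
    the PDF path and returns a list of page texts.
    """
    index_path = pdf_path + ".index"
    if os.path.exists(index_path):
        # Fast path: the slow text extraction is skipped entirely.
        with open(index_path, encoding="utf-8") as fh:
            return json.load(fh)
    # Slow path: extract every page once, then cache the result.
    index = build_index(extract_pages(pdf_path), pattern)
    with open(index_path, "w", encoding="utf-8") as fh:
        json.dump(index, fh)
    return index
```

Passing a different pattern, for example r"\b[A-Z]{2,}\b" for all-caps acronyms, changes what ends up in the index.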

If documents have been modified, you will need to delete the corresponding <document_name>.pdf.index files so that the index is rebuilt
Change the regex pattern according to your needs. By default, the search is for sentence_case (First Letter Is Capital)
I do not intend to maintain this project, as it is just a quick script I wrote to solve a common annoyance. However, feel free to provide feedback/feature requests
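
On the note about modified documents: deleting the .pdf.index files by hand could be automated by comparing modification times. The helper names below (index_is_stale, remove_stale_index) are hypothetical and not part of search.py; this is only a sketch, assuming the index file sits next to the PDF as <document_name>.pdf.index.

```python
import os


def index_is_stale(pdf_path):
    """Return True if <pdf_path>.index exists and is older than the PDF."""
    index_path = pdf_path + ".index"
    if not os.path.exists(index_path):
        return False  # nothing cached, so nothing to invalidate
    return os.path.getmtime(index_path) < os.path.getmtime(pdf_path)


def remove_stale_index(pdf_path):
    """Delete the index file if the PDF changed after it was written."""
    if index_is_stale(pdf_path):
        os.remove(pdf_path + ".index")
        return True
    return False
```

Calling remove_stale_index("report.pdf") before a search would force a rebuild only when report.pdf is newer than its index.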
