This file tracks possible future expansions. Listing a feature here is not a promise that it will exist; the file serves as an early outline of features this project hopes to add, as well as possible changes in direction.
The format is adapted from Keep a Changelog. It won't adhere perfectly, but it's a start.
Dates follow the YYYY-MM-DD format.
“What do you think?” he demanded impetuously.
“About what?”
He waved his hand toward the book-shelves.
“About that. As a matter of fact you needn’t bother to ascertain. I ascertained. They’re real.”
“The books?”
He nodded.
“Absolutely real — have pages and everything. I thought they’d be a nice durable cardboard. Matter of fact, they’re absolutely real. Pages and — Here! Lemme show you.”
Taking our scepticism for granted, he rushed to the bookcases and returned with Volume One of the “Stoddard Lectures.”
“See!” he cried triumphantly. “It’s a bona-fide piece of printed matter. It fooled me. This fella’s a regular Belasco. It’s a triumph. What thoroughness! What realism! Knew when to stop, too — didn’t cut the pages. But what do you want? What do you expect?”
-- Owl Eyes in The Great Gatsby by F. Scott Fitzgerald
In Progress.
Keith Murray
email: kmurrayis@gmail.com | twitter: @keithTheEE | github: CrakeNotSnowman
Unless otherwise noted, all changes by @kmurrayis
Convert the PDF into images
Use Pytesseract to get the text
Use OpenCV to get the format type?
- Add a flag to distinguish a PDF generated directly by LaTeX from a PDF generated by LaTeX plus artifacts, where the artifacts are caused by actions such as scanning a paper copy of the PDF.
- Because the 'convert' function is vulnerable to remote code execution, and because this tool is meant to be paired with a browser extension, I need to find another method to render the PDF as an image.
- Workaround: pdftoppm.
- Consider exporting equations from LaTeX to Python via SymPy.
- Also consider rendering equations as PNG using the same function (useful when the build target is an older e-reader like the touch).
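
A minimal sketch of the render-then-OCR steps above, assuming the pdftoppm workaround is used in place of 'convert'. The function names (`build_pdftoppm_command`, `render_pdf_pages`, `ocr_pages`), the `page` output prefix, and the 300 DPI default are illustrative assumptions, not part of this project. It also assumes `pdftoppm` (from Poppler) is on the PATH and that `pytesseract` and Pillow are installed for the OCR step.

```python
import subprocess
from pathlib import Path


def build_pdftoppm_command(pdf_path, out_prefix, dpi=300):
    """Build the pdftoppm argument list (hypothetical helper).

    -png selects PNG output and -r sets the resolution in DPI.
    """
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(out_prefix)]


def render_pdf_pages(pdf_path, out_dir, dpi=300):
    """Render each PDF page to a PNG in out_dir and return the sorted paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    prefix = out_dir / "page"
    # An argument list (no shell) keeps the file name away from shell parsing,
    # which matters if the path ever comes from a browser extension.
    subprocess.run(build_pdftoppm_command(pdf_path, prefix, dpi), check=True)
    # pdftoppm writes files named <prefix>-<page number>.png
    return sorted(out_dir.glob("page-*.png"))


def ocr_pages(image_paths):
    """Run Tesseract on each page image and return one string per page."""
    # Imported lazily so the rendering step works without the OCR dependencies.
    import pytesseract
    from PIL import Image

    texts = []
    for path in image_paths:
        with Image.open(path) as img:
            texts.append(pytesseract.image_to_string(img))
    return texts
```

Under those assumptions, something like `ocr_pages(render_pdf_pages("paper.pdf", "pages"))` would return the recognized text page by page. Keeping the command construction in its own function makes it easy to inspect or change the flags without touching the subprocess call.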