Add PDF import support for books (Issue #93) #119
I took the PR for the epub support as a template. I used PyPDF2 as the library; the license should be fine: https://pypi.org/project/PyPDF2/
Hi @dgc08 -- this failed one of the acceptance tests, can you take a look?
Failed acceptance test.
The acceptance test should work now. The sample PDF file contained an extra page number, which also got imported. I didn't run the test beforehand, sorry.
No problem, that’s what the tests are there for. Thanks for adding the test,
it’s a big help for stability.
On Mon, Jan 8, 2024 at 8:13 PM, Sinthoras39 <***@***.***> wrote:
> The acceptance test should work now. The sample PDF file contained an
> extra page number, which also got imported. Didn't run the test beforehand,
> sorry.
kinda screwed up my fork, wait a moment before merging
ok, it's fine now
One last change requested: what needs to be put into the pyproject.toml
file (in project root)?
(I don't have CI checking this, couldn't sort out how to do it well.)
I hope that's it. Thank you for your patience with me. Lute is the first open source project, and the first larger project in general, that I have contributed to, so pytest and that stuff is still new to me.
It’s a super contribution, so thank you. I’ll try it out soon with a bigger
PDF to see how it works, and will review the code again.
For a first PR it’s great! 👍
On Mon, Jan 8, 2024 at 11:03 PM, Sinthoras39 <***@***.***> wrote:
> I hope that's it. Thank you for your patience with me, Lute is the first
> Open Source project / larger project in general that i contribute to,
> pytest and that stuff is still new to me
Hi @dgc08 -- I did an import of a large file and found a few places where the import adds some spaces, e.g. vs the original PDF: This happens fairly frequently. It's almost certainly due to the PDF parser library ... should check.
Yes, it's the library. There's a good write-up by the author(s) in these links:
Per the authors:
So, I think this should be fine as it is, but we should mention somewhere that PDF imports are extremely tricky -- I'll draft that before merging this PR, as users should be aware of the limitations.
Implement issue #93
I took the PR for the epub support as a template and changed it accordingly.
I used PyPDF2 as the library; the license should be fine: https://pypi.org/project/PyPDF2/
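For readers unfamiliar with the library, text extraction with PyPDF2 looks roughly like the sketch below. This is not the code from this PR: `pages_to_text` and `read_pdf` are hypothetical names, and the sketch assumes a PyPDF2 release (2.x or later) that provides `PdfReader`, its `.pages` sequence and `page.extract_text()`.

```python
from typing import Iterable, Protocol


class PageLike(Protocol):
    """Anything with an extract_text() method, e.g. a PyPDF2 page object."""

    def extract_text(self) -> str: ...


def pages_to_text(pages: Iterable[PageLike]) -> str:
    """Join the extracted text of each page, separated by a blank line."""
    parts = []
    for page in pages:
        # extract_text() may return an empty string for image-only pages.
        text = (page.extract_text() or "").strip()
        if text:
            parts.append(text)
    return "\n\n".join(parts)


def read_pdf(path: str) -> str:
    """Hypothetical helper: extract all text from the PDF at `path`."""
    # Imported lazily so pages_to_text can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader

    return pages_to_text(PdfReader(path).pages)
```

The extracted text is whatever the parser reconstructs from the PDF's internal layout, so artifacts such as stray page numbers or extra spaces, as discussed in this thread, can end up in the imported text.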