DOC: Expand file size explanations #1835
I included a header about removing sources
Any object can be used many times in the same document, so it is hard to decide up front that an object can be removed. For full pages, I would rather recommend not inserting them in the first place, by providing a list of page numbers or PageObjects that will not be included.
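A minimal sketch of the "do not insert them" approach, assuming pypdf's `PdfReader` / `PdfWriter` and zero-based page indices. The helper names `pages_to_keep` and `copy_without_pages` are made up for illustration and are not part of pypdf:

```python
from typing import Iterable, List


def pages_to_keep(total_pages: int, excluded: Iterable[int]) -> List[int]:
    """Return the zero-based page indices that should be copied."""
    skip = set(excluded)
    return [index for index in range(total_pages) if index not in skip]


def copy_without_pages(src: str, dst: str, excluded: Iterable[int]) -> None:
    """Write a copy of `src` to `dst`, leaving out the excluded pages."""
    # Imported lazily so that pages_to_keep() can be used without pypdf.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src)
    writer = PdfWriter()
    for index in pages_to_keep(len(reader.pages), excluded):
        # Only the pages that are added get written to the new file.
        writer.add_page(reader.pages[index])
    with open(dst, "wb") as fh:
        writer.write(fh)
```

For example, `copy_without_pages("in.pdf", "out.pdf", excluded=[0])` would write every page except the first (both file names are placeholders). Whether the output is actually smaller depends on whether the skipped pages shared resources with the pages that were kept.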
Some comments in the appropriate sections may be sufficient. The most important point is to state that cropping only makes the content outside the box invisible: it is still present in the file, which matters for file size and also for text extraction.
@pubpub-zz
@pub-zz Is this good to merge? (I don't have access)
You are talking about page deletion, which is not currently implemented. I'm preparing a PR to add that. This PR should wait until the new feature has been added.
Update: PR #1843 has been created and is pending merge.
Co-authored-by: pubpub-zz <4083478+pubpub-zz@users.noreply.github.com>
Co-authored-by: Martin Thoma <info@martin-thoma.de>
Thank you for the PR @DIvkov575 :-) I've adjusted some of the wording. By the way: https://chat.openai.com/ is pretty good at improving the language. It might be a good idea to let it improve more parts of the docs 🤔
New Features (ENH)
- Simplify metadata input (Document Information Dictionary) (#1851)
- Extend cmap compatibility to GBK_EUC_H/V (#1812)

Bug Fixes (BUG)
- Prevent infinite loop when no character follows after a comment (#1828)
- get_contents does not return ContentStream (#1847)
- Accept XYZ destination with zoom missing (default to zoom=0.0) (#1844)
- Cope with 1 Bit images (#1815)

Robustness (ROB)
- Handle missing /Type entry in Page tree (#1845)

Documentation (DOC)
- Expand file size explanations (#1835)
- Add comparison with pdfplumber (#1837)
- Clarify that PyPDF2 is dead (#1827)
- Add Hunter King as Contributor for #1806

Maintenance (MAINT)
- Refactor internal Encryption class (#1821)
- Add R parameter to generate_values (#1820)
- Make encryption_key parameter of write_to_stream optional (#1819)
- Prepare for adding AES encryption support (#1818)

Code Style (STY)
- Iterate directly over the list instead of using range (#1839)
- Minor refactorings in _encryption.py (#1822)

[Full Changelog](3.8.1...3.9.0)
Closes #1786
Changing the viewboxes ("cropping") has no impact on file size
Removing complete pages only has an impact if the connected resources are also removed
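One way to check the cropping claim on a file of your own is to write the same pages twice, once untouched and once with a smaller crop box, and compare the byte counts. This is only a sketch under assumptions: `written_size` and `compare_cropped_size` are illustrative helper names, the 50-point margin is arbitrary, and it relies on pypdf's `PageObject.cropbox` setter and `RectangleObject`:

```python
import io
from typing import Tuple


def written_size(writer) -> int:
    """Serialize `writer` into memory and return the number of bytes produced."""
    buffer = io.BytesIO()
    writer.write(buffer)
    return len(buffer.getvalue())


def compare_cropped_size(src: str, margin: float = 50.0) -> Tuple[int, int]:
    """Return (plain_size, cropped_size) for two in-memory copies of `src`."""
    # Imported lazily so that written_size() can be used without pypdf.
    from pypdf import PdfReader, PdfWriter
    from pypdf.generic import RectangleObject

    plain = PdfWriter()
    for page in PdfReader(src).pages:
        plain.add_page(page)

    cropped = PdfWriter()
    for page in PdfReader(src).pages:
        box = page.mediabox
        # Shrink only the visible area; the content streams are left untouched.
        page.cropbox = RectangleObject((
            float(box.left) + margin,
            float(box.bottom) + margin,
            float(box.right) - margin,
            float(box.top) - margin,
        ))
        cropped.add_page(page)

    return written_size(plain), written_size(cropped)
```

The two numbers should be nearly identical, differing at most by the few bytes needed to store the crop box entries, because the page content and its resources are still written in full.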