-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ROB: Ignore EOF in NumberObject.read_from_stream #1505
Conversation
Sadly I do not have a reproducing PDF that I can add to the testing corpus. This came up when exercising |
Codecov ReportBase: 91.84% // Head: 92.03% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #1505 +/- ##
==========================================
+ Coverage 91.84% 92.03% +0.19%
==========================================
Files 33 32 -1
Lines 6202 5976 -226
Branches 1228 1163 -65
==========================================
- Hits 5696 5500 -196
+ Misses 326 312 -14
+ Partials 180 164 -16 Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
7ec8040
to
b4cb2fa
Compare
Use `ignore_eof=True` just like `NameObject` does, which is the only other caller to `read_until_regex` in this module. This helps prevent arcane exceptions when trying to parse a number: ``` PyPDF2.errors.PdfStreamError: Stream has ended unexpectedly ``` The motivation is essentially identical to the change that introduced `ignore_eof=True` on `NameObject`s: py-pdf@431ba70
b1d7aa7
to
c93f755
Compare
Hi there, I've rebased the branch off the latest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now all usages of read_until_regex
override the default ignore_eof
value of False
, right? If we're going this route, should probably then flip the default value to True
and then remove using the third argument in the two places...
This was initially motivated by `NumberObject.read_from_stream`, which was calling `read_until_regex` with the default value of `ignore_eof=False` and thus raising exceptions like: ``` PyPDF2.errors.PdfStreamError: Stream has ended unexpectedly ``` py-pdf@431ba70 demonstrates a similar fix for `NameObject.read_from_stream`. From discussion in py-pdf#1505, it was realized that the change to `NumberObject.read_from_stream` had now made ALL callers of `read_until_regex` pass `ignore_eof=True`. It's cleaner to remove the parameter entirely and change the default behaviour.
I've opened a separate PR along these lines: #1521 Feel free to pick one and close out the other. I'm not familiar enough with this codebase to judge which is "better". |
This was initially motivated by `NumberObject.read_from_stream`, which was calling `read_until_regex` with the default value of `ignore_eof=False` and thus raising exceptions like: ``` PyPDF2.errors.PdfStreamError: Stream has ended unexpectedly ``` py-pdf@431ba70 demonstrates a similar fix for `NameObject.read_from_stream`. From discussion in py-pdf#1505, it was realized that the change to `NumberObject.read_from_stream` had now made ALL callers of `read_until_regex` pass `ignore_eof=True`. It's cleaner to remove the parameter entirely and change the default behaviour.
I'm closing this one as the parameter was now removed with #1521 :-) |
Use
ignore_eof=True
just likeNameObject
does, which is the only other caller toread_until_regex
in this module.This helps prevent arcane exceptions when trying to parse a number:
The motivation is essentially identical to the change that introduced
ignore_eof=True
onNameObject
s:431ba70