Decode Integer Metadata #297
Comments
Hi @prgx-csmith01, would it be possible for you to redact everything from the PDF and then share it so that it can be added as a test to PR #298?
Hi @samkit-jain, I can't share the PDF, but we have created a test file for you with an example of the metadata issue. I hope this helps. Thanks!
h/t @prgx-csmith01 for providing the PDF
Many thanks, @prgx-csmith01. I have updated PR #298 with the test case.
As an aside: That integer value of the Copies entry is invalid. According to the specification:
(ISO 32000-1) ... and there is neither a Copies entry in table 317 nor any other entry with a numeric type, merely text strings, dates, and names. Thus, strictly speaking, this issue is not a bug (as currently labeled) but a request to support one more type of invalid PDF.
@mkl-public That's a good point, and thank you for raising it. I think your diagnosis is correct. I certainly don't want to slide down the slippery slope of trying to handle all malformed PDFs. In this case, however, @samkit-jain has PR'ed an efficient solution: it's a simple adjustment, and one that hopefully will accommodate a few other classes of invalid metadata entries in the future (without becoming a burden on the processing of valid PDFs).
Closed via #298; now available in
I have received this error message for a PDF file:
It seems that there is no handling for integer metadata in the init of pdf.py.
Previously, a similar bug (#67) was raised for boolean objects.
I cannot provide the PDF used that caused this error as it is client data. The metadata of the file contains { ... , "Copies" : 0 }.
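For anyone following along, here is a minimal sketch of the kind of handling being asked for. It is not the code from pdf.py or from PR #298; `decode_metadata_value` and the `raw_info` dictionary are hypothetical names made up for illustration, and the byte decoding is deliberately simplified (real PDF text strings may use PDFDocEncoding or UTF-16). The idea is only that values which are not byte strings, such as the integer in `"Copies": 0` or a boolean as in #67, pass through unchanged instead of raising an error.

```python
from typing import Any


def decode_metadata_value(value: Any) -> Any:
    """Hypothetical helper: make one raw metadata value safe to expose.

    This is an illustrative sketch, not pdfplumber's implementation.
    """
    # Byte strings are decoded to text. Trying UTF-8 and falling back to
    # Latin-1 is a simplifying assumption made for this example.
    if isinstance(value, bytes):
        try:
            return value.decode("utf-8")
        except UnicodeDecodeError:
            return value.decode("latin-1")
    # Containers are handled recursively so nested byte strings are decoded too.
    if isinstance(value, (list, tuple)):
        return [decode_metadata_value(item) for item in value]
    if isinstance(value, dict):
        return {key: decode_metadata_value(item) for key, item in value.items()}
    # Integers, booleans, floats, str, and None need no decoding, so they
    # are returned as-is rather than treated as an error.
    return value


# Made-up stand-in for a raw Info dictionary like the one described above.
raw_info = {"Producer": b"Example Producer", "Copies": 0, "Trapped": False}
metadata = {key: decode_metadata_value(val) for key, val in raw_info.items()}
print(metadata)
```

Falling through to `return value` for unknown types is the design choice that matters here: valid PDFs take the same path as before, while unexpected numeric entries no longer stop the file from opening.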