Missing "Size" info in the trailer #1901

Closed
talcher opened this issue Jun 21, 2023 · 3 comments · Fixed by #1911

Comments

@talcher
Contributor

talcher commented Jun 21, 2023

Hi,

I use WeasyPrint to create PDFs.
I recently upgraded my libraries to WeasyPrint 59.0 and pypdf 3.10.0.

My aim is to get the trailer of a PDF I created with WeasyPrint. In this trailer, I need the "Size" field, among others.

Previously, I got something like:

{'/Size': 14, '/Root': IndirectObject(3, 0, 140016423632752), '/Info': IndirectObject(2, 0, 140016423632752)}

Now, I get something like:

{'/Root': IndirectObject(3, 0, 140470815616880), '/Info': IndirectObject(2, 0, 140470815616880)}

If I force WeasyPrint to create a PDF with version 1.4, I get the trailer I expect.
However, this is not what I want: I would rather not have to force version 1.4.
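
A likely explanation for the version dependence (an assumption, not confirmed in this thread) is that PDF 1.5 introduced cross-reference streams, where entries such as "Size" live in the stream dictionary instead of a classic "trailer" dictionary. The sketch below uses only the standard library to tell which kind of cross-reference section a file ends with. The helper names `xref_section_kind` and `make_sample` are hypothetical, and the sample bytes are synthetic fragments, not complete PDFs.

```python
import re


def xref_section_kind(pdf_bytes: bytes) -> str:
    """Return 'table', 'stream', or 'unknown' for the newest xref section."""
    # The last "startxref" keyword is followed by the byte offset
    # of the most recent cross-reference section.
    pos = pdf_bytes.rfind(b"startxref")
    if pos == -1:
        return "unknown"
    match = re.match(rb"startxref\s+(\d+)", pdf_bytes[pos:])
    if match is None:
        return "unknown"
    offset = int(match.group(1))
    section = pdf_bytes[offset:offset + 32]
    if section.startswith(b"xref"):
        # Classic table, followed by a separate "trailer" dictionary.
        return "table"
    if re.match(rb"\d+\s+\d+\s+obj", section):
        # An indirect object at that offset: a cross-reference stream.
        return "stream"
    return "unknown"


def make_sample(header: bytes, xref_section: bytes) -> bytes:
    """Build a synthetic fragment whose startxref points just after the header."""
    return header + xref_section + b"startxref\n%d\n%%%%EOF\n" % len(header)


classic = make_sample(
    b"%PDF-1.4\n",
    b"xref\n0 1\n0000000000 65535 f \ntrailer\n<< /Size 1 >>\n",
)
modern = make_sample(
    b"%PDF-1.7\n",
    b"5 0 obj\n<< /Type /XRef /Size 6 /W [1 2 1] /Length 0 >>\n"
    b"stream\n\nendstream\nendobj\n",
)

print(xref_section_kind(classic))
print(xref_section_kind(modern))
```

On a real file, the same check would be `xref_section_kind(open("example.pdf", "rb").read())`. Hybrid files that combine both kinds of section are not handled by this sketch.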

Environment

  • OS: Debian 11
  • Python: 3.9.2
  • WeasyPrint: 59.0
  • pypdf: 3.10.0

Code + PDF

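from pypdf import PdfReader

# Open the PDF generated by WeasyPrint and fetch its trailer dictionary
reader = PdfReader("example.pdf")
trailer = reader.trailer.get_object()
print(trailer)  # "/Size" is expected among the keys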

Here are two PDF samples:

Suggestion

I have had a look at the pypdf source code.
Correct me if I am wrong, but I think my issue is related to the "_read_xref_tables_and_trailers" function of the reader class, where the trailer keys are defined as follows: "trailer_keys = TK.ROOT, TK.ENCRYPT, TK.INFO, TK.ID"
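
To illustrate the suspected cause, here is a simplified model of a key whitelist. It is not pypdf's actual code: plain dicts stand in for PDF objects, and `copy_trailer_keys` is a hypothetical helper. It only shows how a key that is absent from the whitelist never reaches the resulting trailer.

```python
# Simplified model: plain strings and dicts replace pypdf's objects.
KEYS_WITHOUT_SIZE = ("/Root", "/Encrypt", "/Info", "/ID")
KEYS_WITH_SIZE = KEYS_WITHOUT_SIZE + ("/Size",)


def copy_trailer_keys(xref_stream_dict, allowed_keys):
    """Copy only the whitelisted keys into a new trailer dict (hypothetical helper)."""
    trailer = {}
    for key in allowed_keys:
        if key in xref_stream_dict:
            trailer[key] = xref_stream_dict[key]
    return trailer


# Made-up dictionary of a cross-reference stream, loosely based on the report.
xref_stream_dict = {
    "/Type": "/XRef",
    "/Size": 14,
    "/Root": "3 0 R",
    "/Info": "2 0 R",
    "/W": [1, 2, 1],
}

before = copy_trailer_keys(xref_stream_dict, KEYS_WITHOUT_SIZE)
after = copy_trailer_keys(xref_stream_dict, KEYS_WITH_SIZE)

print(before)  # no "/Size" entry
print(after)   # "/Size" is now copied
```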

If you need any more information, I would be glad to provide it.
Thank you for your help.

Alcher

@pubpub-zz
Collaborator

Your analysis looks correct and I agree that TK.SIZE could be added to the list: the value is mandatory and, as part of the list, it will not be overwritten by old trailers.
Can you propose a PR and add a test to check that the field is correctly extracted?
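
The point about old trailers can be sketched as follows, assuming sections are processed newest first and a key is only set when it is not already present. This is a simplified model with plain dicts. `build_trailer` and the sample values are hypothetical, not taken from pypdf.

```python
ALLOWED_KEYS = ("/Root", "/Encrypt", "/Info", "/ID", "/Size")


def build_trailer(sections, allowed_keys=ALLOWED_KEYS):
    """Merge trailer dictionaries given newest first; the first value seen wins."""
    trailer = {}
    for section in sections:
        for key in allowed_keys:
            if key in section and key not in trailer:
                trailer[key] = section[key]
    return trailer


# Made-up example of a file with one incremental update.
newest = {"/Size": 20, "/Root": "3 0 R", "/Prev": 1234}
oldest = {"/Size": 14, "/Root": "3 0 R", "/Info": "2 0 R"}

trailer = build_trailer([newest, oldest])
print(trailer)
```

The newest "/Size" is kept, while "/Info", which exists only in the older section, is still picked up.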

@talcher
Contributor Author

talcher commented Jun 23, 2023

Hello,
I just proposed a PR. Automatic checks have been successful.
Is that all you need? Sorry, I am kind of a noob with GitHub workflows...

@MartinThoma
Member

Thank you for your input @talcher 🙏 Yes, this is all we need for the moment :-)
