Huge increase in PDF filesize since 1.3.12 #450
Comments
Thank you for reporting. These are the release notes for 1.3.12:
Can you please submit a PDF file generated with each of the two relevant versions of OpenPDF? It would help us understand the problem.
I have created two PDF files (with sensitive information removed) for comparison. Although the actual document has 198 pages, I uploaded only 20 pages for each, since the difference is already visible. File created with OpenPDF 1.2.17: ~1MB. When opened in a PDF viewer the files look the same.
The large PDF has more than 180000 indirect objects and all but at most 150 are extended graphics state dictionaries setting foreground or background opacity to 1. From the release notes the obvious candidate is
@mkl-public what tool do you use to see these indirect objects?
A standard text viewer (the Total Commander built-in one in my case but most text viewers should do).
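For anyone who would rather count than scroll through a text viewer, the same check can be roughly automated. The following is a minimal sketch, not part of OpenPDF: the class name `PdfObjectCounter` and its methods are hypothetical, and it assumes that the opacity entries in question are the `/CA` and `/ca` keys of the graphics state dictionaries. It scans the raw file text, so it only sees objects that are not packed into compressed object streams.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an OpenPDF class): counts indirect objects and
// opacity-1 entries by scanning the raw text of a PDF file.
public class PdfObjectCounter {

    // Header of an indirect object, e.g. "12 0 obj" ("endobj" does not match).
    private static final Pattern OBJ_HEADER =
            Pattern.compile("\\b\\d+\\s+\\d+\\s+obj\\b");

    // Assumption: opacity is written as /CA (stroking) or /ca (non-stroking)
    // with the value 1 or 1.0; values such as 0.5, 1.5 or 10 are not counted.
    private static final Pattern OPACITY_ONE =
            Pattern.compile("/(?:CA|ca)\\s+1(?:\\.0+)?(?![\\d.])");

    static int count(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        int n = 0;
        while (matcher.find()) {
            n++;
        }
        return n;
    }

    static int countIndirectObjects(String pdfText) {
        return count(OBJ_HEADER, pdfText);
    }

    static int countOpacityOneEntries(String pdfText) {
        return count(OPACITY_ONE, pdfText);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PdfObjectCounter <file.pdf>");
            return;
        }
        // ISO-8859-1 maps every byte to exactly one char, so binary stream
        // data inside the PDF cannot cause decoding errors.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        String text = new String(bytes, StandardCharsets.ISO_8859_1);
        System.out.println("indirect objects:  " + countIndirectObjects(text));
        System.out.println("opacity-1 entries: " + countOpacityOneEntries(text));
    }
}
```

Running it against the small and the large file should make the difference in object counts visible at a glance. Binary stream data can occasionally produce a false match, so treat the numbers as an estimate rather than an exact count.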
@renber do you think you could provide a simple Java program to reproduce these tables?
@renber could you give this branch build a try?
Or try the SNAPSHOT version :-) Already merged.
Thank you. File created with OpenPDF 1.3.24-SNAPSHOT (2020/11/09): ~7MB. I can prepare a sample program, but it will take some time.
@renber it would be really nice to have this sample because your tables are really complicated to reproduce.
I have created a sample repository with a program which replicates the table structure (as far as possible by stripping away everything unnecessary): https://github.com/renber/OpenPDF_Issue450
@renber
…, if not needed
@renber the new version from my branch should be good for you. If you can give it a try and let me know, that would be great.
@renber Please check the latest SNAPSHOT. Is the problem solved now?
It is a further improvement, thank you. The PDF from the sample repository is fine now. How should we proceed in debugging this?
Hi renber, have you tried setting the compression level of the PDF?
@arnthom: Unfortunately, neither
I have investigated this a bit further. When using OpenPDF 1.3.12 as it is, the resulting file is 150MB. By reverting only the changes from the two commits from PR #282, the file size drops to 10MB, as before. So there still seems to be something in this feature which does not play nice with our PDFs (and has not yet been caught by the changes in 5feac69 and a2f5e3b).
@renber is it possible for you to update your sample to be closer to the PDF generated in production? That way I'll be able to investigate further.
@sixdouglas: I created the sample by stripping away sensitive (and hard to port) stuff, so I am afraid I cannot provide a better one within a reasonable time. I made some changes to the OpenPDF code of 1.2.12 myself (based on the commit diff) and if I change the
@renber can you give a sample of your production file so that I can see how to go further on this subject?
@renber: I pushed a new commit on my branch,
@sixdouglas: That's it. I have run our pdf generation with the code from your
It really was a wild guess, but if it works for you, it's good for me! 😉
Describe the bug
After updating to the latest version we noticed that the PDFs we create using OpenPDF have massively increased in size
(e.g. a PDF with ~200 pages containing nested tables was 10MB and is now 150MB).
I have traced it back to changes in 1.3.12. If I use an OpenPDF version < 1.3.12, our software outputs a 10MB file; if I update OpenPDF to anything >= 1.3.12, the very same code produces a 150MB file.
To Reproduce
Currently I have no example, since we are running this in production. But it is solely dependent on changes in OpenPDF since reverting back to an old version results in smaller pdfs without changing any of the pdf generation code.
Maybe someone has an idea where this behavior might stem from?
Preparing an example is possible, but will take some time.