Seeing a PDF hang - no error - while loading #95
Hello @matthopson! I spent some time investigating this today. It turns out that the parser does eventually finish. However, it takes a very long time to do so - 1300 seconds (22 minutes), to be precise! This is caused by an interplay between this particular PDF and a flaw in […].
This is a very large PDF. And it also contains some strange objects. Specifically, several very large arrays filled almost entirely with null values (e.g. […]).
Anyways, I was able to fix this in #99. With these changes, the parser finishes in just 13 seconds (0.22 minutes) - 99% faster! I just cut prerelease […].
You can install this prerelease with npm:
It's also available on unpkg:
Please try it out and let me know if it works for you!
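For reference, the timing comparison above can be checked with a few lines of arithmetic. This is only a sketch: `compareTimings` is a hypothetical helper written for illustration, not part of the library, and the inputs are the two timings quoted in this thread.

```javascript
// Timings reported in this thread, in seconds.
const beforeSeconds = 1300;
const afterSeconds = 13;

// Hypothetical helper (not a library function): summarizes the change
// between two run times given in seconds.
function compareTimings(before, after) {
  return {
    // Share of the original run time that was eliminated, in percent.
    percentLessTime: ((before - after) * 100) / before,
    // How many times faster the new run is.
    speedupFactor: before / after,
    beforeMinutes: before / 60,
    afterMinutes: after / 60,
  };
}

const summary = compareTimings(beforeSeconds, afterSeconds);
console.log(summary.percentLessTime);          // 99
console.log(summary.speedupFactor);            // 100
console.log(summary.beforeMinutes.toFixed(1)); // "21.7"
console.log(summary.afterMinutes.toFixed(2));  // "0.22"
```

So "99% faster" here means the parse takes 99% less time, which is a 100x speedup.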
@matthopson I should also mention that when working with large PDF files like this, it's a good idea to disable object streams when saving them:
const pdfBytes = PDFDocumentWriter.saveToBytes(pdfDoc, { useObjectStreams: false });
Some PDF readers perform poorly when displaying large PDFs saved with object streams. Saving your documents this way helps avoid that issue.
Thanks Andrew! I can confirm that this fixes the long processing time for that PDF! I really appreciate your help with this, and all your work on this library in general.
Hi,
We're seeing an issue with one PDF in our testing. This was a randomly-downloaded PDF, and we've not seen this issue with any other PDF we've tested it against. We tried tracking the bug down in the parser, but had to move on. I wonder if you'd be able to offer any insight?
The PDF can be sourced here:
http://downloadcenter.samsung.com/content/UM/201903/20190326104351182/ENG_US_MUSATSCR-2.0.2.pdf
Some things of note:
In our case, it'd be helpful if we were at least getting an error kicked back - but it looks more like it's stuck in a loop.
Thanks for any info you can offer!
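For anyone who needs an error instead of an apparent hang, one possible workaround is to run the slow step in a separate Node process with a deadline. The sketch below is only an illustration under stated assumptions: `runWithDeadline` is a hypothetical helper, the script string stands in for whatever code loads the PDF, and it relies only on Node's built-in `child_process.spawnSync` and its `timeout` option.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper (not part of any PDF library): runs `script` in a
// separate Node process and throws if it does not finish within `timeoutMs`.
function runWithDeadline(script, timeoutMs) {
  const result = spawnSync(process.execPath, ['-e', script], {
    timeout: timeoutMs, // the child is killed once this many ms have elapsed
    encoding: 'utf8',
  });

  if (result.error) {
    // spawnSync reports a timeout as an error with code ETIMEDOUT.
    if (result.error.code === 'ETIMEDOUT') {
      throw new Error(`Timed out after ${timeoutMs} ms`);
    }
    throw result.error;
  }
  if (result.status !== 0) {
    throw new Error(`Child exited with status ${result.status}: ${result.stderr}`);
  }
  return result.stdout;
}

// Stand-in for a parse that never seems to finish.
try {
  runWithDeadline('while (true) {}', 500);
} catch (err) {
  console.error(err.message);
}
```

A separate process is used here because a long synchronous loop blocks the event loop, so a timer in the same process would not get a chance to fire until the loop ends.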