Error rendering generated PDF #5255

Closed
chrert opened this issue Sep 2, 2014 · 9 comments

@chrert

chrert commented Sep 2, 2014

We generate PDF documents using the iText Java library in a web application. Unfortunately, the generated documents can't be viewed in Firefox using the pdf.js viewer. All other tested browsers and document viewers (Internet Explorer, Chrome, Evince, Adobe Reader) are able to render the files correctly.

I've also tried to render the files using the hello world example from the website, but the result is the same as in Firefox:

"Error: Invalid XRef stream header" pdf.worker.js:216
"XRef_readXRef@resource://pdf.js/build/pdf.worker.js:4613:13
XRef_parse@resource://pdf.js/build/pdf.worker.js:4207:9
PDFDocument_setup@resource://pdf.js/build/pdf.worker.js:3457:7
PDFDocument_parse@resource://pdf.js/build/pdf.worker.js:3337:7
LocalPdfManager_ensure/<@resource://pdf.js/build/pdf.worker.js:2909:11
LocalPdfManager_ensure@resource://pdf.js/build/pdf.worker.js:2904:5
BasePdfManager_ensureDoc@resource://pdf.js/build/pdf.worker.js:2840:7
loadDocument/</<@resource://pdf.js/build/pdf.worker.js:37462:42
" pdf.worker.js:218
"Warning: Unsupported feature "unknown"" pdf.worker.js:201
"Warning: Unsupported feature "unknown"" pdf.js:201
"Warning: Indexing all PDF objects" pdf.worker.js:201
"Error: catalog object is not a dictionary" pdf.worker.js:216
"assert@resource://pdf.js/build/pdf.worker.js:233:5
Catalog@resource://pdf.js/build/pdf.worker.js:3793:1
PDFDocument_setup@resource://pdf.js/build/pdf.worker.js:3458:7
PDFDocument_parse@resource://pdf.js/build/pdf.worker.js:3337:7
LocalPdfManager_ensure/<@resource://pdf.js/build/pdf.worker.js:2909:11
LocalPdfManager_ensure@resource://pdf.js/build/pdf.worker.js:2904:5
BasePdfManager_ensureDoc@resource://pdf.js/build/pdf.worker.js:2840:7
loadDocument/</<@resource://pdf.js/build/pdf.worker.js:37462:42
" pdf.worker.js:218
"Warning: Unsupported feature "unknown"" pdf.worker.js:201
"Warning: Unsupported feature "unknown"" pdf.js:201
"An error occurred while loading the PDF. PDF.js v1.0.277 (build: 250d394) Message: catalog object is not a dictionary"

The document can be found at: https://pdf.yt/d/IMCg2hxrZs_rOPyN

@yurydelendik
Contributor

Looks like the iText library generated a PDF that is invalid from the point of view of the PDF32000 standard. Marking it as a corrupted PDF, since the file trailer points to an incorrect xref.

[Screenshot attached: 2014-09-02, 8:11:50 AM]

PDF32000:

7.5.5 File Trailer
The trailer of a PDF file enables a conforming reader to quickly find the
cross-reference table and certain special objects. Conforming readers
should read a PDF file from its end. The last line of the file shall contain 
only the end-of-file marker, %%EOF. The two preceding lines shall contain, 
one per line and in order, the keyword startxref and the byte offset in 
the decoded stream from the beginning of the file to the beginning of 
the xref keyword in the last cross-reference section.
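
To make the quoted rule concrete, here is a minimal Python sketch (not pdf.js code, and not a full PDF parser) of the check a reader can do: read the `startxref` offset from the end of the file and see whether that offset lands on an `xref` keyword or on an object header, as a cross-reference stream would. The function names `read_startxref` and `xref_target_looks_valid` are made up for illustration, and the 1024-byte tail window is an assumption.

```python
import re


def read_startxref(data: bytes) -> int:
    """Return the byte offset written after the last `startxref` keyword."""
    tail = data[-1024:]  # assumption: the trailer fits in the last 1 KiB
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", tail)
    if not matches:
        raise ValueError("no startxref/%%EOF pair found near end of file")
    return int(matches[-1])


def xref_target_looks_valid(data: bytes) -> bool:
    """Check that the startxref offset points at something xref-like."""
    offset = read_startxref(data)
    if offset >= len(data):
        return False
    chunk = data[offset:offset + 32]
    # A classic table starts with the keyword `xref`; a cross-reference
    # stream starts with an indirect object header such as `12 0 obj`.
    if chunk.startswith(b"xref"):
        return True
    return re.match(rb"\d+\s+\d+\s+obj", chunk) is not None
```

When several PDFs are written into one stream, the last `startxref` offset is still relative to the start of the last PDF, so in the combined file it usually points at unrelated bytes and a check like this fails.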

@yurydelendik
Contributor

Looks like there are two PDFs just glued together. I split the file into two PDFs -- they both rendered perfectly. @erti are you sure your Java application is not trying to save two PDFs into a single stream?

@yurydelendik
Contributor

@erti Sorry, there are three of them. Is it one PDF per database entry?
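
For anyone who wants to check their own files for this, the splitting described in the comments above can be roughly sketched in Python. This is a heuristic, not what pdf.js does: it cuts the data at every `%PDF-x.y` header, so a header marker that happens to sit inside an uncompressed embedded stream would give a false split. The name `split_glued_pdfs` is hypothetical.

```python
import re
from typing import List

# Every PDF file starts with a header such as `%PDF-1.4`.
HEADER = re.compile(rb"%PDF-\d\.\d")


def split_glued_pdfs(data: bytes) -> List[bytes]:
    """Split a byte string at each `%PDF-x.y` header (heuristic only)."""
    starts = [m.start() for m in HEADER.finditer(data)]
    if not starts:
        return []
    # Each part runs from its header to the next header (or end of data).
    ends = starts[1:] + [len(data)]
    return [data[s:e] for s, e in zip(starts, ends)]
```

If the function returns more than one part for a file that is supposed to be a single document, the writer very likely saved several complete PDFs into the same stream.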

@chrert
Author

chrert commented Sep 2, 2014

You're right, it's a document merged from 3 other documents using iText.
However, we were able to track down the part of our code that's causing the corruption, so it's definitely not a pdf.js bug!

Sorry for the inconvenience!

@yurydelendik
Contributor

It would be nice to know how widespread this issue is before we address it by recovering the data. Is this a standard iText way to create combined PDF documents, and is there a particular example used to teach how to combine documents?

@chrert
Author

chrert commented Sep 2, 2014

The concatenation of the PDF files is done as described here and works as expected!

The corruption of the PDF is caused by our code that renders the watermarks, footers and headers on the PDF pages. It seems like we are doing something wrong here, but we haven't fully evaluated the issue yet!

If you are interested in the final fix, I could publish an explanation as soon as we have figured out how to do it the right way. However, I don't think that it is related to pdf.js ...

@CodingFabian
Contributor

I think the point here is not how the document is incorrectly generated using iText (or any other generator); the question is whether an effort should be made to behave like the other viewers and display something (possibly only the first PDF inside that blob).

@timvandermeij
Contributor

I'm closing this for now as the PDF was clearly badly generated and the problem has been resolved (not PDF.js related). If we see more of such documents, we can attempt to address it, but it doesn't seem relevant for now.

@timvandermeij
Contributor

Fixed by #5910 which makes even these corrupted PDFs at least show the first PDF, just like Adobe Reader.
