invalid output PDF for input with incomplete CIDsets embedded in PDXObjects #659
Comments
Hello, unfortunately I have also encountered validation errors in the PDF part with embedded CID fonts. These errors occur when the PDF document contains certain special characters, for example ÓŁĄŚ–. Here is my process: I convert the PDF to PDF/A using Ghostscript. According to the Vera Online tool, the resulting file is still valid at this point. However, after attaching the XML with Mustang, the file is no longer valid. Mustang responds with multiple errors for the PDF part, such as:
If the code adjustment proposed by mr-mister123 works, I would be very happy to see it included in a new version. Thanks and greetings
I don't think my patch will fix your issue. With my patch, Mustang removes incomplete CIDSets not only from the root document but also from embedded XObjects. Your problem, however, doesn't seem to be related to incomplete CIDSets but to missing records in the Unicode mapping of the embedded fonts. Since you are using Ghostscript, I assume that the source PDF you pass to the Mustang library is of type PDF/A-1b. Mustang changes the file type to PDF/A-3b, as this is required when embedding files into a PDF. This is done basically by just changing the header of the PDF file. Can you provide sample files?
Hi,
we create PDFs in the following way: We create invoice PDFs (PDF/A-1b) using JasperReports. After that we combine them with a background-layer PDF that contains the company logo and other information such as an imprint. This background PDF is also a valid PDF/A-1b file.
Both files can be verified successfully using veraPDF. We combine them using the org.apache.pdfbox.multipdf.Overlay class of PDFBox. The result can also be verified successfully.
When adding the XML to build a valid ZUGFeRD PDF using Mustang, the output is no longer valid. It has to do with the CIDSet, as mentioned in issue #249.
The problem is that the Overlay class combines the two PDFs by wrapping the layer in a PDXObject. The patch supplied for #249 only scans for fonts in the root dictionary.
Fonts embedded in PDXObjects are not scanned.
Replacing ZUGFeRDExporterFromA3.removeCidSet(PDDocument) with these two functions will fix the problem:
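The two functions from the patch are not shown here. Purely as an illustration of the idea, and not the patch itself, the recursive scan could look roughly like the sketch below. It is a toy model that assumes PDF dictionaries are represented as nested `java.util.Map` objects keyed by their PDF names (`Font`, `XObject`, `Resources`, `FontDescriptor`, `CIDSet`); it does not use the PDFBox API, the class name `CidSetModel` is hypothetical, and it ignores that real Type 0 fonts keep the descriptor in their descendant font.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

/** Toy model (hypothetical, not PDFBox): PDF dictionaries are nested maps. */
class CidSetModel {

    /** Removes "CIDSet" from every reachable font descriptor; returns how many were removed. */
    static int removeCidSets(Map<String, Object> resources) {
        // Identity-based set so that shared or cyclic resource dictionaries are visited once.
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        return removeCidSets(resources, visited);
    }

    private static int removeCidSets(Map<String, Object> resources, Set<Object> visited) {
        if (resources == null || !visited.add(resources)) {
            return 0;
        }
        int removed = 0;

        // Step 1: fonts declared directly in this resource dictionary.
        Map<String, Object> fonts = asMap(resources.get("Font"));
        if (fonts != null) {
            for (Object font : fonts.values()) {
                Map<String, Object> fontDict = asMap(font);
                Map<String, Object> descriptor =
                        fontDict == null ? null : asMap(fontDict.get("FontDescriptor"));
                if (descriptor != null && descriptor.remove("CIDSet") != null) {
                    removed++;
                }
            }
        }

        // Step 2: descend into the resources of each XObject (the part a root-only scan skips).
        Map<String, Object> xobjects = asMap(resources.get("XObject"));
        if (xobjects != null) {
            for (Object xobject : xobjects.values()) {
                Map<String, Object> form = asMap(xobject);
                if (form != null) {
                    removed += removeCidSets(asMap(form.get("Resources")), visited);
                }
            }
        }
        return removed;
    }

    /** Returns the value as a dictionary, or null if it is not one. */
    @SuppressWarnings("unchecked")
    private static Map<String, Object> asMap(Object value) {
        return value instanceof Map ? (Map<String, Object>) value : null;
    }
}
```

In a real implementation the same two steps would presumably be applied to each page's resources, with the XObject step recursing into form XObjects that carry their own resources.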
It would be nice if you could integrate this patch into mustang-project.
Thanks and greetz,
Karsten