PDF passes validation with type 3 fonts that aren't embedded #1458

Open
Corvwyn opened this issue Jun 13, 2024 · 10 comments

Comments

@Corvwyn

Corvwyn commented Jun 13, 2024

I'm converting some PDF files to PDF/A-1b. After the conversion we use veraPDF to verify that these are valid PDF/A-1b.

In one instance we have a PDF file that veraPDF reports as valid PDF/A-1b. This file contains two fonts, T3Font_0 and T3Font_1, that aren't embedded.

Is this the correct behaviour, or is there something I'm missing? Is there something special about type 3 fonts that doesn't require them to be embedded?

I can provide an example PDF if needed; I just need to ask if it's OK to share first.

@THausherr

THausherr commented Jun 13, 2024

Type 3 fonts are not really fonts; each one is a collection of PDF content streams, one per glyph, so a tool reporting them as not embedded might be a misunderstanding.
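
To make the distinction concrete, here is a minimal sketch that models font dictionaries as plain Python dicts. The key names (`Subtype`, `CharProcs`, `FontDescriptor`, `FontFile`, `FontFile2`, `FontFile3`) are the ones the PDF specification uses; the helper `glyphs_travel_with_pdf` and the sample dictionaries are hypothetical, and this is not how veraPDF or any particular library implements the check. It also ignores composite (Type 0) fonts, where the descriptor sits on the descendant font.

```python
# Hypothetical, simplified model: a PDF font dictionary as a plain dict.
FONT_FILE_KEYS = ("FontFile", "FontFile2", "FontFile3")


def glyphs_travel_with_pdf(font: dict) -> bool:
    """Return True if the glyph descriptions are stored inside the PDF."""
    if font.get("Subtype") == "Type3":
        # Type 3: every glyph is a content stream listed in /CharProcs,
        # so there is no separate font program to embed.
        return bool(font.get("CharProcs"))
    # Other simple fonts: an embedded font program is referenced from
    # the font descriptor through one of the FontFile* keys.
    descriptor = font.get("FontDescriptor") or {}
    return any(key in descriptor for key in FONT_FILE_KEYS)


# Sample data, invented for illustration.
type3_font = {
    "Subtype": "Type3",
    "Name": "T3Font_0",
    "CharProcs": {"a": b"1000 0 0 0 750 750 d1 0 0 750 750 re f"},
}
unembedded_truetype = {
    "Subtype": "TrueType",
    "BaseFont": "Arial",
    "FontDescriptor": {"FontName": "Arial"},
}

print(glyphs_travel_with_pdf(type3_font))           # True
print(glyphs_travel_with_pdf(unembedded_truetype))  # False
```

A tool that only looks for `FontFile*` entries would flag the Type 3 font as unembedded even though its glyphs are fully contained in the file, which may be what the viewer and the concatenation library are doing here.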

@Corvwyn
Author

Corvwyn commented Jun 13, 2024

@THausherr Thanks for the info. In that case, it makes sense that veraPDF validates the PDF this way.

Adobe Acrobat lists them as type 3 fonts that aren't embedded.

The main problem is that the library we use to concatenate these files sees them as unembedded fonts. I guess we might have to create a support ticket so that they handle Type 3 fonts differently.

Thanks for the quick reply!

@bdoubrov
Contributor

@Corvwyn feel free to upload the files to this issue. I would double-check whether these fonts are indeed Type 3 ones. Font names can sometimes be misleading.

@Corvwyn
Author

Corvwyn commented Jun 13, 2024

@THausherr Great. I will upload the pdf soon, I just need to ask if it's ok.

@Corvwyn
Author

Corvwyn commented Jun 13, 2024

@THausherr Here you go. pdfa1b_with_type3fonts.pdf

@THausherr

Yeah, there are Type 3 fonts on page 60, and it's like I described.

Btw, I found a different problem. veraPDF claims it is a PDF/A file; PDFBox Preflight claims it isn't. The reason is that the file has /SMask (None), with "None" written as a string instead of as a name. One of us is wrong 😂
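
For readers unfamiliar with the syntax: `/None` is a PDF name object, while `(None)` is a literal string that merely contains the same four letters. The sketch below tells the two spellings apart in a raw dictionary snippet. It is a rough illustration under simplifying assumptions (uncompressed source, no comments, the key appears once); `smask_value_kind` is a hypothetical helper, not part of veraPDF or PDFBox.

```python
import re

# "/SMask" as a whole key, followed by optional whitespace.
_SMASK_KEY = re.compile(rb"/SMask(?![A-Za-z0-9])\s*")


def smask_value_kind(extgstate_source: bytes) -> str:
    """Classify the token that follows /SMask in a raw ExtGState snippet."""
    match = _SMASK_KEY.search(extgstate_source)
    if match is None:
        return "missing"
    rest = extgstate_source[match.end():]
    if rest.startswith(b"/"):
        return "name"      # e.g. /SMask /None
    if rest.startswith(b"("):
        return "string"    # e.g. /SMask (None)
    return "other"         # dictionary, indirect reference, hex string, ...


print(smask_value_kind(b"<< /Type /ExtGState /SMask /None >>"))   # name
print(smask_value_kind(b"<< /Type /ExtGState /SMask (None) >>"))  # string
```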

@Corvwyn
Author

Corvwyn commented Jun 13, 2024

Hmm. What a predicament 😛

@petervwyatt

I will also highlight PDF Errata 118 and PDF Errata 6 - words such as "absent" or "present" in all PDF specs are ambiguous and very likely not what is desired. This may be why...

@bdoubrov
Contributor

Well, /SMask (None) is a violation of ISO 32000-1, and the Arlington PDF checker finds this as well as a number of other deviations. So, strictly speaking, the behaviour of a PDF/A validator is undefined for this file.

Currently veraPDF accepts both /None and (None) as permitted values of the /SMask entry in the ExtGState dictionary. A more correct behaviour would be to report the violation of ISO 32000-1 in the logs (which can optionally be included in the report) and ignore this entry. The result of PDF/A-1b validation would not change though: the /SMask entry would be treated as not present and would thus comply with the PDF/A-1b requirements.
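
The "log and ignore" behaviour described above can be sketched as a small decision function. This is an illustration only: `SMaskCheck`, `check_smask` and the `kind` labels are hypothetical names, the message texts are invented, and the code assumes the PDF/A-1 requirement that an /SMask entry, if it is taken into account, must have the value /None. It does not reflect veraPDF's actual code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SMaskCheck:
    compliant: bool                      # outcome of the PDF/A-1b rule
    log: List[str] = field(default_factory=list)  # base-spec notes


def check_smask(kind: str, value: Optional[str] = None) -> SMaskCheck:
    """kind is one of 'missing', 'name', 'string', 'dictionary' (assumed labels)."""
    if kind == "missing":
        return SMaskCheck(True)
    if kind == "string":
        # Wrong object type for ISO 32000-1: note it, then ignore the entry,
        # which leaves the PDF/A-1b result unchanged.
        note = f"/SMask has string value ({value}); expected a name or dictionary, entry ignored"
        return SMaskCheck(True, [note])
    if kind == "name" and value == "None":
        return SMaskCheck(True)
    # Any other value means a soft mask is actually in use.
    return SMaskCheck(False)


result = check_smask("string", "None")
print(result.compliant, result.log)
```

With this policy the file from this issue would still pass PDF/A-1b, but the log would carry a note about the malformed entry.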

@bdoubrov
Contributor

bdoubrov commented Aug 2, 2024

We've implemented additional object type checks, so that /SMask (None) is reported as a validation error. This fix is available in the latest dev builds.
