Consider a standardized way to detect PDF in HTML? #3462

travisleithead · 2018-02-08T23:04:24Z

After running across a site interoperability issue in Edge, we'd like to raise the possibility of a standard (or suggested) way for surfacing a UA's support of PDFs. Would love both opinions and proposals.

Today, among Chrome, Edge, and Firefox, Chrome and Edge both add an entry (or entries) into the navigator.plugins and navigator.mimeTypes collections such as:

{type: "application/pdf", suffixes: "pdf", description: ""}
{type: "application/x-google-chrome-pdf", suffixes: "pdf", description: "Portable Document Format"}
{type: "application/pdf", suffixes: "pdf", description: "Edge PDF Viewer"}

E.g., the site honda.it tries to detect a variety of potential strings here: var p=["Chrome PDF Viewer","WebKit built-in PDF","Adobe Acrobat"];

Sites are starting to take an interest in these. Perhaps we should come up with a vendor-neutral way of advertising support for PDF?
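
For illustration, a name-based check of the kind honda.it appears to use might look roughly like the sketch below. The helper name `hasNamedPdfPlugin` is hypothetical, and exact-match comparison is an assumption, since the site's actual matching logic isn't shown here. The function takes a navigator-like object so it can be exercised outside a browser.

```javascript
// Plugin names copied from the honda.it snippet quoted above.
const KNOWN_PDF_PLUGIN_NAMES = ["Chrome PDF Viewer", "WebKit built-in PDF", "Adobe Acrobat"];

// Hypothetical helper: true if any listed plugin's name is in the known list.
// Assumes exact name matching; a real site may match substrings instead.
function hasNamedPdfPlugin(nav) {
  // navigator.plugins is array-like, so Array.from can copy it.
  const plugins = Array.from((nav && nav.plugins) || []);
  return plugins.some((plugin) => KNOWN_PDF_PLUGIN_NAMES.includes(plugin.name));
}
```

In a page this would be called as `hasNamedPdfPlugin(navigator)`. A UA whose viewer uses any other name fails the check even though it can display PDFs, which is the interoperability problem described above.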


domenic · 2018-02-08T23:14:40Z

This makes sense to me; although navigator.plugins/mimeTypes are kind of weird implementation-defined voodoo-land, they are web-observable, and to the extent people are depending on them we should probably try to lock them down. Especially if that can be useful for web authors.

My Safari Tech Preview install also has similar entries, including both application/pdf and text/pdf.

It seems like one way to detect in a cross-browser way today is

const supportsPDF = "application/pdf" in navigator.mimeTypes;

Do you think we should codify this in the standard somehow, beyond what is already there in the spec for navigator.plugins/navigator.mimeTypes? What's there seems pretty good to me, but perhaps you had something specific in mind?
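
A slightly more defensive version of that one-liner, as a sketch only: the wrapper name `supportsPdfViaMimeTypes` is hypothetical, and it takes a navigator-like object so it also runs where `navigator.mimeTypes` is missing. It only reports whether the UA lists the type; it says nothing about UAs that render PDFs without listing it.

```javascript
// Hypothetical wrapper around the "application/pdf" in navigator.mimeTypes check.
function supportsPdfViaMimeTypes(nav) {
  const mimeTypes = nav && nav.mimeTypes;
  // The `in` operator throws on non-objects, so guard first.
  if (mimeTypes === null || typeof mimeTypes !== "object") {
    return false;
  }
  return "application/pdf" in mimeTypes;
}
```

In a page: `const supportsPDF = supportsPdfViaMimeTypes(navigator);`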

travisleithead · 2018-02-08T23:24:05Z

That's a nice feature detect, and probably good enough for a casual check. What I don't know is whether there is some deeper behavior that sites are somehow trying to take advantage of with their more complex checks, e.g., looking for specific name or description values.

If there really isn't any web-facing feature differentiation among the various PDF implementations, then perhaps suggesting a "typical" generic PDF entry in the plugins collection would help encourage use of the simple feature detect you note above.

domenic · 2018-02-08T23:26:50Z

Hmm, I see. My guess would be that sites weren't aware you could write such simple feature detection code, but I'm not sure.

Do you think we should address this at the authoring guidance level, e.g. give advice on how to feature detect PDF? Or at the browser level, by e.g. giving advice that all browsers include one of the three strings you listed in your example in the OP for the description of their application/pdf entry?

travisleithead · 2018-02-08T23:47:08Z

Both? Authoring and browser level? I suppose authoring guidance is a no-brainer. Wondering how anyone else would feel about more prescriptive browser-level changes to align on one way of presenting PDF support via the navigator's plugin collection? @bzbarsky @annevk @RByers @rniwa

bzbarsky · 2018-02-09T02:14:03Z

I think it would be fine to include "application/pdf" if it's supported via the internal viewer. Exactly where and how it's included would need to be specced, of course. The inconsistency with other internally-supported types (HTML, images, JSON, various XML types, etc) is a bit annoying, of course. Is PDF the only type we need to worry about here?

Firefox used to report all types supported via internal viewer or plug-in or helper app in navigator.mimeTypes until we changed to our current behavior in https://bugzilla.mozilla.org/show_bug.cgi?id=1144204, so it's possible the more-complicated detects are trying to work around that....

domenic · 2018-02-09T03:40:04Z

I'd be interested in a concrete proposal for something browser-level. E.g. we could specify just application/pdf, but everyone already does that, and that wouldn't suffice for the honda.it check you note in the OP. Is making that check pass (and thus requiring a string like "Chrome" or "WebKit" or "Adobe" in the spec) an explicit goal? If not, what would you be looking for on top of what's already specified?

bzbarsky · 2018-02-09T03:42:24Z

> but everyone already does that

Firefox doesn't, fwiw.

domenic · 2018-02-09T03:44:56Z

... I could swear I tested .... at least on my work computer ... but on my home computer both mimeTypes and plugins are empty, indeed. I'm prepared to say I was misremembering my work computer's results.

That does make things more interesting.

travisleithead · 2018-02-09T17:41:09Z

> If not, what would you be looking for on top of what's already specified?

I looked through here, but didn't see anything specific to PDF...

I'd love to clear out our plugins/mimeTypes collections like Firefox has done :-) Maybe we could add a simple feature-detect property in the spirit of legacy navigator.javaEnabled() or navigator.cookieEnabled?

domenic · 2018-02-09T18:01:37Z

Right, nothing specific to PDF, but unless a browser believes it doesn't support application/pdf, or specifically desires not to expose its support for it, the spec says you should include it. We could call it out as a specific case that's been found important for web compat, to be sure, and after realizing Firefox doesn't have anything there, that seems like an especially good idea.

With regard to further work, I'd again like to clarify our goals. I've seen three in this thread so far:

1. Make existing PDF feature detection, like the honda.it example from the OP, work in all browsers.
2. Make it easy to feature-detect PDF, without necessarily supporting every way people try to do so on the web today.
3. Make it easy to feature-detect PDF, but also allow (and maybe eventually mandate?) browsers to have an empty plugins/mimeTypes collection.

These each have rather different solutions in my mind. I'll hide them behind a <details> in the hopes of getting folks to answer the what-problem-we're-trying-to-solve question first, unbiased by the solution :).

My solutions for 1, 2, and 3

1. Add one or more of the Adobe/WebKit/Chrome strings to the spec, saying that the value of the description property for application/pdf should be one of these. (There's precedent for this kind of ickiness in various navigator properties.)
2. Add advice to the spec that websites really do expect you to advertise support for application/pdf, so you should include that (looking at you Firefox). And add advice that authors should feature-detect PDF via "application/pdf" in navigator.mimeTypes.
3. Create a new PDF-specific property, like navigator.pdfEnabled, and try to move websites to that as much as possible via evangelism/etc.

bzbarsky · 2018-02-09T18:35:48Z

> the spec says you should include it

No, it doesn't. The spec says:

The term plugin refers to a user-agent defined set of content handlers used by the user agent that can take part in the user agent's rendering of a Document object, but that neither act as child browsing contexts of the Document nor introduce any Node objects to the Document's DOM.

And it explicitly says that navigator.mimeTypes is a MimeTypeArray and that:

A MimeTypeArray object represents the MIME types explicitly supported by plugins supported by the user agent, each of which is represented by a MimeType object.

The built-in PDF renderer in Firefox does in fact act as a child browsing context and does in fact introduce Node objects to the Document's DOM. So per a not-between-the-lines reading of the spec, it is not a "plugin" and must not be reflected in navigator.mimeTypes...

Now you could argue that the spec doesn't say what it means, but Firefox does implement exactly what the spec says (and in fact we changed our implementation to not expose application/pdf to comply with the spec here).

domenic · 2018-02-09T19:24:15Z

Ah OK, I didn't realize that's how Firefox implemented it, although it makes sense now that I think about how pdf.js works. Interesting.

It seems that given this, the current spec mechanism doesn't serve well for feature detection of "can the browser view this MIME type". We could either introduce a new mechanism for that, or we could revise the spec to change the meaning of navigator.mimeTypes/plugins, on the grounds that the revised definition is more useful.

Zirro · 2018-02-10T14:06:26Z

Before exploring solutions involving a new API, I would like to gain a better understanding of the problem to be solved first.

Which use cases require a website to detect support for PDF in the browser?
How do websites behave differently if the browser does not advertise support for PDF?
How does it affect user agents where the PDF is expected to be downloaded and handled by an external reader?

Adding a formal API to detect PDF support would increase the user's fingerprint. Before adding it there ought to be a strong argument in favour of websites behaving differently depending on whether the browser includes PDF support, versus making it impossible to detect through mimeTypes/plugins.

travisleithead · 2018-02-12T16:40:52Z

Regarding use cases, I can think of two off the top of my head:

1. The site wants to display a PDF document in an iframe (contextually) in their site.
2. The site wants to ensure that if a user clicks on a PDF link, the user has a reader installed to view it (lots of sites put up an ad to get the required Adobe Reader to view a PDF that they are linking to).

For case 1, some in-browser feature detect is really helpful. If the support can't be detected, it's likely that the UA will re-direct the resource request to another content handler on the system (if one is installed), and the experience will not be seamless in the browser as expected.

For case 2 (maybe the majority of situations), I don't know that there's a way the web platform can provide this information reliably. Since there is no "isContentHandlerInstalled" that might be able to query the OS's content handler info, it's just hit-or-miss whether clicking the link will allow the user to view the PDF. Of course, if the PDF support is advertised as a browser-supported feature, then the site at least has that guarantee on a particular UA (and doesn't need to put up the PDF reader notice).

Yes, there is an increase in fingerprinting risk, though the ubiquity of PDF reader support in the browser means this impact would be negligible, I think.
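
To make use case 1 concrete, here is a rough sketch of how a page might branch on a detection result. The function name `choosePdfPresentation` and the shape of the returned object are hypothetical. It deliberately keeps a plain link in both branches, so that a user whose UA hands the PDF to an external content handler can still reach the document.

```javascript
// Hypothetical helper: decide how to present a PDF given a detection result.
// `inlineViewingDetected` is whatever boolean the page's feature detect produced.
function choosePdfPresentation(pdfUrl, inlineViewingDetected) {
  return {
    // Embed contextually only when inline viewing was detected.
    mode: inlineViewingDetected ? "iframe" : "link",
    embedUrl: inlineViewingDetected ? pdfUrl : null,
    // The direct link is always offered, regardless of the detection result.
    linkUrl: pdfUrl,
  };
}
```

A page would then create an iframe pointing at `embedUrl` when `mode` is "iframe", and render an ordinary anchor to `linkUrl` either way.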

bzbarsky · 2018-02-12T17:03:57Z

> If the support can't be detected, it's likely that the UA will re-direct the resource request to another content handler on the system (if one is installed), and the experience will not be seamless in the browser as expected.

OK, but what action would the site take if support is not detected? For example, I have pdf.js turned off in Firefox explicitly because I do not want my PDFs rendering in iframes like this... ;)

> I don't know that there's a way the web platform can provide this information reliably

Note that Firefox used to provide that information in navigator.mimeTypes until we stopped.

> though the ubiquity of PDF reader support in the browser means this impact would be negligible

People can and do disable PDF support in their browsers. Witness all the threads asking how to do it in Chrome (not too simple, but they do have a UI for it).

Zirro · 2018-02-12T17:59:58Z

@travisleithead Thanks for providing some use cases. I can recognise the utility in the first case (though I prefer PDFs to stay in their own context), but for the second I think it'd be more appropriate for the browser or OS to suggest a way to open PDF files rather than the website.

I should be clear - my main concern with this API (besides fingerprinting) is that it might end up being used to hide PDF files from user agents which do not advertise built-in support.

Whether done intentionally or not, this would negatively impact those who require an external reader and might eventually force non-PDF supporting user agents to lie about their support in order to receive access to the document at all.

I feel that this risk outweighs the potential utility of an API, though I understand that this argument may not resonate with everyone. I'm satisfied now that I've voiced my concern.

annevk · 2021-01-09T08:22:00Z

It seems that all browsers support rendering PDFs in a nested browsing context so rather than detection I think we might have to say something about that. (When it works, whether MIME types are enforced, etc.)

domenic · 2021-07-22T22:19:35Z

Oh cool, we closed a 3.5 year old issue :) 6770de4

jrochkind · 2021-09-30T16:41:52Z

(for anyone finding this in google, and hoping to find a solution... at present there doesn't seem to be any browser support for navigator.pdfViewerSupported, and "application/pdf" in navigator.mimeTypes still seems to work in many browsers, but not Firefox. :( I don't believe there is any reasonable way to tell if Firefox is configured to display PDFs inline )
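
For readers looking for something to paste, a combined check might be sketched as below. Two assumptions are baked in: that the property added to the HTML Standard is exposed as `navigator.pdfViewerEnabled` (that is the name I believe the spec uses, rather than `pdfViewerSupported`), and that a missing `application/pdf` entry should be treated as "unknown" rather than "no", since a browser can render PDFs inline without listing the type. The helper name `detectInlinePdfSupport` is hypothetical.

```javascript
// Hypothetical helper: returns "yes", "no", or "unknown".
function detectInlinePdfSupport(nav) {
  if (!nav) {
    return "unknown";
  }
  // Assumed spec property; only trust it when it is actually a boolean.
  if (typeof nav.pdfViewerEnabled === "boolean") {
    return nav.pdfViewerEnabled ? "yes" : "no";
  }
  // Fallback: the mimeTypes check discussed earlier in the thread.
  const mimeTypes = nav.mimeTypes;
  if (mimeTypes !== null && typeof mimeTypes === "object" && "application/pdf" in mimeTypes) {
    return "yes";
  }
  // No entry is not proof of absence, so do not report "no" here.
  return "unknown";
}
```

In a page this would be called as `detectInlinePdfSupport(navigator)`. Treating "unknown" the same as "yes" (try to display, keep a plain link as well) avoids hiding PDFs from browsers that simply do not advertise support.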


domenic mentioned this issue Aug 23, 2018: Interop: pdf might or might not render in a sandboxed iframe (depending on a browser) #3958 (Closed)

zcorpan added the "addition/proposal" and "needs concrete proposal" labels Sep 1, 2018

domenic mentioned this issue Oct 1, 2020: Removing plugins? #6003 (Open)

davidp3 mentioned this issue Jan 11, 2021: PDF loading examples web-platform-tests/wpt#27129 (Open)

domenic closed this as completed Jul 22, 2021
