Consider a standardized way to detect PDF in HTML? #3462

Closed
travisleithead opened this issue Feb 8, 2018 · 21 comments
Labels: addition/proposal (New features or enhancements); needs concrete proposal (Moving the issue forward requires someone to figure out a detailed plan)

Comments

@travisleithead
Member

travisleithead commented Feb 8, 2018

After running across site interoperability issues in Edge, we'd like to raise the possibility of a standard (or suggested) way of surfacing a UA's support for PDFs. Would love both opinions and proposals.

Today, among Chrome, Edge, and Firefox, Chrome and Edge both add an entry (or entries) into the navigator.plugins and navigator.mimeTypes collections such as:

  • {type: "application/pdf", suffixes: "pdf", description: ""}
  • {type: "application/x-google-chrome-pdf", suffixes: "pdf", description: "Portable Document Format"}
  • {type: "application/pdf", suffixes: "pdf", description: "Edge PDF Viewer"}

E.g., the site honda.it tries to detect a variety of potential strings here: var p=["Chrome PDF Viewer","WebKit built-in PDF","Adobe Acrobat"];

Sites are starting to take an interest in these. Perhaps we should come up with a vendor-neutral way of advertising support for PDF?
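
For illustration, a name-based check of the kind described above might look roughly like the sketch below. Only the list of names comes from the honda.it snippet; the function `hasKnownPdfPlugin`, its `nav` parameter, and the substring matching are assumptions made for this example, not the site's actual code.

```javascript
// Names taken from the honda.it snippet quoted above.
const KNOWN_PDF_PLUGIN_NAMES = ["Chrome PDF Viewer", "WebKit built-in PDF", "Adobe Acrobat"];

// Hypothetical helper: `nav` is any navigator-like object
// (pass `window.navigator` in a browser).
function hasKnownPdfPlugin(nav, names = KNOWN_PDF_PLUGIN_NAMES) {
  // navigator.plugins is array-like, so copy it into a real array first.
  const plugins = nav && nav.plugins ? Array.from(nav.plugins) : [];
  // Report true if any plugin name contains one of the known strings.
  return plugins.some((plugin) =>
    names.some((name) => String(plugin.name).includes(name))
  );
}
```

A check like this fails for any viewer whose name is not on the list (for example an entry named "Edge PDF Viewer"), which is the interoperability problem raised here.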

@domenic
Member

domenic commented Feb 8, 2018

This makes sense to me; although navigator.plugins/mimeTypes are kind of weird implementation-defined voodoo-land, they are web-observable, and to the extent people are depending on them we should probably try to lock them down. Especially if that can be useful for web authors.

My Safari Tech Preview install also has similar entries, including both application/pdf and text/pdf.

It seems like one way to detect in a cross-browser way today is

const supportsPDF = "application/pdf" in navigator.mimeTypes;

Do you think we should codify this in the standard somehow, beyond what is already there in the spec for navigator.plugins/navigator.mimeTypes? What's there seems pretty good to me, but perhaps you had something specific in mind?
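
A minimal sketch of that one-line check wrapped in a defensive helper, assuming a hypothetical function name `supportsPdfMimeType` and a `nav` parameter so that it can also be exercised outside a browser:

```javascript
// Hypothetical wrapper around the check above; `nav` is any
// navigator-like object (pass `window.navigator` in a browser).
function supportsPdfMimeType(nav) {
  // The `in` operator throws on non-objects, so guard first.
  if (!nav || typeof nav.mimeTypes !== "object" || nav.mimeTypes === null) {
    return false;
  }
  // MimeTypeArray exposes its entries as named properties keyed by type.
  return "application/pdf" in nav.mimeTypes;
}
```

A `false` result only means that no `application/pdf` entry is advertised; as later comments in this thread point out, a browser can render PDFs without listing the type.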

@travisleithead
Member Author

That's a nice feature detect, and probably good enough for a casual check. What I don't know is whether there is some deeper behavior that sites are somehow trying to take advantage of with their more complex checks, e.g., looking for specific name or description values.

If there really isn't any web-facing feature differentiation among the various PDF implementations, then perhaps suggesting a "typical" generic PDF entry in the plugins collection would help encourage use of the simple feature detect you note above.

@domenic
Member

domenic commented Feb 8, 2018

Hmm, I see. My guess would be that sites weren't aware you could write such simple feature detection code, but I'm not sure.

Do you think we should address this at the authoring guidance level, e.g. give advice on how to feature detect PDF? Or at the browser level, by e.g. giving advice that all browsers include one of the three strings you listed in your example in the OP for the description of their application/pdf entry?

@travisleithead
Member Author

Both? Authoring and browser level? I suppose authoring guidance is a no-brainer. I'm wondering how everyone else would feel about more prescriptive browser-level changes to align on one way of presenting PDF support via the navigator's plugin collection. @bzbarsky @annevk @RByers @rniwa

@bzbarsky
Contributor

bzbarsky commented Feb 9, 2018

I think it would be fine to include "application/pdf" if it's supported via the internal viewer. Exactly where and how it's included would need to be specced, of course. The inconsistency with other internally-supported types (HTML, images, JSON, various XML types, etc) is a bit annoying, of course. Is PDF the only type we need to worry about here?

Firefox used to report all types supported via internal viewer or plug-in or helper app in navigator.mimeTypes until we changed to our current behavior in https://bugzilla.mozilla.org/show_bug.cgi?id=1144204, so it's possible the more-complicated detects are trying to work around that....

@domenic
Member

domenic commented Feb 9, 2018

I'd be interested in a concrete proposal for something browser-level. E.g. we could specify just application/pdf, but everyone already does that, and that wouldn't suffice for the honda.it check you note in the OP. Is making that check pass (and thus requiring a string like "Chrome" or "WebKit" or "Adobe" in the spec) an explicit goal? If not, what would you be looking for on top of what's already specified?

@bzbarsky
Contributor

bzbarsky commented Feb 9, 2018

> but everyone already does that

Firefox doesn't, fwiw.

@domenic
Member

domenic commented Feb 9, 2018

... I could swear I tested .... at least on my work computer ... but on my home computer both mimeTypes and plugins are empty, indeed. I'm prepared to say I was misremembering my work computer's results.

That does make things more interesting.

@travisleithead
Member Author

travisleithead commented Feb 9, 2018

> If not, what would you be looking for on top of what's already specified?

I looked through here, but didn't see anything specific to PDF...

I'd love to clear out our plugins/mimeTypes collections like Firefox has done :-) Maybe we could add a simple feature-detect property in the spirit of the legacy navigator.javaEnabled() or navigator.cookieEnabled?

@domenic
Member

domenic commented Feb 9, 2018

Right, nothing specific to PDF, but unless a browser believes it doesn't support application/pdf, or specifically desires not to expose its support for it, the spec says you should include it. We could call it out as a specific case that's been found important for web compat, to be sure, and after realizing Firefox doesn't have anything there, that seems like an especially good idea.

With regard to further work, I'd again like to clarify our goals. I've seen three in this thread so far:

  1. Make existing PDF feature detection, like the honda.it example from the OP, work in all browsers.
  2. Make it easy to feature-detect PDF, without necessarily supporting every way people try to do so on the web today.
  3. Make it easy to feature-detect PDF, but also allow (and maybe eventually mandate?) that browsers have an empty plugins/mimeTypes collection.

These each have rather different solutions in my mind. I'll hide them behind a <details> in the hopes of getting folks to answer the what-problem-we're-trying-to-solve question first, unbiased by the solution :).

My solutions for 1, 2, and 3
  1. Add one or more of the Adobe/WebKit/Chrome strings to the spec, saying that the value of the description property for application/pdf should be one of these. (There's precedent for this kind of ickiness in various navigator properties.)
  2. Add advice to the spec that websites really do expect you to advertise support for application/pdf, so you should include that (looking at you Firefox). And add advice that authors should feature-detect PDF via "application/pdf" in navigator.mimeTypes
  3. Create a new PDF-specific property, like navigator.pdfEnabled, and try to move websites to that as much as possible via evangelism/etc.
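
To make solutions 2 and 3 concrete, here is a rough sketch of how an author-side check could combine them. It assumes the `navigator.pdfEnabled` name floated above, which is only a proposal in this thread and not a shipped API; the helper `detectPdfSupport` and its three-state return value are likewise invented for illustration.

```javascript
// Hypothetical helper: `nav` is any navigator-like object
// (pass `window.navigator` in a browser).
function detectPdfSupport(nav) {
  if (!nav) return "unknown";
  // Solution 3: prefer a dedicated boolean if the browser exposes one.
  // `pdfEnabled` is the name proposed in this thread, not a real property.
  if (typeof nav.pdfEnabled === "boolean") {
    return nav.pdfEnabled ? "supported" : "unsupported";
  }
  // Solution 2: fall back to the mimeTypes entry.
  const types = nav.mimeTypes;
  if (types && typeof types === "object" && "application/pdf" in types) {
    return "supported";
  }
  // A missing entry is not proof that PDFs cannot be displayed.
  return "unknown";
}
```

Returning "unknown" rather than "unsupported" in the fallback path is a deliberate choice in this sketch: an empty mimeTypes collection says nothing either way about a built-in viewer.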

@bzbarsky
Contributor

bzbarsky commented Feb 9, 2018

> the spec says you should include it

No, it doesn't. The spec says:

> The term plugin refers to a user-agent defined set of content handlers used by the user agent that can take part in the user agent's rendering of a Document object, but that neither act as child browsing contexts of the Document nor introduce any Node objects to the Document's DOM.

And it explicitly says that navigator.mimeTypes is a MimeTypeArray and that:

> A MimeTypeArray object represents the MIME types explicitly supported by plugins supported by the user agent, each of which is represented by a MimeType object.

The built-in PDF renderer in Firefox does in fact act as a child browsing context and does in fact introduce Node objects to the Document's DOM. So per a not-between-the-lines reading of the spec, it is not a "plugin" and must not be reflected in navigator.mimeTypes...

Now you could argue that the spec doesn't say what it means, but Firefox does implement exactly what the spec says (and in fact we changed our implementation to not expose application/pdf to comply with the spec here).

@domenic
Member

domenic commented Feb 9, 2018

Ah OK, I didn't realize that's how Firefox implemented it, although it makes sense now that I think about how pdf.js works. Interesting.

It seems that given this, the current spec mechanism doesn't serve well for feature detection of "can the browser view this MIME type". We could either introduce a new mechanism for that, or we could revise the spec to change the meaning of navigator.mimeTypes/plugins, on the grounds that the revised definition is more useful.

@Zirro
Contributor

Zirro commented Feb 10, 2018

Before exploring solutions involving a new API, I would like to gain a better understanding of the problem to be solved first.

  • Which use cases require a website to detect support for PDF in the browser?
  • How do websites behave differently if the browser does not advertise support for PDF?
  • How does it affect user agents where the PDF is expected to be downloaded and handled by an external reader?

Adding a formal API to detect PDF support would increase the user's fingerprint. Before adding it there ought to be a strong argument in favour of websites behaving differently depending on whether the browser includes PDF support, versus making it impossible to detect through mimeTypes/plugins.

@travisleithead
Member Author

Regarding use cases, I can think of two off the top of my head:

  1. The site wants to display a PDF document in an iframe (contextually) within its own pages.
  2. The site wants to ensure that, if a user clicks on a PDF link, the user has a reader installed to view it (lots of sites put up an ad for the Adobe Reader required to view a PDF they link to).

For case 1, some in-browser feature detect is really helpful. If the support can't be detected, it's likely that the UA will re-direct the resource request to another content handler on the system (if one is installed), and the experience will not be seamless in the browser as expected.

For case 2 (maybe the majority of situations), I don't know that there's a way the web platform can provide this information reliably. Since there is no "isContentHandlerInstalled" that might be able to query the OS's content handler info, it's just hit-or-miss whether clicking the link will allow the user to view the PDF. Of course, if the PDF support is advertised as a browser-supported feature, then the site at least has that guarantee on a particular UA (and doesn't need to put up the PDF reader notice).

Yes, there is an increase in fingerprinting risk, though the ubiquity of PDF reader support in the browser means this impact would be negligible, I think.
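
As a rough illustration of case 1, a site could use the detect only to decide whether to attempt an inline frame, while always keeping a plain link available. The helper `choosePdfPresentation`, its return shape, and the policy itself are assumptions made for this sketch; nothing in this thread prescribes how a site should react.

```javascript
// Hypothetical helper: `nav` is any navigator-like object
// (pass `window.navigator` in a browser); `pdfUrl` is the document URL.
function choosePdfPresentation(nav, pdfUrl) {
  const types = nav ? nav.mimeTypes : null;
  // Treat an advertised application/pdf entry as a hint that inline
  // rendering will probably work.
  const inlineLikely = Boolean(
    types && typeof types === "object" && "application/pdf" in types
  );
  return {
    mode: inlineLikely ? "iframe" : "link",
    url: pdfUrl,
    // Always offer a direct link, so users relying on an external
    // reader are never cut off from the document.
    showLink: true,
  };
}
```

The caller would then render an iframe plus a link, or only the link, depending on `mode`.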

@bzbarsky
Contributor

> If the support can't be detected, it's likely that the UA will re-direct the resource request to another content handler on the system (if one is installed), and the experience will not be seamless in the browser as expected.

OK, but what action would the site take if support is not detected? For example, I have pdf.js turned off in Firefox explicitly because I do not want my PDFs rendering in iframes like this... ;)

> I don't know that there's a way the web platform can provide this information reliably

Note that Firefox used to provide that information in navigator.mimeTypes until we stopped.

> though the ubiquity of PDF reader support in the browser means this impact would be negligible

People can and do disable PDF support in their browsers. Witness all the threads asking how to do it in Chrome (not too simple, but they do have a UI for it).

@Zirro
Contributor

Zirro commented Feb 12, 2018

@travisleithead Thanks for providing some use cases. I can recognise the utility in the first case (though I prefer PDFs to stay in their own context), but for the second I think it'd be more appropriate for the browser or OS to suggest a way to open PDF files rather than the website.

I should be clear - my main concern with this API (besides fingerprinting) is that it might end up being used to hide PDF files from user agents which do not advertise built-in support.

Whether done intentionally or not, this would negatively impact those who require an external reader and might eventually force non-PDF supporting user agents to lie about their support in order to receive access to the document at all.

I feel that this risk outweighs the potential utility of an API, though I understand that this argument may not resonate with everyone. I'm satisfied now that I've voiced my concern.

@zcorpan added the "addition/proposal" and "needs concrete proposal" labels on Sep 1, 2018

@annevk
Member

annevk commented Jan 9, 2021

It seems that all browsers support rendering PDFs in a nested browsing context so rather than detection I think we might have to say something about that. (When it works, whether MIME types are enforced, etc.)

@domenic
Member

domenic commented Jul 22, 2021

Oh cool, we closed a 3.5 year old issue :) 6770de4

@domenic closed this as completed on Jul 22, 2021

@jrochkind

jrochkind commented Sep 30, 2021

(For anyone finding this via Google and hoping for a solution: at present there doesn't seem to be any browser support for navigator.pdfViewerSupported, and "application/pdf" in navigator.mimeTypes still seems to work in many browsers, but not Firefox. :( I don't believe there is any reasonable way to tell whether Firefox is configured to display PDFs inline.)


8 participants