PDF content blocked by Chrome #916
Comments
Might be related to #604 |
I have no problem with firefox, chrome and chromium. |
@Popolechien reported it; I can reproduce it on the latest Chrome |
Sorry, the link is incorrect; you have to use the viewer. I need to open a ticket about that as well |
Indeed. Removing the sandbox attribute solves the problem. |
The code I give in that issue works around this specific problem with the pdf viewer in Chrome/Edge. It is not recognised by the browser as same origin, hence PDFs must be opened in a new window or tab. kiwix/kiwix-tools#604 (comment) |
That's quite a consequence in terms of UI. If it's not possible to render it in the same iframe, maybe we should use an in-ZIM pdf.js, but that's not as convenient |
Personally I think this is a Chromium bug, because the in-browser PDF viewer should clearly be same-origin when loading a PDF from the same origin. Or maybe it's a feature, because PDFs can have active content that can contact external servers (?). I didn't find any other workaround in Kiwix JS than to make a click on a PDF open a new window/tab. Having said that, the new window uses the same built-in PDF viewer, so it's not too big a deal. EPUBs have to be downloaded anyway because no browser provides a custom viewer. It's a balance between the security of the iframe (in terms of not leaking info out: a particular concern with Zimit archives) and (in)convenience... Otherwise, there's really nothing to stop a script in the iframe navigating elsewhere. NB sandboxing the iframe doesn't stop a determined malicious attacker, but it stops accidental redirects, accidental (or well-intentioned) attempts by scripts to break out of iframes, and accidental contacting of external sites for font files, images and (potentially) scripts. |
See whatwg/html#3958, which gives some information. The PDF viewer is implemented as a plugin in Chrome, and it is deactivated in sandboxed iframes. |
Please also add The sandbox attribute needs to stay to stop the Wiktionary zim from breaking. What's wrong with pdf.js? It would only make the experience more consistent across platforms. |
This is what we went for in Kiwix JS (for the sandbox attribute of iframe, or can be served as part of CSP response header):
|
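As a rough illustration of how one token list can feed both the iframe attribute and the CSP response header, here is a minimal sketch. The tokens are standard HTML sandbox keywords, but this particular selection and the helper names `iframeSandboxAttribute` and `cspSandboxHeader` are assumptions made for the example, not the values used in Kiwix JS or Kiwix Serve.

```typescript
// Assumed selection of standard sandbox tokens (illustrative only).
// "allow-top-navigation" is deliberately left out, so that scripts inside
// the frame cannot navigate the top-level page and destroy the iframe.
const SANDBOX_TOKENS: readonly string[] = [
  "allow-same-origin",
  "allow-scripts",
  "allow-forms",
  "allow-popups",
  "allow-downloads",
];

// Value for the iframe attribute: <iframe sandbox="...">
function iframeSandboxAttribute(tokens: readonly string[]): string {
  return tokens.join(" ");
}

// Value for the response header: Content-Security-Policy: sandbox ...
function cspSandboxHeader(tokens: readonly string[]): string {
  return tokens.length > 0 ? "sandbox " + tokens.join(" ") : "sandbox";
}
```

Keeping the list in one place means the attribute and the header cannot drift apart if both mechanisms end up being used.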
@veloman-yunkan Is this sandbox attribute actually in force at library.kiwix.org yet? Because if you go to https://library.kiwix.org/viewer#wiktionary_en_all_nopic_2023-02 , you clearly see that a top-level navigation occurs and the iframe is destroyed..., and if I inspect the iframe just before the offending script runs that breaks out of it, I see what's in the screenshot below. Now, if the sandbox isn't implemented at library.kiwix.org, it means that the issues with the PDF viewer are not directly down to #906, unless I'm missing something (or my browser cache is VERY persistent, despite clearing it)... (Though I do think #906 will block PDFs in Chrome, because I already had to patch that in Kiwix JS.) Confused 😖... |
And, btw, that |
@Jaifroid https://library.kiwix.org runs the latest release of |
@veloman-yunkan Thank you for clearing that up! |
@rgaudin Another option is to embed pdf.js in |
Personally, I'd say there's no real inconvenience in the simplest solution of opening the PDF in a new tab or window. In many ways, it's better UX, because the user can keep that tab to read later and carry on browsing in the iframe. It also parallels the experience with EPUBs, which download separately rather than opening in the iframe. But I get that people have different opinions about such things! |
Bundling pdf.js inside a ZIM means displaying PDFs ourselves, via pdf.js. It means creating a host webpage which won't feel exactly like the in-browser pdf.js. Also, for those with another PDF reader configured, it means rendering in pdf.js then potentially downloading (pdf.js allows this) to trigger the default PDF reader. As for including pdf.js in kiwix-serve, I don't see how that would help… Do you want to add a kiwix-serve specific API to use it? Do you want to intercept Then you have the SW discussion all over again (reader-ZIM dependency that's not part of the spec) |
Exactly. We can look at this from both angles though:
The trick is that kiwix-serve is both a first-class reader and a Web party so boundaries are blurred. I don't care much about the outcome of the tab thing but I am worried that we may be starting to drop support for regular web features. @kelson42 @Popolechien what do you think? |
For anyone coming to this longish thread at this point, the executive summary is as follows:
|
No. The vulnerability is that we display a "falsely offline" version of a website which can still phone home and leak user data to remote servers. This is how the web works and, short of blocking all connections to servers, it can always happen. Before the iframe, kiwix-serve was simply displaying the content in the top frame. "Falsely offline" websites were obviously able to phone home. And as we were inserting the top bar into the content of the page, it was even simpler for the website to break out of the app's controls. There is no more user data leak than before. Blocking these requests was never a goal of kiwix-serve. That may change, but if it became a goal, the solution is probably more in a service worker (to control all connections going out of the displayed website) than in the iframe.
There is a balance between adding a new security level and not interfering with the website content (in the Chrome browser). Blocking any link with However, if the |
@mgautierfr in kiwix/kiwix-tools#604 (comment) I suggested that instead of using the sandbox attribute on the iframe, Kiwix Serve can serve all content with a CSP sandbox response header. I agree with you that this would be conceptually more elegant. However, we would still have this issue with PDF content not being rendered in Chromium browsers in the iframe, and also the issue with external links being blocked. Like in Kiwix Android, these will still have to be intercepted and opened in an external window. Adding the sandbox attribute in the response header or adding it in the iframe are exactly equivalent in terms of functionality (I've tested this in the PWA). EDIT: It might be possible to serve only HTML with a CSP sandbox response header, but not serve PDFs with this header. It would need testing as to whether this would allow them to render in the iframe. |
There are multiple versions of the bug. There is more information on Slack.
Shadow DOM was designed for this problem.
And between fixing the website content (in Wiktionary)
Have we seen how archive.org does it? archive.org edits the HTML of all webpages to insert its header with the controls. It rewrites links, and this is to ensure multimedia paths work and external links are intercepted. With that strategy, we can either use the CSP HTTP header like you suggested, or we can inject a |
@danielzgtg I used to use only a
And without Rewriting links isn't necessary if you intercept the user's click on the iframe and inspect the target. |
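To make the click-interception idea concrete, here is a minimal sketch of the decision a click handler could take once it has the link target. The function name `classifyLink`, the `LinkAction` type and the example origin are hypothetical, and the rule (external links and PDFs go to a new tab, everything else stays in the iframe) is only one reading of the behaviour described in this thread.

```typescript
type LinkAction = "open-in-iframe" | "open-in-new-tab";

// Hypothetical helper: decide how to open a clicked link.
// `appOrigin` is the origin serving the viewer, e.g. "https://library.example".
function classifyLink(href: string, appOrigin: string): LinkAction {
  // Absolute ("https://...") or protocol-relative ("//host/...") URL?
  const isAbsolute = /^([a-z][a-z0-9+.-]*:)?\/\//i.test(href);
  const isSameOrigin = href === appOrigin || href.indexOf(appOrigin + "/") === 0;
  const isExternal = isAbsolute && !isSameOrigin;

  // Ignore any query string or fragment when looking at the extension.
  const path = href.replace(/[?#].*$/, "");
  const isPdf = /\.pdf$/i.test(path);

  return isExternal || isPdf ? "open-in-new-tab" : "open-in-iframe";
}

// In a browser, a click listener on the iframe's document could call this
// with the href of the closest <a> ancestor of the event target, then call
// event.preventDefault() and window.open(href, "_blank") for new-tab results.
```

Detecting PDFs by extension is a simplification; content served without a `.pdf` suffix would need a check on the response's MIME type instead.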
Thanks for investigating this in detail, but it is quite possible that a bug in Chrome was fixed between versions 90 and 110. |
FYI Chrome 91 was introduced in May 2021 |
Yes, it's pretty recent, but it's actually quite hard to stay on Chrome 90, because Google updates it almost as soon as you install it. Of course that wouldn't be the case with Chromium on Linux. I haven't spent a long time trying to find out if there's a workaround for Chrome 90, but I did try specifying the site (https://kiwix.github.io) as an allowed frame-src and even tried deleting the meta http-equiv CSP, but the bug persists. The error in DevTools just confirms the block. Bottom line, unless someone has a specific patch for Chrome <=90's behaviour, it seems to me we have these choices:
In Kiwix JS we decided to adopt 3 (actually version 2 + 3), as it is the most universal solution, even if it doesn't mimic precisely what the original web site developers may have intended. Users are forced to open EPUBs in a separate app, so opening PDFs in a separate tab is not so much of a sacrifice IMHO. |
Can we just move forward with the fix, even if Chrome <= 90 will continue to fail because of the bug? These users will still have the freedom to read the PDF content via a third-party app... We might even trigger this behaviour via HTTP headers based on the user-agent, I guess. |
Only applicable to Desktop users with Online access
We can only trigger a download via Yet I also believe we should go forward with this. There's no good solution anyway and we have no data regarding the User Agents of our users (especially offline ones). The only data we can extrapolate from is known deployments which clearly are Android Chrome <90… but on Android, Chrome doesn't have a built-in PDF reader so it's less of a concern. |
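For reference, the user-agent idea could look something like the sketch below. It assumes that a download is forced with a `Content-Disposition: attachment` header and that the affected browsers are Chromium-based with a major version of 90 or lower (the threshold discussed above). The helper names `chromeMajorVersion` and `pdfResponseHeaders` are hypothetical, and user-agent sniffing is brittle, so this is only an illustration.

```typescript
// Extract the Chromium major version from a User-Agent string, if present.
function chromeMajorVersion(userAgent: string): number | null {
  const match = /(?:Chrome|Chromium)\/(\d+)\./.exec(userAgent);
  return match ? parseInt(match[1], 10) : null;
}

// Hypothetical helper: headers to send with a PDF response.
function pdfResponseHeaders(userAgent: string): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/pdf" };
  const major = chromeMajorVersion(userAgent);
  // Assumption: Chrome <= 90 cannot render the PDF in the iframe,
  // so ask the browser to download the file instead.
  if (major !== null && major <= 90) {
    headers["Content-Disposition"] = "attachment";
  }
  return headers;
}
```

Browsers that do not report a Chromium version (Firefox, for example) get the plain PDF response and render it as usual.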
UPDATE: The problem with Chrome <= 90 should now be resolved in test version 3.7.5 at https://kiwix.github.io/kiwix-js/ (wait for it to notify that an update is available). To test loading PDFs in the iframe, be sure to turn off the option "Open external links and PDFs in new tabs" (under Display Settings). Explanation: I realized that the Service Worker was setting the sandbox response headers for PDF files as well as HTML files, so I added the code below exempting PDFs. Something similar could be done in the back-end code of Kiwix Serve. The outcome, in Chrome 90 running in Win11 on Browser Stack, is in the screenshots at bottom.
To be clear, this exemption does open a security hole for PDFs, which would no longer be sandboxed, but I think we have to live with that unless we are prepared to force opening PDFs in a new tab or window, or make it an option as in this test version of Kiwix JS. Kiwix Serve doesn't have anywhere obvious to put such options, and I suspect it would complicate the UI code too much to have to deal with such things (options would need to be stored in cookies, read and restored on launch, etc.). |
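A minimal sketch of that kind of exemption, written as a plain function rather than actual Service Worker code: the sandbox policy is attached to every response except PDFs. The policy string and the name `withSandboxHeader` are assumptions for illustration, not the code shipped in Kiwix JS 3.7.5.

```typescript
// Assumed policy value, for illustration only.
const SANDBOX_POLICY =
  "sandbox allow-scripts allow-same-origin allow-popups allow-downloads";

// Return a copy of `headers` with the CSP sandbox policy added,
// unless the response is a PDF (which is left unsandboxed so that
// Chromium's built-in viewer can render it inside the iframe).
function withSandboxHeader(
  mimeType: string,
  headers: Record<string, string>
): Record<string, string> {
  const result: Record<string, string> = { ...headers };
  // Drop parameters such as "; charset=utf-8" before comparing.
  const essence = mimeType.split(";")[0].trim().toLowerCase();
  if (essence !== "application/pdf") {
    result["Content-Security-Policy"] = SANDBOX_POLICY;
  }
  return result;
}
```

The same check could sit in a back end wherever response headers are assembled, since it only depends on the MIME type.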
@Jaifroid That was already the case in my local modification to |
My changes (without the temporary hack of hardcoding |
@veloman-yunkan Localhost is always http but it is considered secure (for testing purposes) by browsers, though I guess we can't rule out a bug here in different browsers. I see you removed the sandbox on the iframe, so it is really puzzling we're getting different results for different implementations. Are all caches fully cleared when you're testing? Might just be worth checking in dev tools that there is no sandbox attribute still set on the iframe... Otherwise, I think we'd need a test implementation to debug what's going on. |
Just checked in dev tools that there is no
That's nice but we need |
I don't think we're requiring secure origins as part of the sandbox, but we'd need to check the documentation to see if that is enforced or not. Or else test it empirically. However, I can definitively say that v3.7.5 of Kiwix JS works exactly the same whether served from localhost or served from https://kiwix.github.io, so I don't think this has anything to do with the reason we are seeing differences between Kiwix JS and Kiwix Serve despite both trying to serve the same headers. One unrelated issue with serving ZIM content over LAN using http is that Zimit archives, which depend on a Service Worker (requiring a secure origin) won't work. Zimit archives do work over http://localhost, however, as this is considered secure. EDIT: I can't see anything in https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/sandbox that presumes a secure origin. |
Thank you for this alternative fix. I confirmed that my Wiktionary use case still works on 3.7.5.
Only applicable if we don't accept my suggestion to just bundle the Chrome/Firefox installers for Windows/macOS/Linux/Android
See https://developer.mozilla.org/en-US/docs/Web/Security/Secure_Contexts It is possible to add a custom domain to |
It is hardly realistic to expect users to download, install and use a browser, at least in all the deployment scenarios I've seen. Not to mention the potential system version conflicts that might exist. I doubt the latest Chrome can be installed on Android 5.1. Should it be done, it should be external to kiwix-serve anyway. We do offer some apps to download. It would be instructive to test with a browser though. |
It just came to my attention that we have a somewhat similar issue with EPUB in Firefox. First, verify the kiwix-serve 3.4.0 behavior by going to library. Click the HTML button of any book: it works. Go back and click an EPUB link: it should either open in your in-browser EPUB extension if you have one, or trigger a download. Now, using kiwix-serve nightly, go to dev-library. Reproduce those steps and you'll notice that nothing happens when you click on the EPUB link. This is even worse UX than the PDF behavior because there is no feedback and the user would only assume that the ZIM is broken. You can right-click and open in a new tab, and it would work as expected. As you'd imagine, the console mentions the culprit:
Same behavior with Chrome:
|
The code I suggested explicitly includes "allow-downloads", but it's currently only in a PR #924, and not yet in the dev version. It would be useful if @veloman-yunkan could merge that code so we can thoroughly test and debug it in the dev server for issues such as this. |
I'm re-opening because it's not fixed. @veloman-yunkan, the main difference I can see, at first sight, between the implementation at https://kiwix.github.io/kiwix-js/ and the latest Kiwix Serve at https://dev.library.kiwix.org/ is that in the latter case the sandbox is applied to the top-level document, i.e. to the |
@rgaudin Assuming the code in https://dev.library.kiwix.org, after the merging of #924, is the same code you have been testing, then as shown in the screenshot above, PDFs do not load in the iframe in Chromium browsers, whereas in the test case at https://kiwix.github.io/kiwix-js/, they do. @veloman-yunkan had also said that it was not working. I believe the reason may be because the HTML document that contains the iframe is sandboxed in the dev version of Kiwix Serve, whereas it is not sandboxed in the Kiwix JS test case. The sandbox response header should only be served for content from the ZIM that is injected into the iframe (with PDFs excluded). |
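Spelled out as a rule, the distinction could look like the sketch below: the viewer page that hosts the iframe is never sandboxed, ZIM content is, and PDFs from the ZIM are exempt. The `/content/` prefix, the `ServedResponse` shape and the name `needsSandbox` are assumptions for the example and are not taken from the Kiwix Serve code.

```typescript
interface ServedResponse {
  path: string;     // request path, e.g. "/viewer" or "/content/<zim>/<entry>"
  mimeType: string; // MIME type of the response body
}

// Hypothetical rule: should this response carry the CSP sandbox header?
function needsSandbox(res: ServedResponse, contentPrefix: string): boolean {
  // Only entries coming out of a ZIM are candidates; the top-level
  // viewer document and its own assets are left alone.
  const isZimContent = res.path.indexOf(contentPrefix) === 0;
  const isPdf = res.mimeType.toLowerCase().indexOf("application/pdf") === 0;
  return isZimContent && !isPdf;
}
```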
@Jaifroid What are the reproduction steps? They are unclear to me as well. |
Indeed, we tested both issues with Firefox while the first one is Chrome-only. |
When opening a PDF in a ZIM in the kiwix-serve viewer, on Chrome, there is an unrecoverable error message. This looks like a regression.
Go to https://dev.library.kiwix.org/viewer#lilote_fr_fo_2023-03/xay-va-%C3%A0-la-p%C3%AAche then click the Lire l'histoire button.