-
-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handling of external app links in the viewer #1123
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1123 +/- ##
=======================================
Coverage 41.43% 41.43%
=======================================
Files 59 59
Lines 4245 4245
Branches 2323 2323
=======================================
Hits 1759 1759
Misses 990 990
Partials 1496 1496 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in comment, I'd suggest having a whitelist rather than a blacklist, because the blacklist could be infinite, whereas there are only a couple of protocols in the whitelist.
One thing to be aware of is that I found several cases of PDFs served with http:// protocols instead of https:// protocols in Zimit archives, so I had to include a normalizing step in the code in Kiwix JS here https://github.com/kiwix/kiwix-js/blob/main/www/js/app.js#L2451. Now, I don't know if that is only relevant for Zimit1, or if Zimit2 might have this issue or not. I don't know if you still want to support Zimit1 ZIMs that are likely to be around for quite some time? @benoit74?
I have exactly the same remark / recommendation as @Jaifroid (who has been quicker than I to review this ^^) In ZIMs, we can indeed have http:, https:, javascript:, about: (and maybe others, I don't really know tbh) that should be handled, all others should be just left for the browser to proceed. All these links should raised a warning from my PoV since the user is not navigating inside the ZIM but going somewhere else. When navigating inside the ZIM, only relative links are used normally. Usage of http: and https: links are obviously way more common in Zimit than in other scrapers, but we can definitely have both in any (e.g. in ifixit we have many "external" links posted by users in the guides, and these could be http or https depending on what the target server supports or what the user decided to use ; and we do not try to guess if any http link could be replaced by an https one, this is too daunghting). Regarding links processed dynamically by wombat, when internal to the ZIM they use the same protocol as the currently displayed HTML page. Regarding Zimit 1, currently openZIM does not have anymore Zimit 1 ZIMs published, they have all been either updated to Zimit 2 or archived. That been said, there is probably many in the wild, and we obviously have no control on when users will drop them for a more recent version. |
FYI and AFAIK there is such a list of protocol schemes in zimwriterfs source code as well. |
I think we'll need to test this thoroughly against any links transformed by Wombat. In @mgautierfr's original workaround, it was necessary to turn off Wombat processing temporarily when the user clicked a link in order to get the "real" URL as opposed to the Wombat-transformed URL. It's not clear to me whether that is relevant to this PR (sorry, I'm still travelling, so don't have access to a test environment right now). |
Should this be implemented in this PR? |
No clue ^^ |
@veloman-yunkan |
Links that should be handled/opened by external applications - such as email addresses (mailto:), phone numbers (tel:), etc - are opened by the viewer in a new tab/window, thus avoiding any issues with content security policy.
dc7f17b
to
16ebc66
Compare
Zimit2 ZIMs employ Wombat for client-side rewriting of URLs. The latter interferes with our approach of handling in the viewer CTRL/SHIFT-clicks on links inside articles. This commit disables Wombat temporarily while changing the href attribute of the clicked link.
Done. Any links containing a non-whitelisted URL schema are now considered external with respect to link blocking and routed through the link-blocker page. Otherwise such links are opened in a new window/tab. Also discovered a bug related to handling of ctrl (or shift) clicked links for Zimit2 ZIMs and fixed it in a separate commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I think that the additional bugfix commit was not included in your review. |
Fixes #1138
Also closes #1090, but not as requested in the title & description of that issue because I couldn't make it work and chose the general approach proposed in an afterthought that was initially posted in #1138.
Links that should be handled/opened by external applications - such as email addresses (
mailto:
), phone numbers (tel:
), etc - are opened by the viewer in a new tab/window, thus avoiding any issues with content security policy.