Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling of external app links in the viewer #1123

Merged
merged 2 commits into from
Sep 10, 2024

Conversation

veloman-yunkan
Copy link
Collaborator

Fixes #1138
Also closes #1090, but not as requested in the title & description of that issue because I couldn't make it work and chose the general approach proposed in an afterthought that was initially posted in #1138.

Links that should be handled/opened by external applications - such as email addresses (mailto:), phone numbers (tel:), etc - are opened by the viewer in a new tab/window, thus avoiding any issues with content security policy.

Copy link

codecov bot commented Sep 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 41.43%. Comparing base (d5a44b9) to head (2da9801).
Report is 3 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1123   +/-   ##
=======================================
  Coverage   41.43%   41.43%           
=======================================
  Files          59       59           
  Lines        4245     4245           
  Branches     2323     2323           
=======================================
  Hits         1759     1759           
  Misses        990      990           
  Partials     1496     1496           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@Jaifroid Jaifroid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in comment, I'd suggest having a whitelist rather than a blacklist, because the blacklist could be infinite, whereas there are only a couple of protocols in the whitelist.

One thing to be aware of is that I found several cases of PDFs served with http:// protocols instead of https:// protocols in Zimit archives, so I had to include a normalizing step in the code in Kiwix JS here https://github.com/kiwix/kiwix-js/blob/main/www/js/app.js#L2451. Now, I don't know if that is only relevant for Zimit1, or if Zimit2 might have this issue or not. I don't know if you still want to support Zimit1 ZIMs that are likely to be around for quite some time? @benoit74?

static/skin/viewer.js Outdated Show resolved Hide resolved
@benoit74
Copy link

benoit74 commented Sep 6, 2024

I have exactly the same remark / recommendation as @Jaifroid (who has been quicker than I to review this ^^)

In ZIMs, we can indeed have http:, https:, javascript:, about: (and maybe others, I don't really know tbh) that should be handled, all others should be just left for the browser to proceed. All these links should raised a warning from my PoV since the user is not navigating inside the ZIM but going somewhere else. When navigating inside the ZIM, only relative links are used normally.

Usage of http: and https: links are obviously way more common in Zimit than in other scrapers, but we can definitely have both in any (e.g. in ifixit we have many "external" links posted by users in the guides, and these could be http or https depending on what the target server supports or what the user decided to use ; and we do not try to guess if any http link could be replaced by an https one, this is too daunghting). Regarding links processed dynamically by wombat, when internal to the ZIM they use the same protocol as the currently displayed HTML page.

Regarding Zimit 1, currently openZIM does not have anymore Zimit 1 ZIMs published, they have all been either updated to Zimit 2 or archived. That been said, there is probably many in the wild, and we obviously have no control on when users will drop them for a more recent version.

@kelson42
Copy link
Collaborator

kelson42 commented Sep 6, 2024

FYI and AFAIK there is such a list of protocol schemes in zimwriterfs source code as well.

@Jaifroid
Copy link
Member

Jaifroid commented Sep 6, 2024

I think we'll need to test this thoroughly against any links transformed by Wombat. In @mgautierfr's original workaround, it was necessary to turn off Wombat processing temporarily when the user clicked a link in order to get the "real" URL as opposed to the Wombat-transformed URL. It's not clear to me whether that is relevant to this PR (sorry, I'm still travelling, so don't have access to a test environment right now).

@veloman-yunkan
Copy link
Collaborator Author

All these links should raised a warning from my PoV since the user is not navigating inside the ZIM but going somewhere else.

Should this be implemented in this PR?

@benoit74
Copy link

benoit74 commented Sep 6, 2024

Should this be implemented in this PR?

No clue ^^

@kelson42
Copy link
Collaborator

kelson42 commented Sep 8, 2024

All these links should raised a warning from my PoV since the user is not navigating inside the ZIM but going somewhere else.

Should this be implemented in this PR?

@veloman-yunkan -b --blockexternal should apply to all external URLs if activated, otherwise I see no reason why we should display a warning. This is at least not the behaviour which has been chosen so far. Does it apply?

Links that should be handled/opened by external applications - such as email
addresses (mailto:), phone numbers (tel:), etc - are opened by the
viewer in a new tab/window, thus avoiding any issues with content
security policy.
Zimit2 ZIMs employ Wombat for client-side rewriting of URLs. The latter
interferes with our approach of handling in the viewer CTRL/SHIFT-clicks
on links inside articles. This commit disables Wombat temporarily while
changing the href attribute of the clicked link.
@veloman-yunkan
Copy link
Collaborator Author

All these links should raised a warning from my PoV since the user is not navigating inside the ZIM but going somewhere else.

Should this be implemented in this PR?

@veloman-yunkan -b --blockexternal should apply to all external URLs if activated, otherwise I see no reason why we should display a warning. This is at least not the behaviour which has been chosen so far. Does it apply?

Done. Any links containing a non-whitelisted URL schema are now considered external with respect to link blocking and routed through the link-blocker page. Otherwise such links are opened in a new window/tab.

Also discovered a bug related to handling of ctrl (or shift) clicked links for Zimit2 ZIMs and fixed it in a separate commit.

Copy link
Collaborator

@kelson42 kelson42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@veloman-yunkan
Copy link
Collaborator Author

LGTM

I think that the additional bugfix commit was not included in your review.

@kelson42 kelson42 merged commit bec80e8 into main Sep 10, 2024
15 checks passed
@kelson42 kelson42 deleted the handling_of_external_app_links branch September 10, 2024 13:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants