Provide support for reading warc2zim / Zimit archives #1009

Closed
Jaifroid opened this issue May 22, 2023 · 2 comments · Fixed by #1173

@Jaifroid
Member

After discussion at the Kiwix Hackathon, it has been decided to implement support in a "standard" way for reading Zimit-based archives. The system is based on the wombat.js URL rewriter, which is injected into the document, and a Service Worker.

It is proposed that the reader (Kiwix JS) will provide both wombat.js and the Service Worker, so that we control the version being used. The reason for this is that we will have to splice together the Service Worker for reading Zimit archives with our own Service Worker, i.e., we cannot just use the version provided in each ZIM (it is not possible to install two Service Workers controlling the same domain). Since wombat.js and the Service Worker must talk to each other, it is probably better to provide known versions rather than rely on potentially different versions of wombat.js in the ZIM archive.

This issue supersedes #644.

@Jaifroid Jaifroid added dependencies Pull requests that update a dependency file backend labels May 22, 2023
@Jaifroid Jaifroid added this to the v4.0 milestone May 22, 2023
@Jaifroid Jaifroid self-assigned this May 22, 2023
@Jaifroid
Member Author

Note that Zimit-based archives cannot be loaded in a Chromium extension until we migrate to Manifest V3 (#755). The code for that is pretty much finished in #984, but we can't merge it just yet, because MV3 in Firefox does not yet support Service Workers as background scripts. In the meantime, we can develop Zimit support outside of the extension framework.

@Jaifroid
Member Author

Jaifroid commented Nov 27, 2023

We now have, in #1173, a fully loaded Replay iframe inside our iframe, with wombat.js processing client-side dynamic and other links, bootstrapped by the Replay Worker (imported into our Service Worker via importScripts() in a try-catch statement). This is NOT the Regex-based URL substitution that is currently working (mostly) in the Kiwix PWA.
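
As a rough sketch of the importScripts()-in-try-catch pattern mentioned above: the file name `replayWorker.js` and the helper `tryImportReplayWorker` are assumptions made for illustration, not the names used in #1173.

```javascript
// Hypothetical path: the real file name in the app may differ.
const REPLAY_WORKER_PATH = './replayWorker.js';

// Attempts to load a script with the supplied import function.
// Returns true on success and false if the import throws
// (for example, a syntax error in a browser that cannot parse the script).
function tryImportReplayWorker(importFn, path) {
  try {
    importFn(path);
    return true;
  } catch (err) {
    return false;
  }
}

// importScripts() only exists in worker scopes, so guard the call.
const replayAvailable = typeof importScripts === 'function'
  ? tryImportReplayWorker((p) => importScripts(p), REPLAY_WORKER_PATH)
  : false;
```

With a flag like `replayAvailable`, the main Service Worker can keep running even when the Replay Worker fails to load, and fall back to other handling.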

There is a fair amount of "glue" added to the app to allow our standard Service Worker to know when to hand off a request to the Replay Worker. To write the glue, and to ensure the Replay Worker diverts its Fetch requests to the main Service Worker (which in turn gets transformed URLs from the backend), I had to provide a slightly customized local copy of the Replay Service Worker (there are only four small interventions). I also have to intercept load.js to prevent it from trying to register the copy of sw.js provided in the ZIM, which would disrupt our Service Worker. Instead, I make it simply load window.mainUrl.
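
To illustrate the kind of hand-off decision described above, here is a minimal sketch. Everything in it is hypothetical: the function `chooseHandler`, the `ctx` flags (`isZimitArchive`, `replayAvailable`, `fromReplayWorker`) and the returned labels are invented for the example and do not reflect the actual glue code in #1173.

```javascript
// Decides which component should deal with a request.
// ctx is a hypothetical state object:
//   isZimitArchive   - the currently loaded ZIM is Zimit-based
//   replayAvailable  - the Replay Worker was imported successfully
//   fromReplayWorker - the request was issued by the Replay Worker itself
function chooseHandler(requestUrl, ctx) {
  const { pathname } = new URL(requestUrl);

  // Non-Zimit archives never involve the Replay Worker.
  if (!ctx.isZimitArchive) return 'main-service-worker';

  // load.js would try to register the sw.js shipped in the ZIM,
  // so it is intercepted and replaced by a load of the main URL.
  if (/\/load\.js$/.test(pathname)) return 'load-main-url';

  // Requests coming back from the Replay Worker go to the main
  // Service Worker, which obtains the content from the backend.
  if (ctx.fromReplayWorker) return 'main-service-worker';

  return ctx.replayAvailable ? 'replay-worker' : 'main-service-worker';
}
```

The `fromReplayWorker` check is there to avoid a loop in which the Replay Worker's own requests are handed straight back to it.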

I suppose this code will be good for a few months until Zimit 2.0 is ready and fully tested, and thereafter will provide backwards compatibility for older ZIMs. (We seem to make support for deprecated technology our distinctive feature...☹️).

Multimedia is almost all working. There is still some ironing out to do with the format of the fuzzy-matched POST requests for YouTube video coming from the Replay Worker, in particular an issue with the ridiculously long URLs these produce. This is currently blocking YouTube video from playing, though I can see, and will hopefully be able to debug, the dozens of enormous POST requests.

Finally, the Replay Worker only runs in Chrome >= 80 and Firefox >= 74 according to my testing on BrowserStack, even after passing it through a Babel transformation, though in a local copy I managed to get it working in Firefox 68+. If a user accesses a Zimit ZIM in an unsupported browser, they will be thrown into a view of the static content of requested pages (with a warning). This at least allows browsing of static ZIMs such as the Internet Encyclopedia of Philosophy for users with older browsers (IE11+).
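
The fallback described above could be sketched roughly as follows. The version thresholds are the ones reported from BrowserStack testing; the function name `chooseZimitMode`, the mode labels and the use of user-agent parsing are assumptions for illustration (the app may well rely on feature detection or on the import failing instead). Modern syntax is used for brevity, so this form would itself need transpiling for older browsers.

```javascript
// Minimum major versions in which the Replay Worker was observed to run.
const MIN_VERSIONS = { Chrome: 80, Firefox: 74 };

// Returns 'replay' if the user agent looks like a supported browser,
// otherwise 'static' (static content of requested pages, shown with a warning).
function chooseZimitMode(userAgent) {
  for (const [browser, minVersion] of Object.entries(MIN_VERSIONS)) {
    const match = new RegExp(browser + '/(\\d+)').exec(userAgent);
    if (match && parseInt(match[1], 10) >= minVersion) return 'replay';
  }
  return 'static';
}
```

For example, a user agent containing `Firefox/74.0` would select `'replay'`, while an IE11 user agent, which contains neither token, would select `'static'`.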
