Add offline support #137
Conversation
😻 This is a huge step forward, and the reload times are fantastic with e.g. pyodide. Works as advertised when turning off wifi! We'd probably want to enable even more goodies, e.g. have a pyodide kernel shared between multiple tabs. But indeed, the various deployment gotchas are very real, so it needs to be easy to turn off at build time for someone who knows they will be deploying someplace not so fun. Also, I think before landing this, we'd also want to land #118 to get deduplicated, cache-busting assets, so at least first-party stuff doesn't have surprising stale experiences on the docs site, especially since we're tracking upstream alphas now. As we want this to work on every page, and likely know whether we are in a service worker, perhaps we move all this to the … |
Yep, this all makes sense. Let's wait a bit until we merge this. I'll try to keep this up to date. I think it would be nice to also add a Manifest so that users can install this as a web-app. Maybe the service worker should only be active when this app is served as a web-app. This would be another method to circumvent some of the issues. Btw, the documentation generated a working example of this: https://jupyterlite--137.org.readthedocs.build/en/137/_static/lab/index.html |
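Something like this sketch could gate the service worker on web-app mode (illustrative only, not code from this PR; the service-worker.js path is a placeholder):

```ts
// Sketch: only register the service worker when the site is running as an
// installed web app (launched via a manifest with "display": "standalone").
const isStandalone = window.matchMedia('(display-mode: standalone)').matches;

if (isStandalone && 'serviceWorker' in navigator) {
  // 'service-worker.js' is a placeholder path, not this PR's actual file.
  navigator.serviceWorker.register('service-worker.js');
}
```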
Oh yeah, tried it out immediately, did the ceremonial network-turn-off and everything! This really makes slightly-larger-than-trivial compute reasonable for a documentation site. |
So with #173, we've got some more structure in place for configuring how a site builds, etc. I left a placeholder for the service worker stuff, but wasn't sure how to proceed. It would be interesting to get … |
Another angle: the webpack docs point out workbox which seems to make some of the stuff a little more manageable over time. Not sure how this would play with our desire to be able to tweak things after the webpack build, but might still be interesting. |
I'd like to take over this PR and rebase it, if that's fine with everyone. I've been looking at service workers for the past few working days, and at how to make use of their advantages from the Python kernel. I'm not only interested in the caching logic they bring (for offline support etc.); I'm also interested in the Python kernel being able to synchronously access data from the virtual file system (local storage). Since the Python kernel runs in a web worker, it can make blocking HTTP requests that are intercepted by the service worker, and the service worker can answer with whatever data it finds. We could monkey-patch the open global function in Python … |
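To illustrate the blocking-request trick (a sketch, not code from this PR): synchronous XMLHttpRequest is still permitted inside web workers, so the kernel's worker can block until the service worker answers. The /api/drive endpoint below is a hypothetical name:

```ts
// Sketch: runs inside the kernel's web worker. Synchronous XHR is still
// permitted in workers, so the kernel can block until the service worker
// responds. The '/api/drive' endpoint is hypothetical.
function readFileSync(path: string): string {
  const xhr = new XMLHttpRequest();
  xhr.open('POST', '/api/drive', false); // third argument false => synchronous
  xhr.setRequestHeader('Content-Type', 'application/json');
  xhr.send(JSON.stringify({ method: 'readFile', path }));
  if (xhr.status !== 200) {
    throw new Error(`readFile failed for ${path}: ${xhr.status}`);
  }
  return xhr.responseText;
}
```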
@martinRenou I was just thinking about this. I think the right approach for flexibility would be to register a very simple service worker which just allows the registration of URL handlers. So one could make a cache handler extension, a sleep extension, an import extension. It would need some way to identify which main browser thread owns a given web worker client; I'm not sure if that can be done through fetch request detection in the service worker, or if it needs a wrapper for the web worker. |
Oh hang on, the client ID, at least in Chrome, appears to be for a whole session, i.e. the main page and its workers. That's easier than I thought. |
I just checked, and it is trivial to associate the client IDs of web workers with the client ID of the main window.
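Something like this sketch, relying on the Chrome behavior above (illustrative, not code from this PR):

```ts
// Sketch (service worker): in Chrome, fetches coming from a page's
// dedicated workers report the page's client ID, so the owning window
// can be looked up directly from any intercepted request.
declare const self: ServiceWorkerGlobalScope;

self.addEventListener('fetch', (event: FetchEvent) => {
  event.waitUntil(
    self.clients.get(event.clientId).then((client) => {
      // Same ID whether the request came from the page or its workers.
      console.log(event.request.url, 'belongs to client', client?.id);
    })
  );
});
```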
So a basic URL handler API for pyodide or similar stuff, where you essentially want synchronous calls to async JavaScript things (files, sleep, etc.), could look something like the sketch below.
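A possible shape for that API, as a sketch only; every name here (registerURLHandler, the /sw-api/ prefix) is illustrative and not something defined by this PR:

```ts
// Sketch: a barebones service worker that only delegates special URLs to
// registered handlers. registerURLHandler and '/sw-api/' are illustrative.
declare const self: ServiceWorkerGlobalScope;

type URLHandler = (request: Request) => Promise<Response>;
const handlers = new Map<string, URLHandler>();

function registerURLHandler(prefix: string, handler: URLHandler): void {
  handlers.set(prefix, handler);
}

self.addEventListener('fetch', (event: FetchEvent) => {
  const { pathname } = new URL(event.request.url);
  for (const [prefix, handler] of handlers) {
    if (pathname.startsWith(prefix)) {
      event.respondWith(handler(event.request));
      return;
    }
  }
  // Anything unrecognized falls through to the network as usual.
});

// Example: emulate a synchronous sleep for a blocked web worker by simply
// delaying the response to its blocking request.
registerURLHandler('/sw-api/sleep', async (request) => {
  const { ms } = await request.json();
  await new Promise((resolve) => setTimeout(resolve, ms));
  return new Response('ok');
});
```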
I think caching should maybe be handled in the main service worker for performance reasons. But the rest of the service worker should stay absolutely barebones: it just takes requests and turns them into Jupyter messages if they are a special PUT request. |
Thanks for your comments @joemarshall! My comments about monkey-patching the open global function in Python are invalidated now, so please discard them. I'm exploring implementing a custom FileSystem (in the emscripten sense) that we mount for Python to use, and that exposes the files of the current JupyterLab drive. The work is done in #655 (not all my code is pushed yet). The problem is that emscripten file systems must be synchronous (there is currently some work in emscripten to make those APIs async, but it's not finished/released yet; see the discussion in the PR mentioned above). So I think using a service worker the way you're describing above is needed, in order to turn async file/directory fetches into synchronous tasks:

> 1. Serviceworker gets a POST request to a special URL with some content (as JSON or something).
> 2. It makes this into a promise and postMessages it to the correct main window (and returns a new unresolved promise to the fetch request).
> 3. The main app converts that to a Jupyter message and sends it wherever it needs to go.
> 4. The handler extension makes a response and sends that back to the main app, which posts it back to the serviceworker, which then resolves the promise made in step 2.
> 5. Tada, the XMLHttpRequest in the web worker finishes, we're all good.

This makes perfect sense. I was reading this morning about BroadcastChannel (https://developer.mozilla.org/en-US/docs/Web/API/BroadcastChannel), which I think will be perfect for that:

* it's bidirectional
* you can have multiple channels (one for input, one for sleep, one for the file system, etc.) |
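As a sketch of what steps 1, 2, and 4 above could look like inside the service worker (all names, including the /sw-api/ prefix and the pending map, are illustrative, not code from this PR):

```ts
// Sketch (service worker): steps 1-2 and 4 above. A blocking POST from a
// web worker is parked as an unresolved promise, forwarded to the owning
// window, and resolved when the main app posts the result back.
declare const self: ServiceWorkerGlobalScope;

let nextId = 0;
const pending = new Map<number, (response: Response) => void>();

self.addEventListener('fetch', (event: FetchEvent) => {
  const url = new URL(event.request.url);
  if (!url.pathname.startsWith('/sw-api/')) return; // normal traffic

  const id = nextId++;
  event.respondWith(
    (async () => {
      const client = await self.clients.get(event.clientId);
      const body = await event.request.text();
      // Step 2: forward to the main window, return an unresolved promise.
      client?.postMessage({ id, path: url.pathname, body });
      return new Promise<Response>((resolve) => pending.set(id, resolve));
    })()
  );
});

self.addEventListener('message', (event: ExtendableMessageEvent) => {
  // Step 4: the main app posts the handler's result back; resolving the
  // parked promise completes the worker's blocking request.
  const { id, result } = event.data;
  pending.get(id)?.(new Response(JSON.stringify(result)));
  pending.delete(id);
});
```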
I think normal postMessage makes more sense, because e.g. if you call input, you only want to read from things in the same window the kernel was launched from. Instead of channels, the messages could just have a URL included, so that they can be routed to server or client plugins by JupyterLiteServer the same way other messages are routed (I guess messages start off at the front end, because presumably that is what has access to the window object). I think that makes more sense than adding another form of communication to the whole thing. |
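A sketch of that URL-based routing on the main-app side; the route table and message shape are illustrative, not JupyterLiteServer's actual API:

```ts
// Sketch (main app): route messages from the service worker to handlers
// by URL, mirroring how other requests are routed. The route table and
// message shape are illustrative, not JupyterLiteServer's actual API.
type MessageHandler = (body: unknown) => Promise<unknown>;

const routes = new Map<string, MessageHandler>();
routes.set('/sw-api/stdin', async () => prompt('input:'));

navigator.serviceWorker.addEventListener('message', async (event) => {
  const { id, path, body } = event.data;
  const handler = routes.get(path);
  const result = handler ? await handler(body) : null;
  // Send the result back so the service worker can resolve the request.
  navigator.serviceWorker.controller?.postMessage({ id, result });
});
```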
This PR adds a service worker to the lab folder. It will intercept all requests to the server and store the results in a cache. The next time you load the website, the content will be loaded from the cache instead of from the server (while updating the cache in the background). If there is no internet access, the /lab address should still work.

Note that this might make it harder to debug, because it will serve stale content by default. When debugging the application (e.g. in dev_mode on localhost) you should enable "Bypass for network" in the Application panel of the browser dev tools.
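The strategy described above is essentially stale-while-revalidate; a minimal sketch, where the cache name is a placeholder:

```ts
// Sketch (service worker): stale-while-revalidate, as described above.
// Serve cached content immediately and refresh the cache in the background.
declare const self: ServiceWorkerGlobalScope;

const CACHE = 'jupyterlite-offline'; // placeholder cache name

self.addEventListener('fetch', (event: FetchEvent) => {
  if (event.request.method !== 'GET') return; // only cache GET requests

  event.respondWith(
    (async () => {
      const cache = await caches.open(CACHE);
      const cached = await cache.match(event.request);

      // Kick off a background refresh; swallow failures when offline.
      const refresh = fetch(event.request)
        .then((response) => {
          cache.put(event.request, response.clone());
          return response;
        })
        .catch(() => undefined);

      // Serve stale content immediately if we have it, else wait for the
      // network; fail with 503 only if both are unavailable.
      return cached ?? (await refresh) ?? new Response('offline', { status: 503 });
    })()
  );
});
```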
Maybe we could disable the service worker for localhost altogether.
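A sketch of that opt-out at registration time; the service-worker.js path is a placeholder, not this PR's file name:

```ts
// Sketch (main app): skip service worker registration during local
// development so debugging always hits the live server.
const isLocalhost = ['localhost', '127.0.0.1'].includes(location.hostname);

if (!isLocalhost && 'serviceWorker' in navigator) {
  // 'service-worker.js' is a placeholder path, not this PR's file name.
  navigator.serviceWorker.register('service-worker.js');
}
```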