-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Synchronous instantiation #1976
Comments
Hmmm, I wonder if the 4 KB size limit applies to web workers and service workers. If not, then we could have a new |
Urgh the restrictions of the web sometimes baffle me :( Is it for certain that some functions can only be called when a script is originally loaded? Is there workarounds that JS has for that perhaps? (this seems like more of a wasm thing rather than a rust/wasm thing almost) |
@alexcrichton I can't actually find any information on it. But if the code is loaded asynchronously then I think it's possible for it to miss the On the other hand, I assume that in the future service workers will support ES6 modules (and therefore will work with async). So this problem might fix itself in the future. I just now tried compiling with the Webpack |
Okay, I actually went and read the spec. Two important things:
So, the conclusion is: this problem will be eventually fixed by the browsers, but right now it is necessary for |
Hmm, I just realized that the So we have these three options:
1 is the best long-term option (and it has a workaround which works today). 2 is reasonable, but it ties us even more strongly to Webpack. 3 is the worst in my opinion, because it is now possible to load workers with ES6 modules, so I don't think we should be spending too much effort on supporting classic-script style workers. So I think we should go with option 1 (do nothing and let the user workaround it by setting the event listeners in JS). And after top-level import init from "./foo.js";
await init(); |
@Pauan and @alexcrichton, thank you for the thorough responses. Regarding the workaround:
The browser's default fetch handling is sometimes what a service worker explicitly wants. It might be possible to use waitUntil() instead of respondWith(). But that will still pause the network communication in ways that may be undesirable. It may be possible to work around that by careful use of the install and activate events. I suspect that this is not trivial. That said, there is some precedent for the workaround: This is exactly how Cloudflare handles rust in the example and boilerplate code of their "Cloudflare workers". A product that is heavily based on the Service Workers API. Regarding the "perhaps we can wait for browser support" option: So, as long as there is one tablet browser, in addition to cloudflare, that can run our code, my requirements are met. Today, cloudflare does not support esm-integration or top-level Regarding webpack: Regarding the XMLHttpRequest call: Regarding loading workers with ES6 modules: How to proceed? |
@moshevds As far as I know, the browser's default However, as an alternative, you mentioned import init, { install, fetch } from "./foo.js";
const wasm = init();
self.addEventListener('install', (event) => {
event.waitUntil(wasm.then(() => {
return install(event);
}));
});
self.addEventListener('fetch', (event) => {
event.waitUntil(wasm.then(() => {
return fetch(event);
}));
}); Now the Rust As for "network pauses", that won't happen, because it won't instantiate the service worker until the Given that, you might be able to safely remove the To be clear, the "pausing" is only on the Promise microtask queue, it does not pause the macrotask queue, so there is no extra latency caused by it (assuming the As for maintenance, you only need to maintain a tiny With regard to "waiting for browser support", that's only for the "define everything in Rust" approach. With the workaround of defining the event listeners in JS it works today, no need to wait.
That's not the case. The biggest problem is that we have to maintain the flag for a potentially long time, and the flag would be obsolete very quickly, since it would be for classic-style workers (not ES6 modules). Whereas the workaround I have posted requires almost no effort from anybody (including you), and is forwards compatible with the "correct" solution (top-level |
The pause that I meant was the one where there is no install handler at all. In that case, all requests will pause until the wasm instance is available. I agree that this can be prevented with an install handler, and I agree that microtask pausing is not an important concern. I disagree, however, that it is a trivial glue in the way that you propose.
Again, I agree that the workaround is totally doable, just not that there are no real considerations when implementing it. I'm also not saying that it would be a large burden. But for me, it would be a worse option compared to internally patching wasm-bindgen (what I currently did). In turn, this is worse than having it work out of the box. If a consensus is reached that supporting synchronous instantiation is not worth the effort, is supporting a "bare imports object" target something that could be considered instead? |
@moshevds That sounds like a bug in Cloudflare, the In any case, even without the You seem to have two contradictory requirements:
So I'm rather confused: with synchronous loading it will also pause until the
I'm still not sure how maintaining a wasm-bindgen fork with this code... var request = new XMLHttpRequest();
request.open("GET", url, false);
request.responseType = "arraybuffer";
request.send(null);
if (request.status === 200) {
var module = new WebAssembly.Module(request.response);
var instance = new WebAssembly.Instance(module, imports);
// This needs to be updated when wasm-bindgen changes
wasm = instance.exports;
init.__wbindgen_wasm_module = module;
wasm.__wbindgen_start();
} else {
throw new Error("Network request failed with status: " + request.status);
} importScripts("./path/to/wasm.js");
wasm_bindgen("./path/to/wasm_bg.wasm"); ...is less burdensome compared to maintaining code like this: // This could even be moved into a utility .js file so it doesn't clutter up your code
function events(...names) {
importScripts("./path/to/wasm.js");
const wasm = wasm_bindgen("./path/to/wasm_bg.wasm");
names.forEach((name) => {
const f = wasm_bindgen[name];
// Add in whatever feature detection code you need here
self.addEventListener(name, (event) => {
event.waitUntil(wasm.then(() => f(event)));
});
});
}
// This is the only part you actually need to change, the rest stays the same
events("fetch", "push", "message"); The maintenance burden is literally just 1 line of code (adding/removing names to the But I don't really understand your requirements, so I'm probably missing something. Also, to be clear, your patch doesn't work, because it's still loading the
I think that's reasonable, yeah. In fact, we could just expose the |
Hi @Pauan, I'm sorry about the confusion. As I understand it, default fetch behaviour is inhibited the moment the first evaluation ends with an event listener added. In the synchronous case, this happens much later (after the wasm is fully instantiated) and therefore is not a problem. My patch really does work. The wasm module is provided as an argument to init(). Within Cloudflare, the module is constructed by Cloudflare and it can be passed that way. Within browsers, I can do this either with your XMLHttpRequest code or (the way I did it while testing:) embedded as a blob within the javascript file. The reason I find all of that less burdensome to maintain is that registering the events in javascript has what can be called a much larger interface surface area. Especially in the examples that I mentioned, those few lines of code interact with details of the service worker API + Cloudflare's rendition of that API + wasm-bindgen + whatever business logic I would have that requires the event handlers. I'm not saying those interacting interfaces go away. But with Rust and wasm-bindgen, we have awsome tools that help with some of these issues. And I would prefer not giving those up, when I don't have to. |
It occurred to me that normal workers have this same issue. The only way a browser guarantees that no messages between worker and mainthread are lost, is:
I am now wondering: Would it make sense for wasm-bindgen to have a target that is equivalent to that, but works today? |
Ahh, I see, that is a good point, I hadn't considered that. However, after running some tests, it seems that synchronous loading also blocks fetch requests. The reason this happens is: after the service worker is registered, all fetch requests will be routed to it. So the browser will wait for the service worker to load before handling any fetch requests. So there really is basically no difference between synchronous loading and asynchronous loading, as far as I can tell. Here is the code I'm using to test this.
It only works if the user handles all loading themself. Calling
That's an even bigger change, which also bloats up the file, since base64 encoding increases the file size by ~37%.
But you have to deal with all of that no matter what. So you're either dealing with it in your Rust code (which is actually clunkier than JS for defining event listeners and feature detection), or you're dealing with it in JS. I fully understand the desire to do everything in Rust (I create SPAs and Chrome/Firefox extensions in pure Rust), but I think in this case the amount of JS code is quite small, easy to maintain, and cleanly separated from Rust, because all of the logic is still defined in Rust.
Yes that will work.
Webpack has support for esm-integration, so you can use that. It's not really possible to implement it in wasm-bindgen, since it requires rewriting the After top-level |
What browser did you test this with? I am seeing the effect that I was expecting in Firefox and Chrome: N.B.: The timed fetches (on the first load) are not handled by the service worker if the service worker does not claim them, and this example does not claim.
That's right. I am not proposing that patch as a pull request, but as a proof of concept that synchronous instantiation works and is useful.
More like 3% for real world (and therefore compressed) scenario's.
This is reasonable to make as a personal consideration in relation to a specific use-case, imho. But I do not agree that it is true generally, and I do not agree that it is true for my use-case. I have been reading some of the discussions regarding the interplay of async loading and service workers. Many of the assumptions we had together, turn out to be wrong. See for example the detailed discussion about how to disallow top-level Honestly, information like this makes me much more hesitant to just wing it using some javascript code. It is easy to make incorrect assumptions like this. If I do this in Rust, I'll be able to analyse it a bit easier, and make clear test-cases for any caveats that I find.
Some people in the discussion that I linked raise doubts about this, unfortunately. |
It works normally because the service worker hasn't been loaded yet (note how it doesn't log "The Service worker will now control pages, but only those opened after the register() is successful. i.e. a document starts life with or without a Service worker and maintains that for its lifetime. So documents will have to be reloaded to actually be controlled." The way to test it is:
You really do have to close the tab and re-open it, merely refreshing isn't enough.
You seem to be misunderstanding how service workers work. I don't blame you, I also had similar misunderstandings until I looked into it more deeply. You seem to think that the browser works like this:
That is not how service workers operate. Instead, they operate like this:
Similarly, when you upgrade the service worker, the current pages will continue to use the old service worker. It is only after they are closed and re-opened that they will start using the new service worker. Or to put it another way, the lifetime of the service worker is intrinsically tied to the lifetime of the page. The browser will never "hot-swap" service workers into existing pages. Therefore, the browser always knows ahead of time whether the service worker "claims"
That's very interesting, I hadn't seen that. I'm really perplexed at their decision. It essentially seems to be, "well, somebody might write some slow And in any case, trying to stop bad developers from writing bad code doesn't work. Bad (or malicious) developers will just find other things to exploit/abuse. I can fully understand putting in some sort of quota, like "any service worker that takes longer than 30 seconds is killed". But that's not what they're doing, they're just flat-out banning all Judging from that thread, it also looks like they want to ban sync XMLHttpRequest in service workers, and since they seem to be trying to ban all long-running synchronous things, they'll probably end up banning synchronous So then your only option would be to write the event listeners in JS. This just makes me even more convinced that wasm-bindgen shouldn't have a |
A quick reading of your explanation is indeed how I understood it to be the case: That sync loading will block a fetch when the service worker is in the activated life-cycle stage, but not loaded. This is one of the situations that I was referring in an earlier post. And that is indeed a situation that has to be taken into account. But I will also have to take into account that I will claim the currently loaded page (using I am not particularly worried about loading times. But I am worried about situations where the application gets stuck loading a network resource that is required for offline use. Such as what your example can get into. If it would be the case that I am confused about how this works, that would be an even more clear reason that I should not trust myself to do this in Javascript. Don't you agree with that? Regarding the banning of synchronous WebAssembly: |
Oh, interesting, I didn't know that // This can be used to fetch + cache *any* file
async function load(url) {
const cache = await caches.open("offline");
try {
const response = await fetch(url);
if (response.ok) {
await cache.put(url, response);
return response;
} else {
throw new Error("Invalid status code: " + response.status);
}
} catch (e) {
const response = await cache.match(url);
if (response) {
return response;
} else {
throw e;
}
}
}
async function loadWasm(url) {
const response = await load(url);
const bytes = await response.arrayBuffer();
await wasm_bindgen(bytes);
} The above code only needs to be written once, then stuffed into some library or utility Now you can use it like this: const wasm = loadWasm("./path/to/wasm_bg.wasm");
The behavior of service workers is the same in both Rust and JS. The APIs are the same. The types are the same. web-sys is only a very thin layer on top of JS, so you will need an understanding of JS in order to do things in Rust. I agree that Rust is a much nicer language than JavaScript, but the more that I look at this the more it seems that a small amount of JS glue is necessary. If you're not comfortable with writing JS code, then you might want to look into using a JS library which is designed to make service workers easier. That way it's somebody else writing the code (and hopefully doing it correctly).
Synchronous XMLHttpRequest was available for decades, but years ago it was banned in the main thread, and they are now considering banning it in service workers. And synchronous WebAssembly is already banned on the main thread, and is heavily discouraged. They have a long history of banning synchronous things for performance reasons (not security). And given how they banned top-level |
On a technical level, I really object to do part of my caching logic in Javascript. I have outlined a system to do our caching and prefetching, and I want to build a prototype in Rust. Imho, it would be really bad to have a second caching layer that works differently and is written in Javascript. I think we need to reconsider the core question: I totally agree that all of this can be done in Javascript. In fact, everything that can be done with wasm-bindgen, can be done with Javascript. But when should we consider a workaround to be trivial? My answer:
This whole discussion comes across as if initially I only had a small problem because I couldn't do the API call that I wanted using Rust, and a solution was easily found. But now wasm-bindgen is saying "Just use Javascript for your project, its easy enough". I really just want to discuss whatever you think should be done in my case. I am assuming that "don't use wasm-bindgen" isn't the final answer if we carefully consider all the options. Right? PS: You are correct that freezing the browser window is not a security issue. The 4K limit is, I assume, also to prevent freezing the browser window. Workers don't freeze the browser window and do not have that limit. I would consider it really weird if they remove a feature without a similarly good reason. But indeed, not all good reasons to remove a feature are security related 🙂. |
That is not the case. All of the JS code I have discussed has been solely about 1) loading the I agree that unfortunately the "loading the So I have a question for you: assuming a hypothetical ideal situation where you can load the Also, how would you expect wasm-bindgen to help? We're certainly not going to be adding in ad-hoc caching into wasm-bindgen, that's clearly the responsibility of the end user, since different users will have different caching needs and use cases.
I've already pretty clearly described the workaround for this problem. It's a few dozen lines of JS code which basically never needs to change, you just stash it into a file and call it a day. I don't think you're going to get much better than that, because of the strict limitations of service workers (which we can't do anything about). The only other option is for wasm-bindgen to expose the To be clear, this is not wasm-bindgen's fault, it is caused by restrictions in service workers. And they have made it crystal clear that they have no desire to support
They have already removed top-level Whether you think that's reasonable or not, you should tell them, not me. I don't make those decisions. |
I don't think this discussion is bringing any of us closer to a solution. It might still be worthwhile to discuss more technical details about my project, but it seems a bit strange that we are almost discussing what kind of code I would accept into our codebase, instead of what kind of code you would except into wasm-bindgen. If "no trivial workaround possible" is the bar to consider possible solutions. Then discussing what a trivial workaround entails seems more sensible to me. Anyway:
The browser handles it. The Service Worker spec calls this the script resource map. Basicly, the browser will retain a copy of the service worker script and its importScripts, separate from its normal cache handling.
Any solution that doesn't impose
This is exactly one of my arguments for why this kind of solution isn't a trivial workaround at all 🙃.
This is a serious response: wasm-to-js transpiler or polyfill. (The performance of this will probably still be better than hand-written JS. But performance is not my concern at all, and not the reason why I want to use Rust here.) Also: I think the chances of such a change happening is exceedingly unlikely anyway.
I actually don't agree with you that those restrictions are ill-considered. If by "it is not wasm-bindgen's fault" you mean that it is outside of what wasm-bindgen aims to solve because the browsers are doing something wrong here: I also don't think that is the case. |
Okay, let me clarify: how would you handle it in the situation where wasm-bindgen synchronously loads the
Of course you're free to use a wasm-to-js transpiler on your own project (we even have a tutorial for it). But a wasm-to-js transpiler won't work for new wasm-only features (which we plan to support), so we're unlikely to add that into wasm-bindgen, so you'll have to use an external tool for that. And I'm not sure what you mean by "polyfill", what would it be polyfilling?
No, that is not what I mean. What I mean is that it is basically impossible for wasm-bindgen to solve this. Let me recap the situation:
So, if you want support for this in wasm-bindgen, you should talk to the W3C (probably in the thread you linked), explaining your use case and why you want to use You can also ask them if they plan to support sync Note that none of this is within wasm-bindgen's control, we are at the mercy of the browsers. If they choose to ban sync You seem to think that this is a matter of wasm-bindgen being stubborn, but that is not the case. It is the W3C's decision whether to ban sync |
Thank you. I don't think you are stubborn necessarily, but I do think that you are misunderstanding some of what I am saying. Because you keep assuming requirements for my use-case that I don't see as requirements at all. I hope we can still achieve a shared understanding here.
Sure. But I don't read a decision to ban WebAssembly in those discussions. I would agree that more communication between people working on WebAssembly specs and Service Worker specs might be helpful, but that is about it.
This might be related, but it is not at all what I am asking. Right now, the async This means that manually providing a If you think this is something that needs to be fixed in the async case, then I would propose whatever solution that you come up for async to work analogously for sync. But, just like you, I see no clear path for doing so. (One possible option could be to allow the caller to specify a fetch-compatible function that takes into account whatever caching the user wants to do. The analogous sync solution to that would be to allow it to return any thennable, not just promises.)
The WebAssembly polyfill that everybody says is a good idea, but doesn't actually exist today.
We have agreement here.
Even if it were allowed, registering probably needs to happen before the first await. IMHO, this is a sensible requirement for Service Workers.
This is indeed unfortunate, and something that the working groups should perhaps work on.
Or have the
I can't even do that now because it wouldn't work in offline mode.
And again, I am not asking for synchronous loading. That won't work just like asynchronous loading won't work either. What I am asking for is synchronous instantiation.
There are other options, but the same is true for any asynchronous solution. There are a bit more ways to store data locally with async API's, but the problem remains the same.
I have not seen an indication that this is the case. I actually take the 4K limit and the fact that it is only applied specifically to the main thread, as an indication that support for synchronous instantiation is unlikely to go away from Workers. Anyway: If this is a serious concern that you have, what kind of assurances would you look for to consider this worthwhile to support? From the working groups? From browser creators? I am willing to go and ask for some kind of assurance if there is a specific roadblock here.
Offline mode is quite an important aspect of my requirements, and that won't work the way you propose here. This is unrelated to the sync/async discussion.
I'm not sure if that issue is the correct place, but contacting some of those people seems like a sensible option. I would like to first see if you and me can get on the same page about this. It seems that should be possible 🙂. |
The only assumption that I made in my previous post is that you want to be able to define the event listeners in Rust. You have stated that requirement multiple times. Everything else is just hard facts based on that single requirement. If that is not a requirement, then I have already provided excellent solutions, here and here.
This is a web issue, the W3C spec is a super-set of other specs, so they can make these decisions without any changes to the WebAssembly spec, no communication is needed.
Ah, I see, well that's just wasm2js (which exists).
No, offline mode works just fine with the caching code I posted before. The caching happens automatically and transparently.
Nothing is magic. I covered that option in point 4 (embedding as base64), but that still requires sync
No, that is the only synchronous solution. No other synchronous solution works (because sync XHR is planned to be banned, and wasm2js won't work for new wasm features). And since the event listeners must be defined synchronously, therefore it is the only solution that fulfills your requirement of defining the event listeners in Rust.
It's heavily implied from their decision to ban sync XHR and top-level
But they restricted sync XHR in service workers, even though service workers run off of the main thread. It's clear that they are treating service workers differently from regular workers, so you can't assume that "because it works in Workers it'll work in Service Workers". If what you are saying was true, then they would have no reason to ban top-level In any case, this is all speculation, the best course of action is to ask them.
What matters is the W3C, since they create the spec that the browsers follow. Also, all the browsers have representatives in the W3C, so asking the W3C is basically the same as asking the browsers themself. |
This is correct.
I'm not sure what you are referring to. If I understand you correctly, you had 2 core points:
The first is more of an opinion that I happen to disagree with. The second is a prediction that I happen to think is unlikely. (You also now indicate that this one is speculation on both our ends, and I fully agree with that notion.) Both are fine, we don't have to agree about those points if we can find a way forward towards discussing what solutions there are that do meet "my requirement" (or my use-case more generally). I'll ask that somewhere at the W3C, and report back. |
My point has been that you might not have a choice, due to browser restrictions, so your requirements might have to change. We can't do anything about that since we don't make decisions for the W3C.
I have never said or implied that it was a fact, I always made it clear that it was my prediction. But it's a prediction that is based on my deep understanding of both the browsers and how the W3C operates, it isn't just a blind guess.
I went ahead and asked already: w3c/ServiceWorker#1499 As I predicted, they are currently leaning toward blocking it, though we'll see how things pan out. |
Great! I think we are on the same page about those particular points. I have added my questions at the issue you opened. I think we can continue here when/if we get a clear picture about how the Service Worker people think this should be implemented. Thanks. |
Tracing Cloudflare Worker + wasm_bindgen too...here's a holy long story with a const { hello } = wasm_bindgen;
const instance = wasm_bindgen(hello);
instance.then(() => hello()) so synchronous in cfworker is no longer a requirement now.( Pauan +1 point :) |
Motivation
I have been looking into a way to have Rust code initialize my service worker. This would include the
addEventListener('fetch', ...);
call, but this can only be done during the worker script’s initial evaluation.WebAssembly.instantiate(...)
andWebAssembly.instantiateStreaming(..)
both span beyond that time window, sonew WebAssembly.Instance(...)
is required.Proposed Solution
Generated code for the web and no-modules targets should include a way to instantiate synchronously. I have a small change that works: https://github.com/moshevds/wasm-bindgen/commit/1ba370edf16a2f3dd47c25889a07430023eb2a10
But I assume that changing the order of arguments is not a good idea, so I'd like to discuss what options there are.
Alternatives
The text was updated successfully, but these errors were encountered: