# Modules make passing data back and forth much harder than functions #21
I think that with the current proposal it would be possible to write something like this:

```js
const result = await worker({ endpoint }, module {
  const { endpoint } = self.workerInput;
  const res = await fetch(endpoint);
  const json = await res.json();
  export default json[2].firstName;
});
```

However, it's still less ideal than automatic captures, and it requires a small library to wrap the `worker` implementation:

```js
function worker(data, module) {
  const w = new Worker(module {
    let mod;
    self.workerInput = await new Promise(resolve => {
      self.onmessage = ({ data: { data, module } }) => (mod = module, resolve(data));
    });
    export function then() {
      return import(mod);
    }
  }, { type: "module" });
  w.postMessage({ data, module });
  return w;
}
```

---
Nice, that's not so bad at all, syntactically. However, the fact that it requires one […]

---
I think there are multiple ergonomic problems around workers, and Module Blocks addresses some of them. It focuses mostly on the ability to share code between realms, which should hopefully pave the way to scheduler-like libraries and APIs. For example, I thought about using Module Blocks something like this to build an OMT task runner:

```js
const taskRunnerWorker = new Worker(module {
  onmessage = async ({data}) => {
    const {taskModule, parameters} = data;
    const {default: task} = await import(taskModule);
    postMessage(await task(...parameters));
  };
});

taskRunnerWorker.postMessage({
  taskModule: module {
    export default async function(a, b) {
      return (await Promise.resolve(a)) + b;
    }
  },
  parameters: [40, 2]
});

taskRunnerWorker.onmessage = ({data}) => { /* ... */ };
```

(I do realize this is not that different from what @nicolo-ribaudo wrote, but I started writing this before he posted so I figured I’d post it anyway for completeness’ sake.) You could augment this pattern to re-use the worker for multiple tasks, but then you’d have to keep track of in-flight tasks with ids. That’s one of the ergonomic issues that Module Blocks explicitly does not solve (but is fairly easy to solve in userland).

---
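The id-tracking bookkeeping mentioned above (re-using one worker for many in-flight tasks) can be sketched in userland. This is a hypothetical illustration, not code from the proposal: `FakeWorker` stands in for a real `Worker` so the pattern runs anywhere, and `createTaskRunner` is an invented name.

```javascript
// Hypothetical sketch: track in-flight tasks by id so one worker can serve
// many concurrent callers without mixing up responses.
// FakeWorker stands in for a real Worker here.
class FakeWorker {
  postMessage({ id, parameters }) {
    // Echo a result back asynchronously, as a real task-runner worker would.
    setTimeout(() => this.onmessage({ data: { id, result: parameters[0] + parameters[1] } }), 0);
  }
}

function createTaskRunner(worker) {
  let nextId = 0;
  const pending = new Map(); // id -> resolve callback
  worker.onmessage = ({ data: { id, result } }) => {
    pending.get(id)(result);
    pending.delete(id);
  };
  return (...parameters) =>
    new Promise(resolve => {
      const id = nextId++;
      pending.set(id, resolve);
      worker.postMessage({ id, parameters });
    });
}
```

With this wrapper, multiple calls can be in flight against the same worker at once, and each caller still receives its own result.

---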
It's the "hopefully" I'm worried about here. By introducing new syntax, you largely close off other explorations in that direction, unless they are direct extensions of the proposed syntax. So I'd like to see a more concrete plan to make sure this feature is usable, before moving it too far forward. IMO a feature which requires a […]

---

Could you show how that would look, solved in userland? In particular, if it's too heavyweight, then I worry that module blocks is not a good direction, compared to the other three alternatives discussed in the OP.

---
This is a mix of my example and Surma's, that allows re-using the same worker for multiple module blocks, adding support for tracking tasks with ids:

```js
import { Runner } from "./runner.js";

const runner = new Runner();

runner.run([1, 2], module {
  export default function(a, b) {
    return a + b;
  }
});
```

Implementation of […]

---
Nice! Yes, that indicates to me that something like #6, or some other mechanism for making it easy to pass per-module data without the extra ceremony of […]. Ideally that something would be shorter than […]

---

But […]

---
It seems like what we are coming around to on this thread is: improving ergonomics for transferring data to be sent to the worker can be done separately, in a layer above (e.g. the Web platform) this feature, which adds the underlying syntax for the code itself.

---

I would not characterize my concern that way. I would instead say that providing an ergonomic way of sending per-module (instead of per-global) data needs to be done as part of this proposal; otherwise this proposal is harmful, as it encourages patterns like worker-per-block instead of worker pools.

---
It seems to me that this concern unavoidably spans multiple standards venues: new syntax needs to be added in TC39, but new ways of running worker pools or passing values around between workers need to be a host API. I'd be in favor of this ergonomic improvement on the Web side, but I don't see how they can be part of the same proposal. I also think the Web side here is a bit riskier--there are many more decision points, and it's easier to get wrong, whereas there are really only a couple of superficial things to decide about module blocks. For these reasons, I don't think we should consider web ergonomics a dependency for module blocks.

---

Ok. I guess we just fundamentally disagree then. I'll channel my concerns to my TC39 rep.

---
I'm confused by the disagreement in the last few comments, specifically:

The expressivity is definitely there already, as demonstrated nicely by @nicolo-ribaudo. Is the harm point that without syntax affordances on top (as part of this proposal), developers will reach for a pattern that spawns too many workers, which is too heavyweight? Wouldn't the heavyweight-ness be readily apparent when the program runs? That is, it doesn't seem like something that'll get ingrained in web apps. The opposite, actually, in that it'll be quickly discovered to be the wrong thing. While it's not great that it could be easy to reach for, non-performant constructs that are easy to reach for abound, and I expect features that have to do with application architecture, like Workers, would usually be thought through.

---
Yes. The proposal in fact tries to make it very easy to spawn too many workers, e.g. with the blob URL integration and the proposed integration with the `Worker` constructor.

---

Usually developer machines are significantly more powerful than end-user machines; we've seen this problem before, and we try to avoid designing new APIs with such footguns. (In particular, I don't really agree with "non-performant constructs that are easy to reach for abound" with regards to recently-shipped web APIs.) We also have data (I'll try to get it published) that show this problem in action for workers specifically, where attempting to convert an application to use more workers causes bad performance overhead on non-developer devices (e.g., phones). This is part of what is behind the Chrome team's current investment in main-thread scheduling APIs, instead of worker-focused APIs.

---
Another important aspect of this discussion is the comparison with techniques like greenlet, which do use a worker pool under the hood: […]

Given this comparison, the proposal seems like it would have negative impact on the ecosystem, as people try to use it like they use greenlet, and regress their application's performance.

---
I was checking greenlet's source code, and I don't think it's true that this proposal encourages spawning multiple workers more than greenlet does. Greenlet even has an ":bangbang: Important" warning in their readme because it's really easy to spawn multiple workers rather than always re-using the same one. A greenlet-like API would also work for this proposal (this example is adapted from greenlet's readme):

```js
import greenlet from 'greenlet'

let getName = greenlet(module {
  export default async username => {
    let url = `https://api.github.com/users/${username}`
    let res = await fetch(url)
    let profile = await res.json()
    return profile.name
  }
})

console.log(await getName('developit'))
```

I cannot think of a more concise version of it that "accidentally" spawns more than one worker. (Obviously […].)

However, this proposal will never be more concise than […]

---
It sounds to me like there are two separate issues: the difficulty closing over data, and the ease of constructing module-based workers which run a module block, which could act as a "performance foot-gun". Really, you want to reuse Workers, rather than spawn new ones all the time. @surma has been careful to reuse Workers, rather than spawn new ones, in all example code in this repo. Code to form a greenlet equivalent could be factored into a shared module. I guess the real issue is that […]

---
If the Web Platform wants to strongly discourage the use of the […], the platform would still permit module blocks to be created and used in other mechanisms which are based on […].

(This integration decision might be unfortunate, as @surma had noted that he was hoping that removing the separate-file requirement could make things easier for bundlers. But regardless, some bundlers manage to make it work today, and others could apply similar techniques.)

What do you think of this mitigation, @domenic?

---
Specifically, not integrating with the […]. Together those would address the second issue, "the ease of constructing module-based workers which run a module block".

The first issue ("the difficulty closing over data") still remains relevant, however, and is somewhat related to the second one. Namely: right now it is easy to create a greenlet-equivalent that uses closures to transport code into a worker pool. (I apologize for not realizing that greenlet fails to do this.) The resulting code is short and concise for users, and performant. This proposal, without a mechanism for closing over data, requires a tradeoff:

I'm scared about introducing a feature where the elegant, easy path that uses the feature most directly causes non-performant code, whereas you have to contort and nest things to get the performant path. (Or forgo using the feature at all, and just use closures transported via toString().) That indicates to me the feature is misdesigned, as ideally it should be the opposite. (And, this is why blöcks is function-centric, instead of module-centric.)

---
@domenic I'm having trouble understanding what you're referring to with "a greenlet-equivalent that uses closures to transport code". Could you say more about what it's possible to do with function literals and […]?

---
Sure. With closures it is possible to write the following and get worker pool behavior:

```js
for (let i = 0; i < 1000; ++i) {
  const result = await betterGreenlet([`https://example.com/${i}.json`], async endpoint => {
    const res = await fetch(endpoint);
    const json = await res.json();
    return json[2].firstName;
  });
}
```

Here, […]

With module blocks, the "easy path" mechanism that is the closest equivalent to this, syntactically, looks like the following:

```js
for (let i = 0; i < 1000; ++i) {
  const result = await betterGreenlet([`https://example.com/${i}.json`], module {
    const res = await fetch(workerInput.endpoint);
    const json = await res.json();
    export default json[2].firstName;
  });
}
```

This code is bad though, as it uses worker-per-module. To enable a worker pool model, you need to switch to:

```js
for (let i = 0; i < 1000; ++i) {
  const result = await betterGreenlet([`https://example.com/${i}.json`], module {
    export default async endpoint => {
      const res = await fetch(endpoint);
      const json = await res.json();
      return json[2].firstName;
    };
  });
}
```

i.e., module blocks make the bad thing ergonomic, and the good thing unergonomic. Does that help?

---
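For readers unfamiliar with the `toString()` transport referenced in this exchange, here is a minimal hypothetical sketch of the mechanism (not greenlet's actual implementation; in a real worker the revival would happen on the other side of `postMessage`, and indirect `eval` stands in for that boundary here):

```javascript
// Hypothetical sketch of function transport via toString(): serialize the
// function's source text, rebuild it on the receiving side. Note that
// closures do NOT survive this trip; only the source text does, which is
// why greenlet-style functions must take all inputs as arguments.
function serialize(fn) {
  return fn.toString();
}

function revive(source) {
  // In a real worker this source would arrive via postMessage; indirect
  // eval stands in for that boundary in this sketch.
  return (0, eval)("(" + source + ")");
}
```

---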
Thanks for explaining. It never occurred to me that anyone would go for that pattern of using a module block to run code on import, which I agree is a bad idea for lots of reasons. It sounds like you are arguing for #6. How do you feel about this proposal if we dropped Worker constructor and blob URL integration and included #6 in the initial version?

---

That would make me pretty happy, especially the arrow-function version. It still feels a bit backwards that modules are the default, instead of functions, given that I think passing input (arguments) into these code blocks will be very common. But that concern is more in the theoretical-purity camp than the this-is-potentially-dangerous camp. And I can understand the arguments for modules as a better boundary for "separate bunch of code" than a special type of function, that justify the proposal being module-based.

---
I don’t think the conclusion here needs to be to not support module blocks in the worker constructor. After all, that is one of the most frequent complaints I hear about workers on the web currently, and I’d like to solve that. I definitely hear your concerns @domenic about people creating way too many workers, but at the same time I think we don’t need to prevent that at the language level. In my head, module blocks are a building block (eyooo) for these OMT scheduler libraries, not something that will provide all the nice-to-have ergonomics itself.

Looking at the sample that you wrote, I think you made a small mistake. You are using a top-level `return` inside the module block:

```js
for (let i = 0; i < 1000; ++i) {
  const result = await betterGreenlet([`https://example.com/${i}.json`], module {
    const res = await fetch(workerInput.endpoint);
    const json = await res.json();
    return json[2].firstName;
  });
}
```

However, if we replace the return with:

```diff
- return json[2].firstName;
+ postMessage(json[2].firstName);
```

it works with an implementation like the one below.

**Half-assed Better Greenlet implementation:**

```js
const workerModule = module {
  addEventListener("message", async ev => {
    self.workerInput.args = ev.data.args;
    await import(ev.data.module);
  });
};

// FIXME: Spawn these workers lazily
const workers = Array.from(
  {length: navigator.hardwareConcurrency},
  () => new Worker(workerModule, {type: "module"})
);

export function betterGreenlet(args, module) {
  return new Promise(resolve => {
    // FIXME: Will break when there are more jobs than workers. Can
    // be solved using Streams or something.
    const worker = workers.shift();
    worker.postMessage({args, module});
    worker.addEventListener("message", ev => {
      resolve(ev.data);
      workers.push(worker);
    }, {once: true});
  });
}
```

Does that make sense to you?

---
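The "more jobs than workers" FIXME above can be addressed without Streams. Here is a hypothetical sketch using a plain FIFO queue; `workers` is an array of async job-handlers standing in for real pooled Workers, and all names are invented for illustration:

```javascript
// Hypothetical sketch: queue jobs when every worker is busy, and hand a
// freed worker straight to the next waiting job instead of idling it.
function createQueue(workers) {
  const idle = [...workers];
  const waiting = [];

  async function runOn(worker, job, resolve) {
    resolve(await worker(job));
    if (waiting.length > 0) {
      const next = waiting.shift();
      runOn(worker, next.job, next.resolve);
    } else {
      idle.push(worker);
    }
  }

  return job =>
    new Promise(resolve => {
      if (idle.length > 0) runOn(idle.pop(), job, resolve);
      else waiting.push({ job, resolve });
    });
}
```

---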
(Forgot to add: I definitely see more of a need for a shorthand for single-function modules than before.)

---
To be clear, we'd prevent it at the platform level, not the language level. I.e. this is really a question for the HTML integration; even if TC39 ratifies module blocks, whether and how we modify the web platform to work with them is subject to the HTML feature addition process, and currently I'm pretty negative on making them easily plug in to the Worker constructor. We don't want to make it easier for people to use the […]

---

Good catch; I meant to use `postMessage()`.

---

I don't believe that's true. There's only one `workerInput` global per worker. The other case your snippet didn't solve is how to get the arguments into the worker, in a situation where there are more jobs queued than workers. Again, I don't think that is solvable, because the arguments need to make it in via globals (like `workerInput`).

---
Note that even if the web platform doesn't implement support for […], something similar is still achievable in userland:

```js
function createWorker(module) {
  const worker = new Worker("data:text/javascript,onmessage=({ data })=>import(data)");
  worker.postMessage(module);
  return worker;
}

createWorker(module {
  console.log("Hi!");
});
```

If someone really wants to run a module as a worker they still need some extra ceremony (way less than having a separate file anyway), but the need for this extra function makes sure that "you know what you are doing".

---
As a library author, the ceremony of a separate file is what prevents me from putting Web Workers in reusable packages. I can work around that with Data URIs or Blob URLs, but then I've created a library that doesn't work under a strict CSP. Is there an option here that can still work under a strict CSP?

---
@domenic: Okay, so I think we agree that a single-function module shorthand might be desirable, aiming to make just-spin-up-a-worker-with-a-module not the only syntactically lightweight approach. I have two questions that might or might not represent your opinion accurately. I hope we can clear this up:

1.) Why do you think that worker pooling isn’t possible with the current proposal? With regard to pooling, I have two approaches implemented below. The first one uses a round-robin approach and keeps track of the tasks in flight. This might over-saturate a worker, but not in a way that would break things. It will also pretty much guarantee full CPU utilization, which might be desirable. The second implementation uses one worker per task. Both implementations create the workers lazily (i.e. neither spawns 12 workers up front just because you have 12 cores), and both implementations create at most `navigator.hardwareConcurrency` workers:

```js
import {betterGreenlet} from "./greenlet.js";

let taskCounter = 0;
document.all.btn.onclick = async () => {
  let id = taskCounter++;
  console.log(`Queueing task ${id}...`);
  const result =
    await betterGreenlet(
      [Math.random()],
      module {
        export default function (n) {
          // Example workload that takes 1s to complete.
          return new Promise(resolve => {
            setTimeout(() => resolve(n**2), 1000);
          });
        }
      }
    );
  console.log(`Result of task ${id} is ${result}`);
}
```

Example with tentative single-function module shorthand:

```js
import {betterGreenlet} from "./greenlet.js";

let taskCounter = 0;
document.all.btn.onclick = async () => {
  let id = taskCounter++;
  console.log(`Queueing task ${id}...`);
  const result =
    await betterGreenlet(
      [Math.random()],
      module (n) => new Promise(resolve => {
        setTimeout(() => resolve(n**2), 1000);
      })
    );
  console.log(`Result of task ${id} is ${result}`);
}
```

Round-robin: […]

---
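The round-robin implementation itself sat behind a collapsed section that did not survive extraction. The dispatch logic being described (rotate tasks across a fixed pool, possibly giving a busy worker another task) can be sketched like this; all names are hypothetical, and `workers` here are plain functions standing in for pooled Workers:

```javascript
// Hypothetical sketch of round-robin dispatch: each task goes to the next
// worker in the pool, wrapping around, without waiting for idleness (so a
// worker may hold several tasks at once, as the comment above describes).
function createRoundRobinPool(workers) {
  let next = 0;
  return function dispatch(task) {
    const worker = workers[next];
    next = (next + 1) % workers.length;
    return worker(task);
  };
}
```

---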
It's possible, but it's more ergonomic to not do worker pooling. Pooling requires […]

---

It's a footgun, because it encourages over-using workers. Workers should be used sparingly, in a pooled fashion; making them easier to use is an anti-goal.

---

No. To be clear, I think modifying the […]

---
cc @developit @shubhie @shaseley. It would be great to have your input here.

---
Ah, that’s good context, @domenic. I shall do so :)

---

@domenic I'm curious who "we" refers to here. Are you speaking on behalf of the Blink API OWNERS, or on behalf of multiple browser engines, or the set of HTML editors in WHATWG? (Or was this a "royal we"?)

---

I'm speaking on behalf of my role in the Chromium project, as well as my role as HTML editor.

---
Sorry for jumping in late here, but I wanted to +1 shaseley's take and the concerns domenic raised. A task-worklet-type approach is needed to improve our odds of a "pit of success" with moving appropriate work off the main thread. FWIW I summarized my recommendation here after a multi-quarter deep dive a couple of years ago: […]

---
@spanicker Thanks for the interesting reference, but I'm having trouble understanding the connections you're drawing.

- Are you referring to this comment or something else from @shaseley? From our conversation, I thought he was positive about this proposal, but it sounds like I might have misunderstood.
- A task worklet would be great! Is there active work on a proposal here? I suspect that module blocks would be useful for sending code to a task worklet, for similar reasons as it can help usage of workers, but it would be great to investigate in more detail against something more concrete.
- What is the connection between module blocks and messaging? I do agree that there would be a risk in hiding the cost of messaging. This is one reason why module blocks removes that aspect of blöcks, as noted above. Instead, module blocks is about context-free code.
- If these issues were mitigated by userspace solutions today, this proposal wouldn't be worth it. But I'm not quite convinced: […]

---
I remain bemused by the claim that this is actively harmful for thread pooling. The ability to spin up threads at all is harmful to thread pooling, and there are direct ways to mitigate that which are orthogonal to this proposal (e.g. limiting Worker creation). For comparison, we don't ban […]

---
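The "limiting Worker creation" mitigation floated here could, hypothetically, look like a budgeted factory. This is an invented sketch, not a proposed platform API; `makeWorker` stands in for `new Worker(...)`:

```javascript
// Hypothetical sketch of a guard rail that caps worker creation: the
// wrapped factory throws once the budget is spent.
function limitWorkerCreation(makeWorker, budget) {
  let created = 0;
  return (...args) => {
    if (created >= budget) {
      throw new Error(`worker budget of ${budget} exceeded`);
    }
    created++;
    return makeWorker(...args);
  };
}
```

---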
In that analogy, this proposal is adding syntactic sugar that makes it easier to write […]. That's what's harmful.

---
Right, so I think the expensive operations that we're worried about are creating workers and posting messages. This proposal doesn't really change the syntax for either of those (unlike blöcks or a task worklet proposal), so I'm still having trouble following the argument.

---
Apologies for the confusion. Yes, I meant hiding the cost of messaging while improving ergonomics.

I'm not saying this is not adding value over userspace solutions. I am asking if it's adding adequate value towards moving us to the "pit of success" for web developers, given all the footguns laid out. I'm suggesting this proposal should move in tandem with something like task worklet, so we increase our odds.

---
That's correct; the proposal in itself does not change the syntax for creating workers. It's the separate proposal of changing the […]

---
Perhaps it comes down to the philosophical disagreement on how many guard rails the web platform ought to have, but suppose for the sake of argument I agree that we ought to have guard rails here. Could you and @spanicker also expand on why mitigations like limiting Worker creation are insufficient as guard rails?

I don't know what a "pit of success" is, but I guess it's something of a local maximum of performance, so it's harder to do the non-performant things than the performant things? Still, I'd like to push back on explicit coupling of proposals like that, as a PL and platform designer. That kind of coupling makes more sense, IMO, for frameworks on top. As a PL that sits below all of that, we should be designing composable features that can be used to build many frameworks.

---
Just a question here: does this proposal actually imply a change to the Worker constructor? My understanding from the explainer is that a Module Block is semantically equivalent to a Blob URL ("module blocks behave like module specifiers", and specifiers are strings). The proposal doesn't extend Worker with support for a new feature; it provides a new way of constructing module specifiers, which are already supported via […].

Effectively, this allows the various APIs that already accept module specifier strings to also accept a Block. This doesn't add surface area, since "inline modules" can already be implemented trivially with nearly identical semantics to the proposal:

```js
const module = s => URL.createObjectURL(new Blob([s], {type: 'text/javascript'}));
```

"Inline modules" produced by the above are valid specifier strings, are transferable between realms, and do not provide a way to close over lexically scoped variables. Using the above for the purposes of comparison, here is the net difference in syntax: […]

The capabilities and usage patterns are the same for both of the above:

```js
// load it in this realm:
import(myModule);

// load it in a worker:
new Worker(myModule, { type: 'module' });

// load it in worklets:
CSS.paintWorklet.addModule(myModule);
audioContext.audioWorklet.addModule(myModule);

// hypothetical Tasklet/Task Worklet usage:
Scheduler.addModule(myModule);
```

The primary differences between the two approaches above are that the module block syntax allows for static analysis, and avoids the CSP + opaque origin issues that limit the utility of the Blob URL implementation.

Regarding a higher level system for task orchestration: it's been mentioned a few times, but from the perspective of something like Task Worklet, there doesn't appear to be any reason to create a specialized source representation of a unit of work specific to task execution, if a "task processor" is conceptually a portable function and we already have primitives for expressing that (modules that register or export functions). There is also prior art supporting this assumption: the existing Worklet APIs all provide coordinated threadable task execution, and do so using standard Modules. Paint Worklets are effectively named tasks scheduled and orchestrated by the compositor, audio worklets register (task) processors that are executed by the audio rendering thread, etc.

---
Module specifier strings are not accepted by the worker constructor. URLs (and only URLs) are.

---

If the proposal were updated to read "module blocks behave like Blob URLs", would that address the concern? The similarities also hold when comparing the proposal to Data URLs.

---

I don't think so, no.

---
Editorially speaking, I agree with @domenic that the Worker constructor would need a change to accept anything other than strings as its first argument (which is specified as USVString in WebIDL). As far as building a consistent programming surface for Web developers, I agree with @developit that accepting module blocks would be a reasonable interface, deriving simply from the idea that module blocks work roughly like Blob URLs. Making this change to the HTML specification would be very simple. The editorial fact that this is a change is beside the point we should be focusing on, which is that the proposal is reasonable, helpful, and analogous to existing constructions, as @developit explained.

---
Apologies, I think I've talked to individuals at separate times. From a scheduling APIs perspective, I think something like this would be great if we do something like the task worklet proposal, potentially integrating with […].

Regarding integration with the platform now, specifically worker construction, I don't have strong opinions. I think it's a nice improvement for a clear pain point, but I would be curious to hear more about potential mitigations for the footgun concerns.

---

@shaseley I'm still unclear on the footguns in this proposal. Could you elaborate on what you see these to be? I still don't understand the risk, now that we've clarified that this proposal does not change messaging. My understanding from past discussions was that main thread scheduling was prioritized over task worklet because of related concerns about the cost of messaging and synchronization, which could overwhelm the benefits of small-granularity parallelism in some cases. (While I imagine task worklet would be specifically geared towards small jobs, this proposal would be just as useful for sharing bigger things.) What would your team be looking for in a task worklet proposal that could address these concerns?

---
Daniel, thank you for considering all the perspectives shared here and encouraging discussion towards understanding each other's views; much appreciated! Before commenting further, I would love to ensure that we are all on the same page about the problems this proposal is solving, otherwise it's easy to get wires crossed and provide unhelpful feedback (apologies if I did that already :)).

From the explainer I got the impression that the underlying motivation is performance of applications and improved scheduling by leveraging additional threads. The proposal aims to improve ergonomics, thereby enabling developers to more easily use workers and move work off the main thread. My current impression from this thread is that the proposal is no longer helping this goal, possibly also compounded by the concessions around avoiding integrations with the platform (worker constructor, blob URL, etc.), and likely also by the different parties involved not collaborating enough, but I don't want to digress.

Now the updated objective is: incrementally improve today's syntax (Jason's example) with better static analysis, and address the CSP issue. This is helping a much narrower audience of library maintainers.

Also, could you please expound on the problems with TypeScript integration? We are making the case of improved usability / tooling support compared to the status quo.

---
@spanicker Yeah, good idea for us all to step back and make sure we're on the same page. If the README gives people the impression that they can just spawn more workers and everything will be magically faster, then that's something for us to iterate on. Some use cases for module blocks in libraries that I see: […]

In both cases, module blocks bring these core advantages of enhancing static analyzability and allowing compatibility with strict CSP modes, while being flexible about deployment/packaging. Does the goal of helping with these use cases seem reasonable to you?

On TypeScript integration, the major issue is how TS can track which interfaces are available in the current global object (Window vs Worker). TS's current approach doesn't quite map to module blocks, but this is something that the team is thinking about. See recent TS design meeting notes for further discussion. cc @DanielRosenwasser

---
One thing I'd like to point out here: the fact that Module Blocks provides a mechanism for inline workers to import other modules makes them more suitable to complex worker use-cases than function serialization implementations. One could argue this makes the proposal more likely to push developers in the direction of meaningfully beneficial use-cases, since it removes one of the largest current barriers to non-trivial worker usage. Rather than encouraging developers to execute individual functions in a worker (which tends to correlate with the poorer worker use-cases), it forces developers to use a Module as the unit of encapsulation. This is important because it more clearly expresses the thread creation cost by syntactically separating it from invocation.

Secondly, there seems to be at least a bit of agreement in this thread and amongst those I've spoken with that some form of "task worklet" model is needed. Modules are a hard prerequisite for Worklets, but in my experiments I found that the "task worklet" model can be cumbersome when forced to separate task processors into their own file. This is particularly true for intermediary and ad-hoc tasks that "glue together" tasks authored by different parties or exposed from a modular architecture. Anecdotally, every demo I constructed for Task Worklet relies on a loose approximation of Module Blocks. A mechanism for defining "inline modules" is crucial in order to allow developers to perform complex orchestration of a task graph. The alternative is to pull data across threads for the sole purpose of massaging or indirection, which leads to poor performance outcomes. Here's an example to illustrate:

```js
// image-tasks.js
registerTask('resize', class {
  process(image, width, height) { /* snip */ }
});
registerTask('crop', class {
  process(image, left, top, bottom, right) { /* snip */ }
});
registerTask('rotate', class {
  process(image, cx, cy, deg) { /* snip */ }
});
```

```js
// index.js
import { showImageInfo } from './image-info-ui.js';

const q = new TaskQueue();
q.addModule('/image-tasks.js');

img.onchange = async imageData => {
  let img = q.postTask('crop', imageData, left, top, bottom, right); // transfers source image to a thread
  img = q.postTask('resize', img, width, height); // no transfer
  img = q.postTask('rotate', img, width/2, height/2, rotation); // no transfer
  img.result.then(data => canvas.putImageData(data, 0, 0)); // transfers result image from thread

  // now say we also want to calculate some info about the image, still off the main thread:
  showImageInfo(img, q);
};
```

```js
// image-info-ui.js

// bad - runs on main thread:
export async function showImageInfo(img) {
  $ui.render(analyze(await img.result));
}

// bad - pulls image to main thread, then transfers it to another thread:
const q = new TaskQueue();
q.addModule(new URL('./image-info.js', import.meta.url));
export async function showImageInfo(img) {
  $ui.render(q.postTask('analyze', await img.result).result);
}

// good - posts a new type of task to the existing pool, no data transferred:
export async function showImageInfo(img, q) {
  q.addModule(module {
    registerTask('analyze', class { /* snip */ });
  });
  $ui.render(await q.postTask('analyze', img).result);
}
```

---
The original issue: Closing over dataI definitely see the ergonomic value of closing over data, but I think my main lesson for me from Blöcks was that it is a source of many problems and questions. I believe my Our strong opinion is to not close over data. The other comments in this thread are important and I will outline our current stance on each of them below. However, this issue is huge and has lost a well-defined scope and so I will close this issue. If you feel like a concern is not sufficiently addressed, please open a new issue! What is the problem space?Module blocks aim to improve ergonomics for developers who want to use APIs that rely on separate files (most notably: Workers and worklets). At the same time, library authors that want to use Workers under the hood have to jump through a lot of hoops and think through a lot of edge cases around bundling and file paths before publishing (let alone using libraries via CDNs like unpkg.com). This forces developers to choose less performant options or make otherwise negative choices, even if their use-case would massively benefit from the API. Module blocks aim to solve this problem through a minimally invasive change to the language. Verbose syntaxModule blocks are indeed more verbose than what Blöcks for example originally proposed. We had a couple of syntax suggestions in this thread for single-function modules which are worth exploring (see also #6). However, I am unconvinced that single-function modules are going to be a popular pattern, as I expect most use-cases of module blocks to make use of static imports. We plan to continue with the current Worker constructorMaking the worker constructor (and worklet constructors!) accept module blocks is a core feature of this proposal. It is one of the most frequently cited issues around Workers and Worklets. Without this, I don’t think there is a point in moving forward with this proposal (unless we want to encourage workarounds like @nicolo-ribaudo showed). 
Workers are currently barely used at all (apart from ReCAPTCHA driving up the usage stats), so I don’t share the concern that people will start spam-creating workers. The Blink worker team and the scheduler team don’t have strong opinions on this, either. That being said, in the long term, developers being irresponsible with Worker creation is worth thinking about. If we want to mitigate worker abuse, however, the mitigation should not only address abuse done via Module Blocks. There are libraries (like Greenlet) that make it ergonomic and easy to spam-create workers today, without module blocks. Any mitigations we come up with should cover those patterns as well.

## Worker pooling / Tasklets

I would love to see the platform set developers up for success with respect to thread management. Something like tasklets or a worker-pool primitive are great ideas. However, I see those as synergetic with Module Blocks. I’d even go so far as to say that Module Blocks are a requirement for any of those primitives to be successful. We already know that having to put code in a separate file is a massive pain point for developers, to the extent that Workers are avoided as long as possible. Unless we provide a way for developers to spin up a worker (or a tasklet, or whatever) inline, any new scheduling primitive is bound to remain equally unsuccessful.

More specifically, I was actually hoping that this proposal could help extend @spanicker’s Main-thread Scheduling proposal to also span off-thread scheduling capabilities, bringing us closer to something like GCD.

Again, I will close this issue. Please open a new issue if you want to continue talking about any of these concerns specifically.
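The "no closing over data" stance follows from how data crosses the thread boundary: anything sent to a worker goes through structured clone, which copies values but cannot serialize functions (which is exactly what a captured closure would need). A minimal sketch of that boundary, using the `structuredClone` global as a stand-in for what `postMessage` does to its payload:

```javascript
// Data handed to a worker is structured-cloned: copied by value,
// never shared, and functions are rejected.
const data = { endpoint: "/api/users", params: [40, 2] };

const cloned = structuredClone(data);
console.log(cloned.endpoint); // "/api/users"
console.log(cloned === data); // false, it is a copy, not a shared reference

// A captured variable would have to carry its closure along, but
// functions are not structured-cloneable:
let threw = false;
try {
  structuredClone({ callback: () => {} });
} catch (err) {
  threw = true; // DataCloneError
}
console.log(threw); // true
```

This is why module blocks, like `postMessage` today, can only receive plain data explicitly rather than capturing bindings from the enclosing scope.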
I disagree with a lot of what you wrote above, but it feels like we're going in circles, so I don't think a point-by-point rebuttal is useful. Just to call one specific thing out, though: integration of module blocks with workers and blob URLs is not accepted by the HTML Standard editors or the Chromium team. I.e., closing this issue does not indicate you've convinced the relevant parties to accept changes to HTML. Similarly, moving this proposal forward through the stages in TC39, or implementation in V8/support from Chrome DevRel, does not indicate such acceptance. That would require further collaboration and outreach, which I guess will not happen in this thread since you're closing it, but presumably you'll work on that somewhere.
Sorry to leave a comment in the closed issue. My requests:
I personally don't see this as a clear net benefit in helping developers better leverage workers. It neither significantly improves ergonomics / the mental model, nor helps developers increase appropriate usage of workers in practice -- beyond the theoretical “hopefully this helps new APIs …” Why not wait until this actually helps something? The larger point of disagreement is: there isn’t an obvious big opportunity with workers that can be unlocked with small ergonomic improvements, so the benefits are limited, and we should weigh that against the costs as well as the opportunity cost. Also consider revisiting this when the opportunity has been made clearer. To provide some insight into where I'm coming from, let's consider the potential big opportunities with workers:
@domenic I hoped I was clear in my reply, but maybe it got lost in the length: I welcome a point-by-point rebuttal — but ideally in new issues, one per topic, so we can have focused discussions. I care about this proposal and want it to succeed, so it’s not in my interest to cut any corners. This issue (#21) was about Module Blocks closing over values, and I hope you agree that it had lost that scope. Closing this issue is not a statement on any of the other issues that were brought up throughout, like the Worker constructor integration that you are worried about. I created a new issue (#43) where we can continue this discussion. If there are more things you would like to voice concerns over, please feel actively encouraged to open new issues.

@spanicker I disagree that I am only targeting people who already use workers. Over the last couple of years I have heard a number of times that people shy away from workers because they are too complicated and unergonomic — regardless of the wins they might bring. One example: the meeting notes from the W3C Games Workshop explicitly state that game makers want to make use of workers but often don’t. One of the reasons stated:
Another example: developers express their desire for Module Blocks: 1 2 3. While these are individuals, they are quite experienced and work on large-scale projects. Them getting excited about Module Blocks and what they provide is meaningful, in my opinion. I have more anecdata on this, which leads me to believe that Module Blocks are a welcome and important primitive for the developer ecosystem. To me this shows that the proposal significantly improves ergonomics. As a library author, I have myself avoided using Workers because they become extraordinarily tricky to publish in a way that remains usable (especially once paths or lazy loading are involved). With Module Blocks these problems would go away. Again, I have heard similar stories from other developers.

As for tasklets: I absolutely want to see Tasklets, or something like Tasklets, make it to the platform. However, it is unclear to me how any proposal in that problem space can succeed without something like Module Blocks as the underlying primitive for transferable/shareable code, since we know that requiring code to be put in a separate file is a deterrent for developers.

(Side note: I am not advocating using Workers in the UI path. I have always said “UI thread [aka main thread] for UI work”. Yet even for the “obvious” worker use cases, developers often stay on the main thread.)
I don't think it's been articulated anywhere in this thread, so I'll ask: setting aside any notion of integrating this with the Worker constructor, is anyone opposed to this proposal as one piece of a multi-step process to land something "tasklet"/"task worklet"-like? Say we removed the
Also, just to clarify something, @spanicker: the benchmarks you're referring to show a tradeoff between input response time and FPS/jank. Worker usage on low-end Android devices was shown to correlate with reduced main-thread jank and increased FPS. The results also implied a correlation only when input was continuous and rapid, with no other scenarios producing statistically meaningful outcomes. I also want to stress: none of these results have been validated. This means the individual findings must be treated as equally viable or inviable; they cannot be cited selectively, because they are not mutually exclusive. While it is true that the results indicate Workers should not be used for direct input response (keyboard/pointer input), the same constraint applies to main-thread scheduling: handling input requires synchronous or nearly synchronous "user-blocking" priority that effectively bypasses scheduling.
This issue was filed about a specific topic (having to do with passing data and closing over it), and it seems to have expanded in scope to cover the proposal overall and the internal coordination of the Chrome team. I want to second @surma's encouragement to file other issues, e.g. on the proposal's motivation and integration with the web platform (e.g. #43), and suggest that Chrome-internal coordination happen outside of this repository. Locking the issue as too heated.
When trying to offload work into another thread, IMO the syntactic overhead of importing/exporting is too high compared to other languages, or the blöcks proposal, or existing libraries.
Consider wanting to fetch a file at a computed-at-runtime URL, transform its contents, and return the result to the main thread. With this proposal, the best I can come up with is
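(The code example here did not survive extraction. A sketch of what it presumably looked like, in the proposal's `module { }` syntax, reusing the hypothetical `worker()` helper and `self.workerInput` convention from @nicolo-ribaudo's comment earlier in the thread; `computeUrl` and `transform` are placeholder names, not part of any API:)

```js
const url = computeUrl(); // hypothetical: only known at runtime

const result = await worker({ url }, module {
  // data arrives via postMessage, not by closing over `url`
  const { url } = self.workerInput;
  const res = await fetch(url);
  export default transform(await res.text());
});
```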
where `worker()` is a helper function somewhat similar to what is shown in https://github.com/tc39/proposal-js-module-blocks#use-with-workers, although I think it would have to spawn a `Worker` per invocation in order to allow reliable message passing and translation of default exports into return values. Which is itself not great.

This is worse than other languages, where you can do something equivalent to
or the blöcks proposal, which requires explicit variable capture
or something like greenlet which uses IIAFEs:
Is there a way to use this proposal to get something simpler? Otherwise I am worried that the benefits over libraries like greenlet are not enough to justify the extra overhead one has to incur.