-
Notifications
You must be signed in to change notification settings - Fork 97
worker: initial implementation (large/base PR) #58
Conversation
Hi,
I have been able to compile clean version of Node.js without any issues on same machine before on a side note, not as a requirement or anything but as a mere suggestion for help :) |
47e8463
to
70921b6
Compare
@YafimK Hm – I can’t quite figure out what the problem here is :/ Does it help if you just remove line 44 in
I think this isn’t really about this branch, but about Ayo in general – we don’t have much CI support yet, so talking about support for certain platforms is somewhat hard (but also see #25)
If you can make that happen, that would be absolutely awesome 💙 (Also, just fyi, I rebased this against the current |
70921b6
to
1aa0013
Compare
@benjamingr @Qard I’ve moved everything new off the |
1aa0013
to
00bb418
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will need to look at the SharedArrayBuffer part more, but I really like the direction this is taking. Hats off to @addaleax!
doc/api/vm.md
Outdated
@@ -310,6 +310,27 @@ console.log(util.inspect(sandbox)); | |||
// { globalVar: 1024 } | |||
``` | |||
|
|||
## vm.moveMessagePortToContext(port, context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/context/\0ifiedSandbox/
|
||
Sends a JavaScript value to the receiving side of this channel. | ||
`value` will be transferred in a way | ||
that is compatible with the [HTML structured clone algorithm][]. In particular, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
structured cloning isn't a thing any more (https://html.spec.whatwg.org/multipage/structured-data.html#structuredclone).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh. Right. I think this is still what it’s known as, though? (That it’s implemented as a serialize + deserialize step should be more or less transparent to users)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax That article is pretty outdated though (it has no mention of MessagePort or SharedArrayBuffer)... HTML structured serialization/deserialization? maybe a link to v8.Serializer/Deserializer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(it has no mention of MessagePort or SharedArrayBuffer)
That’s why these are mentioned explicitly here ;)
maybe a link to v8.Serializer/Deserializer?
Yeah, that’s a good idea … that should give people an idea of how to play around with the algorithm without spinning up message channels every time. I’ve added:
+For more information on the serialization and deserialization mechanisms
+behind this API, see the [serialization API of the `v8` module][v8.serdes].
doc/api/worker.md
Outdated
--> | ||
|
||
The `Worker` class represents an independent JavaScript execution thread. | ||
Most Node APIs are available inside of it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ayo or Ayo.js.
Is Worker
available in a worker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is
Worker
available in a worker?
Yes. That’s also tested and the code is generally laid out to support such a structure, in part because there’s not really anything standing in the way of it
doc/api/worker.md
Outdated
* Returns: {undefined} | ||
|
||
Starts receiving messages on this `MessagePort`. When using this port | ||
as an event emitter, this will be called automatically once `message` listeners |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: 'message'
<!-- YAML | ||
added: REPLACEME | ||
--> | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What arguments does this event get? (value, transferList)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only value
. We could try to add the transferList
too, if that seems useful, but all of the relevant objects should be reachable through value
anyway.
I don’t know if we have some standard format for saying “this events has one parameter that can be any JS value”, so I’m adding prose for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think something like the following is used. Basically the same format as function parameters.
### Event: 'event'
* `value` {any} The value
This event is fired when ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks – done!
src/node_messaging.cc
Outdated
MessagePort* port; | ||
ASSIGN_OR_RETURN_UNWRAP(&port, args.This()); | ||
if (!port->data_) { | ||
env->ThrowError("Can not send data on closed MessagePort"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Cannot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
doc/api/worker.md
Outdated
|
||
The `MessageChannel` has no methods of its own. `new MessageChannel()` | ||
yields an object with `port1` and `port2` properties, which refer to linked | ||
[`MessagePort`][] instances. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something I've been thinking is that the messaging API could well be applicable to VM contexts too. Is that a correct understanding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is a correct understanding, because the ports in a channel can be moved to different contexts
doc/api/vm.md
Outdated
@@ -310,6 +310,27 @@ console.log(util.inspect(sandbox)); | |||
// { globalVar: 1024 } | |||
``` | |||
|
|||
## vm.moveMessagePortToContext(port, context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just vm.transferMessagePort()
? The "to context" part is already conveyed by the fact that this method is in vm
module, and "transfer" is a more Proper(R) name IMO than "move".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I like about move
rather than transfer
is that it conveys better that the original message port object is now rendered unusable … I don’t feel strongly about it, though
src/node_messaging.cc
Outdated
|
||
ExternalSABReference ReferenceCountedSAB::ForIncomingSharedArrayBuffer( | ||
Environment* env, Local<Context> context, Local<SharedArrayBuffer> source) { | ||
Local<Value> lifetime_partner; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aww... ;)
src/node_messaging.cc
Outdated
Local<FunctionTemplate> templ; | ||
templ = env->message_port_constructor_template(); | ||
if (!templ.IsEmpty()) | ||
return templ->GetFunction(context); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is GetFunction()
cached by context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup – I have to admit I never got the point of {Function,Object}Template
s before making this PR. However, in retrospect it’s a lot clearer: It’s a Template
in the sense that it can be used to create functionally equivalent functions/objects in different contexts.
src/node_messaging.h
Outdated
void AddSharedArrayBuffer(ExternalSABReference ref); | ||
// Internal method of Message that is called once serialization finishes | ||
// and that transfers ownership of `data` to this message. | ||
void AddMessagePort(std::unique_ptr<MessagePortData>&& data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make these private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would require giving the SerializerDelegate
that accesses these a publicly visibly identifier, right?
00bb418
to
9d3a05f
Compare
5d36c5d
to
5b0aca3
Compare
@TimothyGu I think I got everything you commented on so far? :) |
43b4725
to
5d793a4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more feedback. Still haven't read much code yet.
added: REPLACEME | ||
--> | ||
|
||
* Returns: {undefined} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* `value` {any}
* `transferList` {Object[]}
<!-- YAML | ||
added: REPLACEME | ||
--> | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean, refer to cluster
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I intended to refer to the comment of adding parameter types. Seems like github messed up the order :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah – in that case, done :)
src/node_messaging.cc
Outdated
} | ||
|
||
void MessagePortData::Entangle(MessagePortData* a, MessagePortData* b) { | ||
a->sibling_ = b; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add some CHECKs or even mutexes here making sure it's thread-safe, and only works when either port is not entangled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think CHECK
s for making sure that the ports aren’t entangled are fine. The method isn’t really required or supposed to be thread-safe, I’ve noted that in the documentation. (It’s only called by the thread that created both message ports.)
doc/api/worker.md
Outdated
use them for I/O, since Ayo’s built-in mechanisms for performing operations | ||
asynchronously already treat it more efficiently than Worker threads can. | ||
|
||
Workers can also, unlike child processes, share memory efficiently by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can even explicitly call out clusters here.
--> | ||
|
||
The `Worker` class represents an independent JavaScript execution thread. | ||
Most Ayo APIs are available inside of it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would mention message passing as the primary means of communication between the worker thread and the thread that spawned the worker. Maybe something like:
Like [Web Workers] and the [
cluster
module], two-way communication can be achieved through inter-thread message passing. Internally, aWorker
has a built-in pair of [MessagePort
]s that are already associated with each other when theWorker
is created. While theMessagePort
objects are not directly exposed, their functionalities are exposed through [worker.postMessage()
] and the ['message'
event] on theWorker
object for the parent thread, and [require('worker').postMessage()
] and the ['workerMessage'
event] onrequire('worker')
for the child thread.To create custom messaging channels (which is strongly encouraged over using the default global channel for more complex tasks), users can create a
MessageChannel
object on either thread and pass one of theMessagePort
s on thatMessageChannel
to the other thread through a pre-existing channel, such as the global one.[insert example]
See [
port.postMessage()
] for more information on how messages are passed, and what kind of JavaScript values can be successfully transported through the thread barrier.
doc/api/worker.md
Outdated
|
||
The `MessageChannel` has no methods of its own. `new MessageChannel()` | ||
yields an object with `port1` and `port2` properties, which refer to linked | ||
[`MessagePort`][] instances. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd add a description of what MessageChannel
is before saying what it has. Maybe:
The
MessageChannel
class represents an asynchronous messaging channel.
doc/api/vm.md
Outdated
context. | ||
|
||
Note that the return instance is *not* an `EventEmitter`; for receiving | ||
messages, the `.onmessage` property can be used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems rather crude to have some MessagePorts being EventEmitters, and some not. I understand you are still working on this part, and that the EventEmitter is implemented in JS complicates things. But I'd rather have it all one way or the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I'd rather have it all one way or the other.
Me too. :) But like with the Buffer
s, it’s just a problem that not all contexts have a concept of EventEmitter
s … this would require making some internal modules multi-context ready, which I would like to consider out of scope for this PR. :) Also, this is already quite an advanced feature on its own.
doc/api/worker.md
Outdated
* Extends: {EventEmitter} | ||
|
||
Instances of the `worker.MessagePort` class represent an asynchronous | ||
communications channel. It can be used to transfer structured data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/an asynchronous communications channel/an end(or port?) of a two-way asynchronous communications channel/
doc/api/worker.md
Outdated
|
||
`transferList` may be a list of `ArrayBuffer` and `MessagePort` objects. | ||
After transferring, they will not be usable on the sending side of the channel | ||
anymore. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess one of the main thing that confused me was how transferList
was used. So now I understand that objects present in value
(no matter how deep) that are also in transferList
are transferred, while ordinarily they are cloned. I wasn't aware of the "ordinarily" part.
However, what if transferList
has an object that is not present in value
? Will it get neutered as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m adding clarifications for both points, thanks
doc/api/worker.md
Outdated
communications channel. It can be used to transfer structured data, | ||
memory regions and other `MessagePort`s between different [`Worker`][]s | ||
or [`vm` context][vm]s. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, please document if/when open()
and close()
need to be called. An example would be best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s hard to come up with examples for these … I’ve added to the start()
documentation that it doesn’t need to be used unless moving-between-VM-contexts is used. (And, as you pointed out below, ideally it wouldn’t be necessary at all).
Regarding close()
… I’m not entirely sure whether MessagePort
s should be unref()
ed by default or not. Right now, any MessagePort
keeps an event loop open unless explicitly unref()
ed or close()
ed …
8b087d0
to
a2095eb
Compare
@TimothyGu I think I got more or less everything … the |
a2095eb
to
0693217
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making progress...
src/node_messaging.cc
Outdated
void Finish() { | ||
for (MessagePort* port : ports_) { | ||
port->Close(); | ||
msg_->AddMessagePort(std::move(port->Detach())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the std::move redundant if MessagePort::Detach already returns a std::move'd smart pointer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang complains:
../src/node_messaging.cc:325:28: warning: moving a temporary object prevents copy elision [-Wpessimizing-move]
msg_->AddMessagePort(std::move(port->Detach()));
^
../src/node_messaging.cc:325:28: note: remove std::move call here
msg_->AddMessagePort(std::move(port->Detach()));
^~~~~~~~~~ ~
Will fix.
src/node_messaging.cc
Outdated
} | ||
Context::Scope context_scope(context); | ||
MessagePort* target = | ||
MessagePort::New(env, context, nullptr, std::move(port->data_)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This std::move(port->data_)
should be port->Detach()
.
0693217
to
ef85bd3
Compare
@addaleax I've completed making MessagePort consistently EventEmitter using the V8 Extras approach I talked about on Discord. See bb85e11 and 62a0263. The same approach could be taken to make Buffer available everywhere as well. While I agree that it isn't the most in scope for this PR, it shows something pretty promising IMO. |
ef85bd3
to
d76b9ec
Compare
@TimothyGu that is, indeed, very nice. 👏 The commits generally LGTM, I’ll pull them in here |
With the changes pulled this mostly LGTM - hopefully I'll be able to get some interesting benchmarks going. |
d76b9ec
to
f715000
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
b170b46 LGTM other than minor suggestion to have isMainThread
getter rather than threadId === 0
in a bunch of places.
adf9752
to
56d80a6
Compare
Fwiw I “fixed” the V8 extras + events + internal errors issue by manually editing the error messages and tacking on |
56d80a6
to
c2b23bb
Compare
c2b23bb
to
d35f78d
Compare
Taken from petkaantonov/io.js@ea143f7 and modified to fit current linter rules and coding style.
Native addons need to use flags to indicate that they are capable of being loaded by worker threads. Native addons are unloaded if all Environments referring to it have been cleaned up, except if it also loaded by the main Environment.
This should help a lot with actual sandboxing of JS code.
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Only allow `.js` and `.mjs` extensions to provide future-proofing for file type detection.
$ ./ayo benchmark/cluster/echo.js cluster/echo.js n=100000 sendsPerBroadcast=1 payload="string" workers=1: 26,709.154114687004 cluster/echo.js n=100000 sendsPerBroadcast=10 payload="string" workers=1: 15,936.350422945854 cluster/echo.js n=100000 sendsPerBroadcast=1 payload="object" workers=1: 20,778.85550744996 cluster/echo.js n=100000 sendsPerBroadcast=10 payload="object" workers=1: 10,912.260027807712 $ ./ayo benchmark/worker/echo.js worker/echo.js n=100000 sendsPerBroadcast=1 payload="string" workers=1: 69,787.63926344117 worker/echo.js n=100000 sendsPerBroadcast=10 payload="string" workers=1: 32,544.630210444844 worker/echo.js n=100000 sendsPerBroadcast=1 payload="object" workers=1: 48,706.90345844702 worker/echo.js n=100000 sendsPerBroadcast=10 payload="object" workers=1: 18,088.639282621873
d35f78d
to
9fb522e
Compare
does it mean that we are not use thread in nodejs? or jsut this branch is deleted? |
@p3x-robot A newer version of this PR was merged into Node.js' master branch. |
(status: currently ready for review, by anyone, including you: #40 (comment))
(If that gives the wrong impression: No, this does not conform the browser WebWorker API, but that should be rather easily implementable on top of this, and I’m okay with that.)
Preliminary benchmarks: https://gist.github.com/TimothyGu/7fd2fbe15537a84963c36a3e0a03bcce
Fixes: #31
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passesTODO
.start()
manually forMessagePort
sperformance
(currently pending the V8 6.1 update in upstream Node.js)MessagePort
s configurablev8::Platform
implementation with multi-isolate support(@petkaantonov please feel free to indicate whether attributing you for code that comes from your original PR is not enough/too much/just right :) )