Follow up work once lightweight isolates are enabled by-default. #46752
Comments
… call. This improves performance of the SendPort.Receive.Nop benchmark with isolate groups enabled on Intel Xeon by ~17%. This benchmark emphasizes performance of the handle-message flow.

Issue #46752
TEST=ci
Change-Id: I3b9be3283047631e8989bb56f90af2b3b007afe8
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/209642
Commit-Queue: Alexander Aprelev <aam@google.com>
Reviewed-by: Martin Kustermann <kustermann@google.com>
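For context, a round trip of empty messages between two isolates is the kind of workload such a benchmark stresses. Below is a minimal, hypothetical sketch in the same spirit (this is not the actual SendPort.Receive.Nop source; all names here are made up):

```dart
import 'dart:isolate';

// Hypothetical sketch: measure "nop" message round trips between two
// isolates, exercising the handle-message flow.
void _echo(SendPort replyTo) {
  final port = ReceivePort();
  replyTo.send(port.sendPort);
  // Echo a null back to whichever SendPort each message carries.
  port.listen((message) => (message as SendPort).send(null));
}

Future<void> main() async {
  final setup = ReceivePort();
  await Isolate.spawn(_echo, setup.sendPort);
  final echo = await setup.first as SendPort;

  const rounds = 10000;
  final stopwatch = Stopwatch()..start();
  for (var i = 0; i < rounds; i++) {
    final reply = ReceivePort();
    echo.send(reply.sendPort);
    await reply.first; // empty ("nop") round trip
  }
  print('${stopwatch.elapsedMicroseconds / rounds} us per round trip');
}
```

The per-round-trip cost printed at the end is dominated by the message send/receive path, which is what the ~17% improvement above targets.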
@mkustermann Wasn't this done?
@mnordine Not AFAIK (/cc @aam). Currently we do an O(n) verification pass and check that certain objects aren't part of the transitive object graph. To make it O(1) we'd need to think very carefully about what transferring such objects to another isolate means, also in the context of message passing, where there's a time window in which the sender isolate has exited but the receiver isolate hasn't read the message yet (e.g. for transferred receive ports, timers, etc.).
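That verification pass is observable from Dart: sending a message whose transitive object graph contains one of the disallowed objects (such as a `ReceivePort`) throws. A minimal sketch, assuming standard `dart:isolate` behavior:

```dart
import 'dart:isolate';

void main() {
  final port = ReceivePort();
  final other = ReceivePort();
  // SendPort.send walks the transitive object graph of the message;
  // objects like ReceivePort are rejected during that O(n) pass.
  try {
    port.sendPort.send([1, 2, other]);
  } on ArgumentError catch (e) {
    print('rejected: $e');
  }
  other.close();
  port.close();
}
```

Making this O(1) would mean skipping that walk, which is why the transfer semantics of such objects would first need to be pinned down.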
Maybe an additional TODO would be the following, found at Lines 3765 to 3766 in b6c8bd7.

The current comment points to #36097, which is closed. I don't know whether the intention is still to change the service API.
@mkustermann I'm very excited about the possibility of:

Is there a separate issue to track that work? (It would also be great to have an issue as a place to ask questions about the current implementation of Isolate scheduling.)
Could you give us some context on why this is important for you? We haven't heard many users ask for it, so we're focusing on other things atm. Though there's a bunch of work planned for our GC that may eventually allow us to remove the restriction we currently have on the number of mutators running in parallel. (/cc @rmacnak-google)
Thanks for the quick reply @mkustermann! My use case is that I'm back working on an "Erlang style" backend system (initially servicing Flutter clients via websockets), at first running workloads developed internally but later being able to run workloads submitted by customers. For this, having pre-emptive scheduling across (very) large numbers of Isolates would help with Isolates that could end up CPU bound, whether due to developer bugs or misuse by customer-submitted jobs (unintentional or not). I believe pre-emptive Isolate scheduling like Erlang's, where all Isolates keep making some progress at the expense of higher latency across all Isolates, would be beneficial for my use case (a reference for a lot of my design thinking here is this presentation: https://www.youtube.com/watch?v=JvBT4XBdoUE). Hopefully what I described above is what you had in mind for that item on this issue's list of tasks?

In regards to GC, I've also run into the potential issue of heavy memory churn in some Isolates causing higher latency across all Isolates, but I've asked about that in a separate issue.

I suspect the reason no one has asked about this before is that Dart server-side usage is still in its early stages, and the usage there so far comes from frameworks styled after NodeJS rather than Erlang/Elixir. But I definitely see growing use of Dart on the backend (a new commercial Dart server-side SaaS was announced just a few days ago), so I would expect this to become a more important use case in the near future.
Scenarios where the number of ready tasks (i.e. tasks that have something to do) vastly exceeds the number of CPU cores available for execution describe a highly overloaded system, and one cannot expect good application behavior there. What's more realistic is a high number of tasks of which only a small part are actually able to execute (i.e. have something to do) at a given point in time. The reason we would currently need to do preemptive scheduling in the VM itself is that we cannot have too many isolates running in parallel, as that would degrade performance due to the way our GC is structured (i.e. currently we cannot take advantage of a high number of cores). If we didn't have that problem anymore, we could simply use OS threads and rely on the OS to do the scheduling.
Hi @mkustermann, I'm circling back to this because I didn't yet want to give up on the idea of having large numbers of concurrent Isolates in a manageable way, so I've tried a different tack: using scripting for the workloads running inside each Isolate. With an interpreter written in Dart, I'm able to do my own preemptive scheduling simply by counting instruction executions inside the interpreter's for(;;) loop and time-slicing that way.

But what I think I now need to do is essentially the same as you proposed in #51261, except for Dart instead of FFI calls: I need the current thread to leave the current Isolate when the "timeslice" runs out for its interpreter, as otherwise I can run into the same kind of max-16-mutator-thread deadlocks. I was guessing that doing this would put each of those threads back (hopefully at the back) of the "ready to run" Isolates list?

I had a quick look but couldn't easily spot where in the VM source the code is that allocates the mutator thread pool to "ready to run" Isolates, so I'm not sure whether this is already possible and I'm missing how to do it in just Dart. Or, if not, could the mechanism you are planning for FFI be put to use for this scenario too?
If a Dart isolate calls C which then leaves the isolate via

What our implementation does in

Both of these happen in Thread::ExitIsolate, which calls IsolateGroup::DecreaseMutatorCount(). If it wasn't a nested exit (i.e. Dart calls C which exits) but rather Dart just went back to the event loop, then we indeed re-use the thread for possibly another isolate. Isolates are by default started on a thread pool the

@maks Do I understand correctly: each interpreted Dart program runs in its own lightweight isolate, and your interpreter loop will let you know when a time slice has run out? At that point you can simply go back to the event loop, and then other isolates will run. If you have an interpreter-to-Dart call that may be blocking, but you cannot control what that Dart call does, you essentially need the VM support for preemptive scheduling.
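The "go back to the event loop" step can be sketched in plain Dart. Below is a hypothetical instruction-counting interpreter loop (all names invented for illustration) that awaits a zero-duration future when its slice runs out; the await suspends the isolate, returning its thread for other ready isolates:

```dart
// Hypothetical sketch: a cooperative interpreter loop that yields to
// the event loop every `sliceSize` instructions. Awaiting a
// zero-duration Future suspends this isolate, so its mutator thread
// can be reused by other ready isolates.
Future<int> runSliced(List<int Function(int)> program, int acc,
    {int sliceSize = 10000}) async {
  var executed = 0;
  for (final instruction in program) {
    acc = instruction(acc);
    if (++executed % sliceSize == 0) {
      await Future<void>.delayed(Duration.zero);
    }
  }
  return acc;
}

Future<void> main() async {
  // Five "+1 instructions" starting from 0, yielding every 2 of them.
  final result =
      await runSliced(List.filled(5, (int x) => x + 1), 0, sliceSize: 2);
  print(result); // 5
}
```

As noted above, this only works for code the interpreter can instrument; a blocking interpreter-to-Dart call would still need VM-level preemption.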
Thanks for the detailed reply and pointers to the VM code @mkustermann, that's very helpful! 🙏🏻 Apologies for not providing more details about the approach I'm trying with the interpreter; I'll try to explain it better below. But getting back to the topic: I was hoping to use the above approach to run only Lua instances in each "lua worker" Isolate, thereby allowing me to have preemptive scheduling of those Isolates.
Yes, that's exactly it Martin. I'd like some way, in the Lua runloop's Dart code, to go back to the Dart event loop to allow other isolates to run, and then at some point come back to running this isolate from that "yield point" in the Lua runloop (the Dart

But that seems to require doing something like

What I essentially want to do is the equivalent of

Really what I want here is what, ironically, Lua co-routines exactly provide: a blocking

I have a bad feeling this is along the lines of the debate that was had around the use of
And yes, I very much take your point that if I have code that calls out to Dart which then does a long-running CPU-bound call, I'm back to square one! That's one of the reasons I chose Lua, and LuaDardo specifically: it's possible to limit every external call available in the Lua instance, as even its standard library import is optional, so I can choose to not even provide a
After the lightweight isolate support is enabled (see #36097) there are a few tasks we might want to do as follow-up work:

- Experiment with more data sharing
- Performance (`lookupHandler()` & `invokeMessageHandler()`)
- Testing
/cc @aam