let v8 use libuv thread pool #11855
Comments
If that were possible, doing so would further reduce the performance of |
@mscdex We can increase the number of libuv threads. |
Sure, you can increase it, but I would guess most people use the default (currently 4). |
Well, we can also increase the default number of libuv threads... |
I'm not 100% sure but I think 4 may have been chosen as that is/was the typical number of cores/cpus on most machines? |
While increasing the default will hopefully be something we can do at some point, 4 is still the most common average if I'm remembering correctly. |
In any case, having V8 use threads outside of the thread pool seems wrong. If 4 is the right default, you would in fact use more than 4 if V8 spins up its own threads. If you expect 4 + additional V8 threads to be a good number, then 4 is chosen too conservatively. |
@hashseed I'm not sure what the performance difference is (waiting for a libuv thread vs. OS scheduler), but if V8 were to use the libuv thread pool, some node requests would/could get blocked (even more so than they may be currently), whereas they may not have before. |
FWIW V8 hands over thread management to Chrome when embedded in Chrome. |
@hashseed Could you please explain why Chrome did that? |
Node is quite a different thing than Chrome though ;-) |
@jeisinger probably knows more. Iiuc V8 simply prefers to let the embedder take control. In case of Chrome, page start-up can be very busy wrt threading, and Chrome's scheduler likely has a better grasp than the OS scheduler. |
In the past, each V8 isolate would create its own worker pool. To fix that, I moved the worker pool to a single v8::Platform instance. Furthermore, Chrome uses a central scheduler to, e.g., throttle tasks in background tabs or give rendering-related tasks a higher priority when the frame deadline approaches. To achieve this, it uses a custom v8::Platform implementation. |
FWIW, the reason V8 has its own threadpool is that it was the least amount of work. The plan was always to come up with something better. What constitutes 'better' is something of an open question, though. Blindly shoveling it into the libuv threadpool will probably cause a performance hit. |
I think the conservative wisdom is to set threads = number of cores. This holds for when those threads are being used for CPU intensive tasks. The reason being that setting 4 threads for 4 cores reduces the cost of context switching, and then you would typically pin them to a core etc. But the Therefore, for Node, it would actually make much more sense to increase the default number of threads available for disk and network operations. This would reduce head-of-line blocking caused by slow DNS requests or degraded disks, by increasing concurrency. The time-scale of context switches in this async use-case is dwarfed by the time-scale of disk or network IO. Also, the memory overhead of libuv threads is low: 128 threads should cost 1 MB in total according to the libuv docs (http://docs.libuv.org/en/v1.x/threadpool.html)? |
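The head-of-line blocking described in the comment above can be illustrated with a small simulation. This is only a sketch in Python, not libuv code: the function name `completion_times`, the task durations, and the pool sizes are all hypothetical, and the model assumes a plain FIFO queue in which each task goes to whichever worker becomes free first.

```python
import heapq


def completion_times(durations, pool_size):
    """Simulate a FIFO work queue served by `pool_size` worker threads.

    Returns the finish time of each task, in submission order.
    Hypothetical model for illustration; it is not how libuv is implemented.
    """
    # Each heap entry is the time at which one worker becomes free.
    free_at = [0.0] * pool_size
    heapq.heapify(free_at)
    finished = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + duration
        finished.append(end)
        heapq.heappush(free_at, end)
    return finished


# Assumed workload: four slow DNS-like lookups (5 s each),
# followed by one fast 10 ms file read.
tasks = [5.0, 5.0, 5.0, 5.0, 0.01]
small_pool = completion_times(tasks, pool_size=4)
large_pool = completion_times(tasks, pool_size=8)
```

Under these assumptions, the fast read queues behind the slow lookups when the pool has 4 threads and starts immediately when it has 8. The model ignores CPU contention entirely, so it says nothing about the kernel-mode cost raised in the next comment.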
@jorangreef What you say is true but there is a caveat: system calls can consume significant kernel-mode CPU time. I've experimented with the approach you suggest but performance for some operations falls off a cliff when you have many more threads than cores. Example: stat(2) on deep (or deeply symlinked) file paths. The inode metadata is probably in the dirent cache but traversing it can still be costly. If you have many threads hammering the cache, performance degrades in worse than linear fashion. Most programs would not exhibit such pathological behavior but a specialized tool like a backup program might (and people do write such things in node.js.) |
@bnoordhuis Would it not then make sense to allow writers of such problematic tools to reduce the thread count to equal the core count, rather than force most people, working on simpler things, to understand they can increase the thread count to some unstudied number? |
@bnoordhuis would it be possible to dynamically scale the thread count without incurring excessive overhead? I'm guessing you could be quite conservative and only change the number when you start to run into heavy performance problems. I remember @sam-github saying that the node event loop time was a pretty good indicator of how your app was performing; what would the equivalent be for this? I guess my question is: is it just a question of doing the work, or is it likely to result in a net performance decrease in most situations? |
Libuv could use the work queue size or the average flight time of a work request as an input signal but that doesn't distinguish between threads waiting for I/O and threads doing computation. For I/O-bound workloads libuv knows what category its own work items belong to (fs, dns - all I/O) but it doesn't know that for external work items that come in through uv_queue_work(), which is what What's more, libuv only knows about the current process but a common use case with node is to spawn multiple processes. You don't want to get into a perfect storm situation where multiple processes think "hey, throughput times are going up, let's create some more threads." It's not intractable but fixing it in either node [1] or libuv will involve a large amount of engineering for an uncertain payoff. Perhaps determining the ideal thread pool size is more of an ops thing. We should expose hooks but leave tuning it to the programmer or the system administrator. [1] Node could side-step libuv, use its own thread pool and orchestrate with other node processes over IPC. |
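To make the "average flight time" signal from the comment above concrete, here is a minimal sketch in Python. The class name `FlightTimeTracker`, the smoothing factor `alpha`, and the timestamps are assumptions made for illustration; libuv exposes nothing like this.

```python
class FlightTimeTracker:
    """Exponential moving average of work-request flight times.

    Hypothetical helper for illustration; this is not a libuv API.
    """

    def __init__(self, alpha=0.5):
        self.alpha = alpha    # weight given to the newest sample
        self.average = None   # no samples recorded yet

    def record(self, submitted_at, finished_at):
        """Fold one completed request into the running average."""
        flight = finished_at - submitted_at
        if self.average is None:
            self.average = flight
        else:
            self.average = (self.alpha * flight
                            + (1 - self.alpha) * self.average)
        return self.average


# Two requests with the same flight times: one workload waits on I/O,
# the other burns CPU. The tracker sees only timestamps.
io_bound = FlightTimeTracker()
cpu_bound = FlightTimeTracker()
for tracker in (io_bound, cpu_bound):
    tracker.record(0.0, 1.0)
    tracker.record(1.0, 3.0)
```

Both trackers end up with the same average, which is the limitation the comment points out: the signal alone cannot tell whether adding threads would help (I/O wait) or hurt (CPU contention).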
note that #14001 switches v8 to libuv's threadpool /cc @matthewloring |
@bnoordhuis Is it currently possible to provide libuv with a hint to limit CPU-heavy threads, while allowing more I/O-heavy threads? V8 has a |
@TimothyGu Not at the moment. |
That enum is not used by v8. We only added it because chrome used to use Windows worker pool reflecting WT_EXECUTELONGFUNCTION |
The problem is that a single threadpool is used for IO and CPU threads. If there were two threadpools, one could be used for IO (and have many threads) and the other could be used for CPU (and have threads === cores). This would make tuning possible. Currently, there's no way to tune the threadpool for both use cases. If libuv knows that all its threads are IO only, then the above can be rolled out as follows:
|
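The two-pool idea in the comment above could look roughly like the following. This is a sketch using Python's standard `concurrent.futures`, not libuv; the pool sizes, the `submit` helper, and the `"io"`/`"cpu"` tags are hypothetical, and it does not reconstruct the rollout steps the comment refers to.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Assumed sizing: CPU pool matches the core count, I/O pool is much larger.
CPU_THREADS = os.cpu_count() or 4
IO_THREADS = 64  # hypothetical, untuned value

io_pool = ThreadPoolExecutor(max_workers=IO_THREADS,
                             thread_name_prefix="io")
cpu_pool = ThreadPoolExecutor(max_workers=CPU_THREADS,
                              thread_name_prefix="cpu")


def submit(kind, fn, *args):
    """Route a work item to the pool matching its declared category."""
    if kind not in ("io", "cpu"):
        raise ValueError(f"unknown work category: {kind!r}")
    pool = io_pool if kind == "io" else cpu_pool
    return pool.submit(fn, *args)


# Usage: the caller must declare the category up front.
cpu_result = submit("cpu", sum, [1, 2, 3]).result()
io_result = submit("io", len, "abc").result()
```

The sketch only works because each caller labels its work. As noted earlier in the thread, libuv has no such label for items that arrive through uv_queue_work().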
@jorangreef We've been discussing such schemes in libuv since the thread pool was first added. You can find at least two attempts in my fork, maybe more. Both attempts stranded on not performing significantly better most of the time and significantly worse some of the time. :-/ |
Is it possible to let V8 use the libuv thread pool? This could reduce the number of extra threads.