Thread wakeup may be a bottleneck on many-core systems #61
Comments
Can you show the trace result? Or you can comment out the thread::yield() call and trace again.
Later. I've modified the base code a lot. It is like so:
It seems that when a task is dispatched, not all threads are scheduled right away; some of them have yielded. If so, comment out the thread::yield() call and test again; it should be faster. When I tested this, though, I got the same result.
That could be an advantage of x64 processors; I mean, those processors may simply be more responsive. And yes, removing the yield works, but not always. Sometimes it keeps the processors too busy with atomic operations, and that contention slows down the whole system. I am still experimenting with how to make the loop better. I've got a pretty good result so far: with some other new optimizations, I got 5 token/s.
Great! Looking forward to your optimizations.
Check out this demo. Add |
@chenqy4933 Got 4.4 token/s with master. I guess the optimization works, though I did not get as high as 5 token/s.
You can keep optimizing it; I just optimized it with a CPU-level yield.
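For readers following along, here is a minimal sketch of what a CPU-level yield in a spin loop can look like. It is not the code that was committed to master. The names `cpu_relax` and `spin_until_set` are hypothetical, the sketch assumes GCC or Clang, and on architectures it does not recognize it falls back to a plain compiler barrier.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: tell the core we are spin-waiting, without asking the
// OS scheduler to switch the thread out (which is what a thread-level yield does).
inline void cpu_relax() {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();              // x86 PAUSE instruction (GCC/Clang builtin)
#elif defined(__aarch64__)
    asm volatile("yield" ::: "memory");  // AArch64 YIELD hint
#else
    asm volatile("" ::: "memory");       // assumption: compiler barrier only
#endif
}

// Hypothetical helper: poll `flag` up to `max_spins` times, relaxing the CPU
// between polls. Returns true if the flag was observed set.
inline bool spin_until_set(const std::atomic<bool>& flag, std::uint32_t max_spins) {
    for (std::uint32_t i = 0; i < max_spins; ++i) {
        if (flag.load(std::memory_order_acquire)) {
            return true;
        }
        cpu_relax();
    }
    return flag.load(std::memory_order_acquire);
}
```

The thread stays on its core while it waits, so it can pick up a newly dispatched task without paying for a scheduler round trip. The trade-off is that the core is never released to other work while spinning.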
Result: boost from 3.6~8 token/s to 4.2~4.3 token/s on SG2042.
Analysis: I added Tracy to trace execution in detail. I observed that worker wakeup is problematic: some workers pick up their task only after other workers have already completed theirs.
That means the execution time is sometimes twice the expected time. I guess that it is caused by thread::yield(), which switches the thread out of the busy-waiting loop.
I am thinking about making a higher-performance (busier) poll that can still switch to an idle state when waiting for user input.
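One way the idea above could be sketched: spin for a bounded number of polls, then fall back to blocking on a condition variable so that idle workers stop burning CPU. This is only an illustration under assumptions, not the project's thread pool. `HybridGate`, `spin_limit`, and the generation counter are all hypothetical names.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical sketch: workers wait for a new "task generation".
class HybridGate {
public:
    // Dispatcher side: publish a new generation and wake any sleeping workers.
    void notify_all() {
        {
            // Incrementing under the mutex prevents a lost wakeup for a worker
            // that is just about to block in wait().
            std::lock_guard<std::mutex> lock(mutex_);
            generation_.fetch_add(1, std::memory_order_release);
        }
        cv_.notify_all();
    }

    // Worker side: return once the generation differs from `last_seen`.
    // Phase 1 polls up to `spin_limit` times (low latency, busy CPU).
    // Phase 2 blocks on the condition variable (idle CPU, slower wakeup).
    std::uint64_t wait(std::uint64_t last_seen, std::uint32_t spin_limit) {
        for (std::uint32_t i = 0; i < spin_limit; ++i) {
            std::uint64_t g = generation_.load(std::memory_order_acquire);
            if (g != last_seen) {
                return g;
            }
        }
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&] {
            return generation_.load(std::memory_order_acquire) != last_seen;
        });
        return generation_.load(std::memory_order_acquire);
    }

    std::uint64_t current() const {
        return generation_.load(std::memory_order_acquire);
    }

private:
    std::atomic<std::uint64_t> generation_{0};
    std::mutex mutex_;
    std::condition_variable cv_;
};
```

During token generation, tasks arrive back to back, so workers would normally be released in the spin phase. While waiting for user input, the spin budget runs out and the workers sleep. The right `spin_limit` depends on the machine and would need to be measured.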