Sched/rr #17
return len;
}

extra space again
Also remove support for runtime scheduler switching; it is no longer needed.
This one is based on a statically allocated per-CPU array instead of a shared linked list. get() no longer takes a lock and should therefore be much faster. There should also be a small improvement on NUMA systems, because get() now references only a local per-CPU variable. The debugfs wrapper implementation is also simplified.
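For readers skimming the thread, here is a minimal user-space sketch of the idea described above: one statically allocated state slot per CPU, with a lock-free get() that touches only its own slot. It is not the code from this PR. The names `rr_state`, `rr_add`, `rr_get`, `NR_CPUS_SKETCH` and `MAX_SERVERS` are hypothetical, the CPU index is passed explicitly instead of being taken from the kernel's per-CPU machinery, and synchronization of writers against readers is deliberately left out.

```c
#include <stddef.h>

#define NR_CPUS_SKETCH 4   /* hypothetical: stands in for the number of CPUs */
#define MAX_SERVERS    8   /* hypothetical: capacity of each per-CPU array */

struct server {
    const char *name;
};

/* One independent copy of the scheduler state per CPU. */
struct rr_state {
    unsigned int counter;               /* next position in the rotation */
    size_t n;                           /* number of registered servers */
    struct server *servers[MAX_SERVERS];
};

/* Statically allocated, zero-initialized: counter = 0, servers = { NULL }. */
static struct rr_state per_cpu_rr[NR_CPUS_SKETCH];

/*
 * Register a server in every CPU's copy. Returns 0 on success, -1 if full.
 * Assumption: this slow path is serialized by the caller; a real kernel
 * implementation would also have to synchronize with concurrent readers.
 */
static int rr_add(struct server *srv)
{
    if (per_cpu_rr[0].n >= MAX_SERVERS)
        return -1;
    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
        struct rr_state *st = &per_cpu_rr[cpu];
        st->servers[st->n++] = srv;
    }
    return 0;
}

/*
 * Pick the next server for the given CPU. No lock is taken because only
 * this CPU's slot is read and written.
 */
static struct server *rr_get(int cpu)
{
    struct rr_state *st = &per_cpu_rr[cpu];

    if (st->n == 0)
        return NULL;
    return st->servers[st->counter++ % st->n];
}
```

Because each CPU keeps its own counter, the rotation is round-robin per CPU rather than globally: two CPUs calling `rr_get()` at the same moment may return the same server.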
.counter = 0,
.servers = { NULL }
};

BTW,
CPU != socket != NUMA node
We have a local copy for each core (or even for each "virtual" core in the case of Hyper-Threading).
That should have a positive performance impact for non-shared caches, but a negative impact for shared caches.
For non-shared caches (presumably L1d for "real" cores) it should eliminate cache line bouncing.
For shared caches (presumably L2 and L3) the extra copies increase the chance of lines being evicted by the LRU algorithm.
I can't estimate which factor would dominate on a real system.
That's OK for now. More fine-grained optimization is only possible with real benchmarks.
related to: #5 - Load balancing
The round-robin scheduler implementation.
Basic round-robin scheduler implementation.