Idea for alternative implementation of syscall stack #972

Open
nyh opened this issue May 15, 2018 · 0 comments

nyh commented May 15, 2018

This is an idea that I'm just writing here for future reference. We don't have to rush it, and it may not be a good idea at all - it's certainly more complicated than what we have now, and I'm not even sure we can make it work.

The current syscall stack implementation (see issue #808) gives every thread a separate small (1024-byte) stack, which is used only to allocate a bigger stack when needed. The 1024-byte overhead for every application thread, even one that never makes a syscall, is not much, but it's still inelegant.

An alternative approach could be to have a large per-cpu (not per-thread) syscall stack. When we start a syscall we switch to that per-cpu stack, and the first thing we do on this stack is allocate a new syscall stack to be used by the next syscall on this CPU (in case the current syscall is preempted). When a syscall finishes, it either frees its stack or remembers it in a one-element stack pool, to avoid very frequent allocation and deallocation of these stacks (especially in the common case where system calls run one after another, without preemption in the middle of a system call).
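
A rough C++ sketch of just the bookkeeping described above (not the actual stack switch, which would have to happen in the entry assembly). All names here (`percpu_syscall_state`, `syscall_enter`, `syscall_exit`, and so on) are made up for illustration, `malloc` stands in for whatever allocator would really be used, and the sketch assumes nothing can preempt us between taking the ready stack and installing its replacement:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical size; the issue mentions 64 KB per core.
constexpr std::size_t syscall_stack_size = 64 * 1024;

// Hypothetical per-cpu state. In a real kernel there would be one per CPU.
struct percpu_syscall_state {
    void* ready = nullptr;  // stack the next syscall on this CPU switches to
    void* spare = nullptr;  // one-element pool holding a released stack
    int allocations = 0;    // counts malloc calls, for illustration only
};

// Get a stack, preferring the one-element pool over a fresh allocation.
inline void* acquire_stack(percpu_syscall_state& cpu) {
    if (cpu.spare) {
        void* s = cpu.spare;
        cpu.spare = nullptr;
        return s;
    }
    ++cpu.allocations;
    return std::malloc(syscall_stack_size);
}

// Run once per CPU before the first syscall can arrive.
inline void percpu_init(percpu_syscall_state& cpu) {
    cpu.ready = acquire_stack(cpu);
}

// Syscall entry: take the ready stack, then immediately install a
// replacement so a syscall that runs while we are preempted has its own.
inline void* syscall_enter(percpu_syscall_state& cpu) {
    assert(cpu.ready != nullptr);
    void* mine = cpu.ready;
    cpu.ready = acquire_stack(cpu);
    return mine;
}

// Syscall exit: park the stack in the pool if it is empty, else free it.
inline void syscall_exit(percpu_syscall_state& cpu, void* mine) {
    if (!cpu.spare) {
        cpu.spare = mine;
    } else {
        std::free(mine);
    }
}
```

With back-to-back syscalls and no preemption, the two stacks simply alternate between `ready` and `spare`, so after the second allocation no further `malloc` or `free` happens. A third stack is only allocated when a second syscall enters while the first is still in flight.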

The most difficult part of this suggestion is how to access the per-cpu variable in the syscall entry assembly code. It's not as easy as accessing the per-thread TCB, which we do now with fixed offsets from %fs. I'm worried that we will need calculations that use extra registers, which we don't have (although #808 suggested a few tricks to try to buy us one extra register). Perhaps we could also start using the %gs segment for per-cpu variables, which we haven't done so far (we only use %fs, for per-thread variables).
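
To make the %gs idea a bit more concrete, here is a small C++ sketch of the layout side only. It assumes (hypothetically) that each CPU's %gs base would point at a small per-cpu structure, so the entry assembly could use fixed offsets the same way it does with %fs today. The struct name, its fields, and the assembly in the comment are all illustrative, not existing code:

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical per-cpu area that the %gs base of each CPU would point to.
struct percpu_syscall_area {
    std::uint64_t saved_user_rsp;     // scratch slot for the caller's %rsp
    std::uint64_t syscall_stack_top;  // top of this CPU's ready syscall stack
};

// offsetof is only guaranteed to be meaningful for standard-layout types.
static_assert(std::is_standard_layout<percpu_syscall_area>::value,
              "assembly relies on fixed field offsets");

constexpr std::size_t saved_user_rsp_offset =
    offsetof(percpu_syscall_area, saved_user_rsp);
constexpr std::size_t syscall_stack_top_offset =
    offsetof(percpu_syscall_area, syscall_stack_top);

// Pin the offsets so a layout change breaks the build instead of the asm.
static_assert(saved_user_rsp_offset == 0, "entry asm assumes offset 0");
static_assert(syscall_stack_top_offset == 8, "entry asm assumes offset 8");

// With such a layout, the entry code could in principle switch stacks
// without a free general-purpose register (illustrative only):
//     movq %rsp, %gs:0     # save the caller's stack pointer
//     movq %gs:8, %rsp     # switch to the per-cpu syscall stack
```

Whether this actually removes the need for an extra register depends on what else the entry path has to do before the replacement stack is installed, so treat it as a starting point rather than a solution.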

To avoid any overhead, even per-cpu overhead, in applications which never use SYSCALL, we could have the syscall handler first set to a function which allocates the per-cpu stacks - and then resets the handler to the function which assumes these are allocated. But I doubt the overhead of 64 KB per core (a per-cpu syscall stack) is worth fussing about.
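
The handler-swapping trick could look roughly like the following C++ sketch. A plain function pointer stands in for the real syscall entry point (on x86-64 the SYSCALL target lives in the per-CPU `IA32_LSTAR` MSR, so a real version would presumably have to rewrite that instead), a single global stands in for the per-cpu stacks, and all names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

using syscall_handler = long (*)(long number);

long steady_state_handler(long number);
long first_call_handler(long number);

// Stand-ins for real state: one "per-cpu" stack and the installed handler.
void* lazy_percpu_stack = nullptr;
int lazy_setup_runs = 0;
syscall_handler installed_handler = first_call_handler;

// Normal handler: assumes the per-cpu stack already exists.
long steady_state_handler(long number) {
    assert(lazy_percpu_stack != nullptr);
    return number;  // placeholder for dispatching the real syscall
}

// Installed initially: allocates the stack, swaps itself out, then
// handles the syscall that triggered it.
long first_call_handler(long number) {
    lazy_percpu_stack = std::malloc(64 * 1024);
    ++lazy_setup_runs;
    installed_handler = steady_state_handler;
    return steady_state_handler(number);
}

// Models the SYSCALL instruction jumping to whatever handler is installed.
long do_syscall(long number) {
    return installed_handler(number);
}
```

An application that never issues SYSCALL never runs `first_call_handler`, so it pays nothing. One that does pays the allocation once, and every later call goes straight to `steady_state_handler`.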
