-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove -m-no-red-zone on AMD64 #292
Comments
Interesting, the red zone business seems very subtle, which is why I guess most kernels just avoid it. But you are right, we certainly don't need it in the hvt case. I would lean towards removing it entirely (option 2 above). It seems the only issue is the virtio case. A couple of questions: in virtio, does linking code that has been built with and without red zone (option 1) cause any issues, in other words would you need the IST partitioning scheme even if you only changed flags for "user" code? And second, would it be easy to see if there are issues with the IST partitioning scheme? It seems pretty racy and I'm not sure how often you get an interrupt clobber when not being careful about the red zone. |
@djwillia I think I explained the situation on our call today, let me know if you need any recap here? |
See also https://www.kernel.org/doc/Documentation/x86/kernel-stacks, which gives a good explanation of the IST mechanism on x86_64. Notably, what I did not realise but makes sense in hindsight, IST-using interrupt or trap handlers cannot nest safely. This needs some more thought about the implications for virtio; it's possible we have a race condition between virtio-net and timer interrupts even in the current setup, although it must be extremely rare since no one has reported it. In order to be 100% safe, it would be best for virtio to always use separate ISTs for the timer and virtio-net. |
This is a large change, and the majority of it is effectively a full re-write of the build system. The actual removal of hvt compile-time specialization is fairly straightforward. User-visible changes: - 'configure.sh' now needs to be run manually before running 'make', this is consistent with POLA. - conversely, 'make clean' no longer cleans Makeconf. Use 'distclean' or 'clobber' for that. - 'configure.sh' will now print the targets that can (will) be built on this system. The strategy is still "build everything we can", however I have disabled Genode on all systems except Linux due to toolchain issues. - You can now build a subset of targets from the top-level 'make', by specifying 'CONFIG_XXX=' (disable) or 'CONFIG_XXX=1' (enable) either on the command line, or editing the generated Makeconf. - Makefiles use silent rules by default. To get the old verbose ones back, use 'make V=1'. - The 'solo5-hvt' tender is no longer "specialized" to the unikernel. We build two tenders, 'solo5-hvt' with all non-debug modules configured and 'solo5-hvt-debug' with additional debug modules (gdb, dumpcore where available). - 'solo5-hvt-configure' is kept around for now for backward compatibility with OPAM/MirageOS but is essentially a NOP. Developer-visible changes: - The build system now has proper support for auto-generation of dependencies. This means you can safely edit source files, run make and be sure you will get a complete incremental build. - Makefiles have been refactored to use common best practices, remove repetition, consistent variable names and clear interfaces between configure.sh/Makeconf/Makefiles, all the while keeping them simple enough to understand for me on a Monday morning before coffee. I.e. limit use of macros, eval, etc. - hvt tender modules are no longer defined by compile-time flags, instead a dynamic array is placed into a special ELF section (.modules). This means that a hvt tender binary can be combined from an arbitrary set of hvt_module_XXX object files, which is the right way to do things going forward and also simplifies the build system (not needing to build multiple targets from the same set of sources). Shortcomings / TODOs: - Dependency files (*.d) are stored in-tree. I spent several days on trying to figure out how to get them to work out of tree, but in combination with the non-recursive use of subdirectories in 'bindings' I could not figure out the required Makefile magic. - HVT_DROP_PRIVILEGES=0 is non-functional with the new modules arrangement, but needs a re-design anyway. Other changes included as part of this PR: - Revert privilege dropping on FreeBSD (see discussion in Solo5#282). - The build system changes effectively implement option 1 in Solo5#292, i.e. on x86_64 -m no-red-zone is only used for bindings, not for application code. - tests/tests.bats has been refactored for DRY as it was getting totally unmaintainable.
#345 rationalized the various build flags, and in the process implements option 1 here, so |
We have been carrying
-m-no-red-zone
for AMD64 for all code (bindings themselves and "user" code) since the dawn of time. Since then, the CPU interrupt handling and trap handling code has been refactored to use ISTn, i.e. separate stacks for interrupt and trap handlers.We should consider removing
-m-no-red-zone
. There are two options:The first option seems entirely safe -- "user" code will never run in an interrupt context, and any trap/interrupt handlers in bindings currently run on a separate stack via ISTn.
The second option should be possible with a minor tweak to our usage of ISTn to accomodate virtio (hvt does not use hardware interrupts). Currently, we have (from bindings/cpu_x86_64.c):
Given that on virtio it is theoretically possible for IRQ0 (timer) to nest with IRQx (virtio_net), I think we want something like this:
References: System V Application Binary Interface x86-64 Architecture Processor Supplement, section 3.2.2, also this stackoverflow answer.
Any opinions? @djwillia @ricarkol?
The text was updated successfully, but these errors were encountered: