Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

memory protection for x86 dependent on XIP #18956

Closed
ghost opened this issue Sep 5, 2019 · 6 comments · Fixed by #19062
Closed

memory protection for x86 dependent on XIP #18956

ghost opened this issue Sep 5, 2019 · 6 comments · Fixed by #19062
Assignees
Labels
area: Memory Protection area: X86 x86 Architecture (32-bit) bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug
Milestone

Comments

@ghost
Copy link

ghost commented Sep 5, 2019

If you disable CONFIG_XIP for x86, userspace/memory protection tests begin to fail.

Out of the box:
charles@cyouse-desk1:~/zephyr/zephyr$ sanitycheck -p qemu_x86 -T samples/userspace/shared_mem
JOBS: 6
Cleaning output directory /home/charles/zephyr/zephyr/sanity-out
Building testcase defconfigs...
Filtering test cases...
1 tests selected, 196 tests discarded due to filters
total complete: 1/ 1 100% failed: 0
1 of 1 tests passed with 0 warnings in 9 seconds

Modify the boards/x86/qemu_x86/Kconfig.board to comment out select XIP and then re-run:

$ vi boards/x86/qemu_x86/Kconfig.board
$ sanitycheck -p qemu_x86 -T samples/userspace/shared_mem
JOBS: 6
Cleaning output directory /home/charles/zephyr/zephyr/sanity-out
Building testcase defconfigs...
Filtering test cases...
1 tests selected, 196 tests discarded due to filters
total complete: 0/ 1 0% failed: 0

qemu_x86 samples/userspace/shared_mem/sample.kernel.memory_protection.shared_mem FAILED: handler_crash
see: sanity-out/qemu_x86/samples/userspace/shared_mem/sample.kernel.memory_protection.shared_mem/handler.log

total complete: 1/ 1 100% failed: 1
0 of 1 tests passed with 0 warnings in 5 seconds

@ghost ghost added bug The issue is a bug, or the PR is fixing a bug area: X86 x86 Architecture (32-bit) area: Memory Protection labels Sep 5, 2019
@ghost ghost self-assigned this Sep 5, 2019
@ghost
Copy link
Author

ghost commented Sep 5, 2019

@andrewboie There are 50-odd tests in sanitycheck that fail when XIP is deselected. I assume they're all related, and will share one (trivial) solution. As far as I can tell this problem existed before the recent implementation of per-thread page tables. I'll be looking into this - it's valuable to me as an exercise in getting to know the gritty details of the x86 userspace implementation - but I wanted to draw this to your attention in case you know of an obvious "whoopie" here.

@andrewboie
Copy link
Contributor

I agree this probably has a single, fairly trivial fix. QEMU had XIP turned on by default for a long time (years) so it seems a matter of the non-XIP scenario never having been tested.

Can you post the error log with the console output?

@ghost
Copy link
Author

ghost commented Sep 5, 2019

Booting from ROM..Optimal CONFIG_X86_MMU_PAGE_POOL_PAGES 9
***** Booting Zephyr OS build v2.0.0-rc3-9-gd0ef464201b9 *****
FATAL: ***** CPU Page Fault (error code 0x00000000)
FATAL: Supervisor thread read address 0x00000000
FATAL: PDE: 0x0127 Present, Writable, User, Execute Enabled
FATAL: PTE: 0x00 Non-present, Read-only, Supervisor, Execute Enabled
FATAL: eax: 0x0012b000, ebx: 0x00001000, ecx: 0x0012c000, edx: 0x00001000
FATAL: esi: 0x00000000, edi: 0x0012b000, ebp: 0x0013b290, esp: 0x0013b284
FATAL: eflags: 0x00000212 cs: 0x0008 cr3: 0x0011d500
FATAL: call trace:
FATAL: eip: 0x00103189
FATAL:      0x001029ce (0x12b000)
FATAL:      0x0010234a (0x11d100)
FATAL:      0x00104e78 (0x11d100)
FATAL:      0x00104ee8 (0x11d100)
FATAL:      0x0010055d (0x11d100)
FATAL:      0x0010086b (0x143000)
FATAL:      0x00103775 (0x13b3e4)
FATAL:      0x001014ec (0x0)
FATAL: >>> ZEPHYR FATAL ERROR 0: CPU exception
FATAL: Current thread: 0x0011d5a0 (unknown)
FATAL: Halting system

@nashif nashif added the priority: low Low impact/importance bug label Sep 6, 2019
@nashif nashif added this to the v2.1.0 milestone Sep 6, 2019
@andrewboie
Copy link
Contributor

andrewboie commented Sep 10, 2019

The crash is in the code that constructs page tables for new user threads.

On further analysis, DT_PHYS_RAM_ADDR is being set wrong for qemu_x86 if CONFIG_XIP is turned off, it had a value of 0x500000. But with XIP turned off, RAM spans from 0x100000 - 0x4FFFFF.

The below code works for me, although tests/kernel/xip unsurprisingly crashes. The rest pass. Below code snippet is not a fix, it warns about overlapping the "flash" region.

diff --git a/boards/x86/qemu_x86/Kconfig.board b/boards/x86/qemu_x86/Kconfig.board
index 0ac4e6d036..3893e62575 100644
--- a/boards/x86/qemu_x86/Kconfig.board
+++ b/boards/x86/qemu_x86/Kconfig.board
@@ -1,9 +1,4 @@
 # SPDX-License-Identifier: Apache-2.0
-#
-# The QEMU targets themselves are not XIP, everything is actually RAM, but we
-# pretend the first 4 megabytes are a memory-mapped flash region. This is done
-# to ensure that the XIP data copying infrastructure doesn't bit-rot on
-# x86.
 
 config BOARD_QEMU_X86
        bool "QEMU x86"
@@ -12,4 +7,3 @@ config BOARD_QEMU_X86
        select HAS_DTS_ETHERNET
        select CPU_HAS_FPU
        select HAS_COVERAGE_SUPPORT
-       select XIP
diff --git a/dts/x86/ia32.dtsi b/dts/x86/ia32.dtsi
index 3b203fdc7b..3691fff61b 100644
--- a/dts/x86/ia32.dtsi
+++ b/dts/x86/ia32.dtsi
@@ -33,10 +33,10 @@
                reg = <0x00100000 DT_FLASH_SIZE>;
        };
 
-       sram0: memory@500000 {
+       sram0: memory@100000 {
                device_type = "memory";
                compatible = "mmio-sram";
-               reg = <0x00500000 DT_SRAM_SIZE>;
+               reg = <0x00100000 DT_SRAM_SIZE>;
        };
 
        soc {

I think we need to move some policy out of ia32.dtsi. There shouldn't be a flash0 region defined there at all, and sram0 should start at 1MB.

If qemu_x86 wants to test flash and xip this stuff needs to be moved qemu_x86's dts and not in the main ia32 one.

Still looking into this to find the best fix.

@andrewboie
Copy link
Contributor

So this is definitely a DTS/board config problem and not a userspace one.

I'm working on a PR which does the following:

  • IA32 SOC will no longer define a flash region
  • IA32 SOC will start RAM at 1MB
  • IA32 SOC will allow XIP to be enabled, but the board-level DTS will have to define a flash region
  • qemu_x86 will have XIP off by default, but tests like tests/kernel/xip will be able to turn it on for testing purposes
  • Anything related to apollo_lake SOC will have flash0 removed, no ability to enable XIP, SRAM always starting at 1MB, it doesn't make sense to support XIP for apollo_lake SOC

This will have the effect of testing both XIP and non-XIP on qemu_x86 to ensure it doesn't bit-rot, with qemu's default being non-XIP.

@ghost
Copy link
Author

ghost commented Sep 10, 2019

Just FYI, that page allocator I mentioned last week will be submitted this afternoon. The "SRAM" information from DTS is going to be a "default" memory map, which can be overridden by a memory map obtained at runtime (from EFI/ACPI/multiboot, or whatever).

There's going to be some overlap here, and to be complete, will probably require some minor changes to the early MMU setup (since the memory map isn't necessarily defined by constants like DT_PHYS_RAM_ADDR anymore) and newlib's heap stuff. All minor details though that I think won't collide with what you're doing here.

andrewboie pushed a commit to andrewboie/zephyr that referenced this issue Sep 11, 2019
XIP support in x86 was something of a mess. This
patch does the following:

- Generic ia32 SOC no longer defines a "flash" region
  as generic X86 devices don't have a microcontroller-
  like concept of flash. The same has been done for apollo_lake.
- Generic ia32 and apollo_lake SOCs starts memory at 1MB.
- Generic ia32 SOC may optionally have CONFIG_XIP enabled.
  The board definition must provide a flash region definition
  that gets exposed as DT_PHYS_LOAD_ADDR.
- Fixed definitions for RAM/ROM source addresses in ia32's
  linker.ld when XIP is turned off.
- Support for enabling XIP on apollo_lake SOC removed, there's
  no use-case.
- acrn and gpmrb boards have flash and XIP related definitions
  removed.
- qemu_x86 has a fake flash region added, immediately after system
  RAM, for use when XIP is enabled. This used to be in the ia32 SOC.
  However, the default for qemu_x86 is to now have XIP disabled.
- Fixed tests/kernel/xip to run by default on boards that enable
  XIP by default, plus an additional test to exercise XIP on
  qemu_x86 (which supports it but has XIP switched off by default)

The overall effect of this patch is to:

- Remove XIP configuration for SOC/boards where it does not make
  any sense to have it
- Support testing XIP on qemu_x86 via tests/kernel/xip, but leave
  it off by default for other tests, to ensure it doesn't bit-rot
  and that the system works in both scenarios.
- XIP remains an available feature for boards that need it.

Fixes: zephyrproject-rtos#18956

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
nashif pushed a commit that referenced this issue Sep 12, 2019
XIP support in x86 was something of a mess. This
patch does the following:

- Generic ia32 SOC no longer defines a "flash" region
  as generic X86 devices don't have a microcontroller-
  like concept of flash. The same has been done for apollo_lake.
- Generic ia32 and apollo_lake SOCs starts memory at 1MB.
- Generic ia32 SOC may optionally have CONFIG_XIP enabled.
  The board definition must provide a flash region definition
  that gets exposed as DT_PHYS_LOAD_ADDR.
- Fixed definitions for RAM/ROM source addresses in ia32's
  linker.ld when XIP is turned off.
- Support for enabling XIP on apollo_lake SOC removed, there's
  no use-case.
- acrn and gpmrb boards have flash and XIP related definitions
  removed.
- qemu_x86 has a fake flash region added, immediately after system
  RAM, for use when XIP is enabled. This used to be in the ia32 SOC.
  However, the default for qemu_x86 is to now have XIP disabled.
- Fixed tests/kernel/xip to run by default on boards that enable
  XIP by default, plus an additional test to exercise XIP on
  qemu_x86 (which supports it but has XIP switched off by default)

The overall effect of this patch is to:

- Remove XIP configuration for SOC/boards where it does not make
  any sense to have it
- Support testing XIP on qemu_x86 via tests/kernel/xip, but leave
  it off by default for other tests, to ensure it doesn't bit-rot
  and that the system works in both scenarios.
- XIP remains an available feature for boards that need it.

Fixes: #18956

Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Memory Protection area: X86 x86 Architecture (32-bit) bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug
Projects
None yet
2 participants