Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement support for link time reordering #13176

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

nmeum
Copy link
Member

@nmeum nmeum commented Jan 21, 2020

Contribution description

Link time reordering is a protection mechanisms, implemented at compile-time,
which randomizes the memory layout of the resulting firmware image. As such, it
serves as a protection mechanisms against remote code executions which attempt
to reuse existing code. Consider, for example, the PoC exploit in #10739. This
exploit jumps to a GPIO pin toggle function in the text segment. If text
segment addresses differ for each build this would be more difficult to achieve
in a portable way.

Link time reordering is further described in a publication by Koviu et al.
[1, p. 394]. The technique is also utilized by some conventional operating
systems. For instance, OpenBSD uses this technique for randomizing text segment
addresses of the kernel, they refer to it as kernel address randomized link (KARL) [2].

Testing procedure

Verify that the text segement addresses differ for each build:

$ make -C examples/gnrc_networking
$ nm examples/gnrc_networking/bin/native/gnrc_networking.elf > /tmp/first_build
$ make -C examples/gnrc_networking
$ nm examples/gnrc_networking/bin/native/gnrc_networking.elf > /tmp/second_build
$ diff -u /tmp/first_build /tmp/second_build

Also make sure they don't when building with RIOTINSECURE=1.

Issues/PRs references

dist/tools/randsort/randsort.py Outdated Show resolved Hide resolved
Makefile.include Outdated Show resolved Hide resolved
@fjmolinas
Copy link
Contributor

Regarding the naming and the implications I'm not in favor on having re-ordering by default. During dev I would rather have it disabled unless specifying I'm building the production application or something like that. And also not mix up all the functionalities, maybe have e PRODUCTION_BUILD ?=0 that sets different features: LINK_TIME_REORDER ?= 0 DEFAULT_MODULE += ssp, etc....

@nmeum
Copy link
Member Author

nmeum commented Jan 21, 2020

During dev I would rather have it disabled unless specifying I'm building the production application or something like that

This is certainly something which should be considered. Arguments against a separation of production and development environment are that SSP, in particular, is especially useful in a development environment as it helps you spot unintentional buffer overflows, thereby easing debugging. Regarding link time reordering: One might argue that exact text segment address do not matter much for debugging as you will have debugging symbols in such an environment anyhow. There is also a usability argument to be made where you have to make sure that users don't end up accidentally using development images in production.

@kaspar030
Copy link
Contributor

Regarding link time reordering: One might argue that exact text segment address do not matter much for debugging as you will have debugging symbols in such an environment anyhow.

For debugging it is actually nice to be able to reproduce binaries. Randomizing them doesn't matter - until it does.

@nmeum
Copy link
Member Author

nmeum commented Jan 21, 2020

@kaspar030 I added a --seed command line flag to the randomsort utility. Additionally, I introduced an environment variable (RIOTLINKSEED). The value of this variable is passed as a --seed argument to randomsort by the buildsystem. Besides, I added a hardcoded RIOTLINKSEED variable to .murdock to not break the CI test cache. I believe this address your comments and marked them as resolved, let me know if this is not the case.

If desired we can also use a static seed when DEVELHELP is set. To resolve the concern regarding production vs. development environment.

@benpicco benpicco added Area: security Area: Security-related libraries and subsystems CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation labels Jan 21, 2020
@nmeum
Copy link
Member Author

nmeum commented Jan 22, 2020

ESP builds fail because the esp* Makefiles add non-files to BASELIBS shouldn't they modify LINKFLAGS instead?

diff --git a/cpu/esp32/Makefile.include b/cpu/esp32/Makefile.include
index e1d3102ca..0486e98f2 100644
--- a/cpu/esp32/Makefile.include
+++ b/cpu/esp32/Makefile.include
@@ -194,11 +194,11 @@ LINKFLAGS += -L$(ESP32_SDK_DIR)/components/esp32
 LINKFLAGS += -L$(ESP32_SDK_DIR)/components/esp32/lib
 
 ifneq (,$(filter esp_wifi_any,$(USEMODULE)))
-    BASELIBS += -lcore -lrtc -lnet80211 -lpp -lsmartconfig -lcoexist
-    BASELIBS += -lwps -lwpa -lwpa2 -lespnow -lmesh -lphy -lstdc++
+    LINKFLAGS += -lcore -lrtc -lnet80211 -lpp -lsmartconfig -lcoexist
+    LINKFLAGS += -lwps -lwpa -lwpa2 -lespnow -lmesh -lphy -lstdc++
 endif
 
-BASELIBS += -lhal -lg -lc
+LINKFLAGS += -lhal -lg -lc
 
 LINKFLAGS += -L$(RIOTCPU)/$(CPU)/ld/
 LINKFLAGS += -T$(RIOTCPU)/$(CPU)/ld/esp32.ld
diff --git a/cpu/esp8266/Makefile.include b/cpu/esp8266/Makefile.include
index ecf4b0d86..f17cf2805 100644
--- a/cpu/esp8266/Makefile.include
+++ b/cpu/esp8266/Makefile.include
@@ -179,10 +179,10 @@ endif
 LINKFLAGS += -L$(ESP8266_RTOS_SDK_DIR)/components/esp8266/lib
 LINKFLAGS += $(CFLAGS_OPT)
 
-BASELIBS += -lc -lgcc -lwpa -lcore -lnet80211 -lphy -lpp -lhal -lstdc++
+LINKFLAGS += -lc -lgcc -lwpa -lcore -lnet80211 -lphy -lpp -lhal -lstdc++
 
 ifneq (, $(filter esp_now, $(USEMODULE)))
-    BASELIBS += -lespnow
+    LINKFLAGS += -lespnow
 endif
 
 LINKFLAGS += -u _malloc_r

@nmeum
Copy link
Member Author

nmeum commented Jan 22, 2020

Fixed the ESP build failure. Now BOARD=waspmote-pro fails because it defines board_init twice. Once in boards/waspmote-pro/board.c and the other time boards/common/atmega/board.c how would you want this resolved?

@nmeum
Copy link
Member Author

nmeum commented Jan 22, 2020

The last test which does not build is tests/riotboot. It fails with the following error message when building riotboot.elf:

/usr/lib/gcc/arm-none-eabi/7.3.1/../../../arm-none-eabi/bin/ld: /hom/nmeum/RIOT/bootloaders/riotboot/bin/pba-d-01-kw2x/core/kernel_init.o: in function `main_trampoline':
/home/nmeum/RIOT/core/kernel_init.c:50: undefined reference to `main'

I did not manage to figure out yet which file is intended to provide the main function for riotboot.elf. Could someone enlighten me on this?

@kaspar030
Copy link
Contributor

I did not manage to figure out yet which file is intended to provide the main function for riotboot.elf. Could someone enlighten me on this?

There is none. Usually, core/kernel_init.c provides kernel_init(), which in turn sets up a stack calling main(). For the riotboot bootloader, bootloaders/riotboot/main.c overrides kernel_init() and does not use a main() at all. Seems like this (just) worked because of the default linking order.

@nmeum
Copy link
Member Author

nmeum commented Jan 22, 2020

makefiles/boot/riotboot.mk is really beyond my intellectual capacity. core/kernel_init.c needs to be disabled for riotbuild.elf but doing so would require rebuilding core/ which it doesn't at the moment? @kaspar030 are you interested in fixing this?

@nmeum
Copy link
Member Author

nmeum commented Jan 29, 2020

Pushed a fix for riotboot, the build passes on the CI now.

Link time reordering is a protection mechanisms, implemented at
compile-time, which randomizes the memory layout of the resulting
firmware image.

As such, it serves as a protection mechanisms against remote code
executions which attempt to reuse existing code. Link time reordering is
further described in a publication by Koviu et al. [1].

[1]: https://ieeexplore.ieee.org/abstract/document/7917121/
This allows reproducible builds if the same seed is used.
In order to not break the CI test cache.
The ESP Makefiles add non-files to BASELIBS. To work around this issue
use printf to convert BASELIBS to newline separated input for randsort.
Otherwise it is unset for the recursive make invocation and newly
generated.
@nmeum
Copy link
Member Author

nmeum commented Jan 31, 2020

Rebased against master, all linking fixes have been merged now. As such, the build passes on the CI now without requiring any unrelated changes. Besides, @kaspar030 I believe I addressed all of your comments. Could you take another look at this PR? Does it require further changes or can it be merged as is?

Copy link
Contributor

@kaspar030 kaspar030 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit unhappy with the integration.

@@ -4,6 +4,7 @@
: ${TEST_BOARDS_LLVM_COMPILE:="iotlab-m3 native nrf52dk mulle nucleo-f401re samr21-xpro slstk3402a"}

export RIOT_CI_BUILD=1
export RIOTLINKSEED=2342
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With ssp, compile/link time random value and now this, it would be the third user of a build time random variable. Can we maybe unify this properly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, what about adding a REPRODUCIBLE_BUILD environment variable which deactivates SSP and uses a constant seed?

# Save value to verify it is not modified later
_BASELIBS_VALUE_BEFORE_USAGE := $(BASELIBS)

# Linker rule
$(ELFFILE): FORCE
ifeq ($(BUILDOSXNATIVE),1)
_LINK = $(if $(CPPMIX),$(LINKXX),$(LINK)) $(UNDEF) $$(find $(BASELIBS) -size +8c) $(LINKFLAGS) $(LINKFLAGPREFIX)-no_pie
_LINK = $(if $(CPPMIX),$(LINKXX),$(LINK)) $(UNDEF) $(SHUFLIBS) $(LINKFLAGS) $(LINKFLAGPREFIX)-no_pie
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the use of SHUFLIBS here. It makes it look as if libraries are always shuffled.
Can we have e.g., "LINKLIBS" here, and above, use sth like:

if KALR:
  LINKLIBS = $(shuffle $(BASELIBS))
else:
  LINKLIBS = $(BASELIBS)

Copy link
Member Author

@nmeum nmeum Feb 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the OSX specific find(1) invocation adding another if/else is a bit annoying. Is the find(1) invocation on OSX really strictly necessary?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the find(1) invocation on OSX really strictly necessary?

I think we have to assume that. (IIRC there was an issue with empty archives...).

Maybe we can factor that ifdef out of ifeq ($(BUILDOSXNATIVE),1)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am ok with renaming the variable from SHUFLIBS to LINKLIBS. I don't think that it makes that much sense to add an additional disable/enable switch though. If you want a deterministic order just set RIOTLINKSEED.

From the ld(1) --start-group documentation:

The specified archives are searched repeatedly until no new undefined references are created.

Don't really see which benefit the "default" order provides then.

# Randomize library order to protect against code re-use attacks (link time reordering)
RIOTLINKSEED ?= -1
ifeq ($(BUILDOSXNATIVE),1)
SHUFLIBS = $(shell find $(BASELIBS) -size +8c | $(RIOTTOOLS)/randsort/randsort.py --seed $(RIOTLINKSEED))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid calling out to python? On Linux, there should be "shuf". OS X doesn't have that (by default). I think something could be built with "jot | ...".

Or we start requiring some homebrew tools on OS X...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was under the impression that python was required anyhow. I would try to avoid using non-POSIX utilities such as shuf. Besides, those utilities may not allow specifying a seed argument and may not use cryptographic secure random sources.

Or we start requiring some homebrew tools on OS X...

Why prefer a different third-party utility over the randsort script which doesn't have any external dependencies apart from python itself?

Copy link
Contributor

@kaspar030 kaspar030 Feb 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be required, but should be avoided.
python is slow. e.g.:

$ time (echo -n  foo bar 1234 | python3 randsort.py)
foo bar 1234
real    0m0.418s
user    0m0.383s
sys     0m0.039s

That's on a raspi3. Not the fastest, but that's almost half a second for shuffling three values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can rewrite it in C if performance is an issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just thought of this again and the fact that takes half a second for shuffling three values is likely attributable to the initialization of the python interpreter. As such, shuffling more values should not take significantly longer. Besides, half a second is not really that big a deal for a build process in my opinion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using shuf when available and using randsort.py as a fallback?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? This would just unnecessarily complicate the entire setup. The randsort.py script doesn't have any disadvantages unless you are trying to claim that half a second is a noticeable build slowdown.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Half a second is not a noticeable slowdown, but CI currently does 58746 builds for each PR.

@nmeum
Copy link
Member Author

nmeum commented Feb 17, 2020

@kaspar030 how would you like to proceed with this? I would be willing to make further adjustments to this PR and with #13387 being open source as well now I would definitely like to see this merged as I believe that this is an important technique for making circumvention of executable space protections harder.

I can rewrite the randsort.py script in C if ~30ms time increase per build are really that important to you, though I seriously doubt that it is worth it. I also added comments to the other threads created by you.

@benpicco
Copy link
Contributor

I'm not yet convinced how much this benefits security. You write

If text segment addresses differ for each build this would be more difficult to achieve
in a portable way.

But typically you have one firmware image and run that on many devices. So while the addresses will differ between different firmware versions, if you find a vulnerability in the firmware, you can still exploit all devices.

It's still useful to make it harder to exploit undiscovered bugs across firmware updates, but it should

  • be optional since it makes debugging harder (we should have a compile flag to enable all the security features in one go though)
  • be documented that this 'static' randomization can not be as effective as the randomization done on architectures where you execute from RAM.

@nmeum
Copy link
Member Author

nmeum commented Mar 24, 2020

But typically you have one firmware image and run that on many devices. So while the addresses will differ between different firmware versions, if you find a vulnerability in the firmware, you can still exploit all devices.

That it is absolutely correct, but at least it differs now for different firmware versions or different variants of the firmware. Also consider open source firmware software, compiled by the users and flashed directly to the devices. This is just a cheap mitigation technique for code reuse attacks which would circumvent executable space protections as provided by mpu_noexec_ram, it doesn't solve the code reuse attack problem entirely but it's better than having no mitigations for code reuse attacks at all and it doesn't come with any disadvantages really. Implementing run-time address layout randomization (i.e. ASLR) is not feasible without a secure random number generator, a memory management unit, and a virtual memory implementation. None of which are commonly available on RIOT.

be optional since it makes debugging harder (we should have a compile flag to enable all the security features in one go though)

I agree that one compile flag would be useful, that flag could simply cause RIOTLINKSEED=1 to be set or something along the lines.

be documented that this 'static' randomization can not be as effective as the randomization done on architectures where you execute from RAM.

I agree, where should that documentation go?

@benpicco
Copy link
Contributor

benpicco commented Mar 24, 2020

I agree, where should that documentation go?

My first idea would be to create a (pseudo) module security or hardening where all security related information can be collected in one place. (and enable all the relevant options when used)

But afaik you'd have to do that all in header files, so it'd be kind of clunky.

But you can just create a new page in doc/doxygen/src.
This has the advantage that it's easily discoverable on https://doc.riot-os.org

That can be a separate PR.

I still think link time reordering should not be enabled by default as it adds inconvenience with only marginal benefit.
But have some kind of hardening option/module that enables all these security features in one go
This way it can be easily enabled for release and doesn't inconvenience debugging or new users who learn about the system.

@MrKevinWeiss MrKevinWeiss added this to the Release 2021.07 milestone Jun 21, 2021
@MrKevinWeiss MrKevinWeiss removed this from the Release 2021.07 milestone Jul 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: security Area: Security-related libraries and subsystems CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants