staticx generated binary segfaults when built on Ubuntu 18.04 Github hosted actions runner #198

Closed
jay0lee opened this issue Oct 6, 2021 · 14 comments

Comments

@jay0lee

jay0lee commented Oct 6, 2021

Any binary I build with PyInstaller and StaticX segfaults when it is built on Ubuntu 18.04 on GitHub Actions hosted runners. Here's a PoC project:

https://github.com/jay0lee/actions-hello-world/runs/3807920931

It seems to be an Ubuntu 18.04-specific issue, since the same steps build a valid binary on Ubuntu 20.04.

I suspect this is the same issue as the one seen when using actions/setup-python, but I can work to confirm that.

@JonathonReinhart
Owner

Thanks @jay0lee for opening this issue.

I'll also bring forward this useful comment from you on #188:

Also FWIW I did test taking the StaticX binary built on Ubuntu 20.04 and running it on Debian 9 and CentOS 7. Both worked, which is good enough for me. For now at least I'll just generate my legacy Linux static build on 20.04.

Do you think you could update your project to capture dist/helloworld (PyInstaller output) and dist/helloworld-staticx (Staticx output)? It looks like this should be straightforward with actions/upload-artifact. I'd like to dig in to the 18.04 binary to see exactly what it's loading and why it's segfaulting.

@jay0lee
Author

jay0lee commented Oct 7, 2021

Thanks Jonathon,

Here's the compiled binaries:

https://drive.google.com/file/d/1vDya8-HeUGfqYcMp9UM5U2GTeQ1Y49I8/view?usp=sharing

and if you need to see the run that generated them it's at:

https://github.com/jay0lee/actions-hello-world/runs/3826354929?check_suite_focus=true

@JonathonReinhart
Owner

FYI: No need to upload to drive; they're available at the bottom of the workflow summary page.

Wow:

(gdb) r
Starting program: /wherever/helloworld-staticx 
During startup program terminated with signal SIGSEGV, Segmentation fault.

I can't say I've ever seen that before. I'm not sure it will help but can you run staticx with --debug? Not only does it enable debug output for the builder, but it also switches to a debug version of the bootloader which might help in this case.

@jay0lee
Author

jay0lee commented Oct 8, 2021 via email

@JonathonReinhart
Owner

gdb provides no useful information whatsoever because the program is crashing immediately. This makes me suspect something is wrong with the ELF file itself.


The python-ptrace project provides an alternate Python implementation of gdb called gdb.py which is actually more useful.

$ alias noaslr='setarch $(uname -m) -R'
$ noaslr gdb.py ./1319221677/helloworld-staticx
------------------------------------------------------------
PID: 87691
Signal: SIGSEGV
Invalid memory access to NULL
- mapping: NULL is not mapped in memory
------------------------------------------------------------
(gdb) regs
     r15 = 0x0000000000000000
     r14 = 0x00007ffff735b300
     r13 = 0x00007ffff735b318
     r12 = 0x00007ffff73649c0
     rbp = 0x00007ffff786ebf0
     rbx = 0x0000000000963ef0
     r11 = 0x0000000000000202
     r10 = 0xfffffffffffff28e
      r9 = 0x0000000000000360
      r8 = 0x00000000009076e0
     rax = 0xffffffffffffffff
     rcx = 0x00007ffff7cf26c7
     rdx = 0x00000000009fa130
     rsi = 0x00007ffff786ebf0
     rdi = 0x00007ffff7361cd0
orig_rax = 0x000000000000003b
     rip = 0x00007ffff7cf26c7
      cs = 0x0000000000000033
  eflags = 0x0000000000000202
     rsp = 0x00007fffffffc698
      ss = 0x000000000000002b
 fs_base = 0x00007ffff7c24740
 gs_base = 0x0000000000000000
      ds = 0x0000000000000000
      es = 0x0000000000000000
      fs = 0x0000000000000000
      gs = 0x0000000000000000
(gdb) maps
MAPS: 0x00007ffffffde000-0x00007ffffffff000 => [stack] (rw-p)
MAPS: 0xffffffffff600000-0xffffffffff601000 => [vsyscall] (r-xp)

It looks like the kernel is handing control over to the bootloader with nothing from the executable actually mapped.


At this point, I suspect that either:

  • Your compiler toolchain built a terribly incorrect version of the bootloader
    • Assuming you installed staticx from source and not a wheel
  • Your version of patchelf horribly mangled the bootloader while patching it
    • This is my suspicion

I released v0.13.2 which adds a bunch more logging at startup to identify the tools being used and their versions. I'd start by recommending that you do another run with staticx v0.13.2 and --debug.

Then, I'd like to see the output of readelf -hlSW helloworld-staticx. I can do that locally by grabbing your artifact. I'd also like to see the same for the bootloader embedded into your staticx package. If you're installing from a wheel, then I don't need that.
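
For anyone who wants to gather the same kind of information by hand, here is a minimal Python sketch that reports the version banner of each external tool. It is not staticx's actual logging code: the helper name `tool_version` and the tool list are assumptions for illustration, limited to tools mentioned in this thread.

```python
import shutil
import subprocess
from typing import Optional


def tool_version(name: str) -> Optional[str]:
    """Return the first line of `<name> --version`, or None if unavailable."""
    path = shutil.which(name)
    if path is None:
        return None  # tool is not on PATH
    try:
        proc = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # some tools print the banner on stderr
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = proc.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    # Hypothetical tool list: the binaries discussed in this thread.
    for tool in ("patchelf", "objcopy", "readelf"):
        version = tool_version(tool)
        print(f"{tool}: {version if version is not None else 'not found'}")
```

Reporting only the first banner line keeps the output short enough to paste into an issue comment.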

@jay0lee
Author

jay0lee commented Oct 9, 2021 via email

@sansna

sansna commented Oct 14, 2021

Coming from #203: same issue here. After #200 was fixed by #204, there are no more segmentation faults.

@JonathonReinhart
Owner

CentOS 7 test 1

I built a CentOS image with the following Dockerfile:

FROM centos:7

# Enable EPEL
RUN yum install -y epel-release && rm -rf /var/cache/yum

# Install main packages
RUN yum install -y \
        patchelf \
        python3 \
        python3-pip \
        python3-wheel \
        which \
    && rm -rf /var/cache/yum

# Upgrade pip
RUN pip3 install --upgrade pip

# Install our dependencies
ADD requirements.txt /tmp/requirements.txt
RUN pip3 install -r /tmp/requirements.txt 

Then I ran the following commands inside a container from that resulting image:

# pip install staticx==0.13.3

# staticx $(which date) date.sx
# ./date.sx 
Segmentation fault

# sxpath=$(python3 -c 'import staticx; print(staticx.__path__[0])')

# ls -l date.sx $sxpath/assets/release/bootloader 
-rwxr-xr-x 1 root root 127680 Oct 14 05:10 /usr/local/lib/python3.6/site-packages/staticx/assets/release/bootloader
-rwx------ 1 root root 932520 Oct 14 05:11 date.sx


# readelf -hlSW $sxpath/assets/release/bootloader > readelf_hlSW_bootloader
# readelf -hlSW date.sx > readelf_hlSW_date_sx
# diff -u readelf_hlSW_bootloader readelf_hlSW_date_sx 

The resulting diff:

--- readelf_hlSW_bootloader	2021-10-14 05:16:25.484932981 +0000
+++ readelf_hlSW_date_sx	2021-10-14 05:16:43.277202489 +0000
@@ -10,14 +10,14 @@
   Version:                           0x1
   Entry point address:               0x40157e
   Start of program headers:          64 (bytes into file)
-  Start of section headers:          126016 (bytes into file)
+  Start of section headers:          930792 (bytes into file)
   Flags:                             0x0
   Size of this header:               64 (bytes)
   Size of program headers:           56 (bytes)
   Number of program headers:         7
   Size of section headers:           64 (bytes)
-  Number of section headers:         26
-  Section header string table index: 25
+  Number of section headers:         27
+  Section header string table index: 26
 
 Section Headers:
   [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
@@ -44,9 +44,10 @@
   [20] .debug_str        PROGBITS        0000000000000000 0196d0 0001c2 01  MS  0   0  1
   [21] .debug_loc        PROGBITS        0000000000000000 019892 0000e6 00      0   0  1
   [22] .debug_ranges     PROGBITS        0000000000000000 019980 0000a0 00      0   0 16
-  [23] .symtab           SYMTAB          0000000000000000 019a20 003a08 18     24 252  8
-  [24] .strtab           STRTAB          0000000000000000 01d428 001721 00      0   0  1
-  [25] .shstrtab         STRTAB          0000000000000000 01eb49 0000f2 00      0   0  1
+  [23] .staticx.archive  PROGBITS        0000000000000000 019a20 0c4780 00      0   0  1
+  [24] .symtab           SYMTAB          0000000000000000 0de1a0 003a20 18     25 253  8
+  [25] .strtab           STRTAB          0000000000000000 0e1bc0 001721 00      0   0  1
+  [26] .shstrtab         STRTAB          0000000000000000 0e32e1 000103 00      0   0  1
 Key to Flags:
   W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
   L (link order), O (extra OS processing required), G (group), T (TLS),
@@ -55,7 +56,7 @@
 
 Program Headers:
   Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
-  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0001c8 0x0001c8 R   0x1000
+  LOAD           0x000000 0x0000000000000000 0x0000000000400000 0x0001c8 0x0001c8 R   0x1000
   LOAD           0x001000 0x0000000000401000 0x0000000000401000 0x014d36 0x014d36 R E 0x1000
   LOAD           0x016000 0x0000000000416000 0x0000000000416000 0x002420 0x002420 R   0x1000
   LOAD           0x018f50 0x0000000000419f50 0x0000000000419f50 0x000378 0x001450 RW  0x1000

The changes are mostly as expected:

  • The section headers are still at the end of the file (which is now bigger)
  • The number of section headers went up
  • The .staticx.archive section was inserted

🤔 But the final change is surprising. The VirtAddr of the first LOAD program header changed from 0x400000 to 0x0.
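
The same anomaly can be spotted without diffing two readelf dumps. The following is a rough Python sketch, assuming a little-endian ELF64 file like the ones in this thread, that parses the program headers and flags any LOAD segment whose VirtAddr and PhysAddr disagree. The names `read_program_headers` and `suspicious_loads` are made up for this example, and "VirtAddr != PhysAddr" is only a heuristic that happens to match the corruption shown above, not a general validity rule.

```python
import struct
from typing import List, NamedTuple, Tuple

PT_LOAD = 1  # p_type value of a loadable segment


class ProgramHeader(NamedTuple):
    p_type: int
    p_flags: int
    p_offset: int
    p_vaddr: int
    p_paddr: int
    p_filesz: int
    p_memsz: int
    p_align: int


def read_program_headers(image: bytes) -> List[ProgramHeader]:
    """Parse the program header table of a little-endian ELF64 image."""
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    # e_phoff lives at 0x20; e_phentsize and e_phnum at 0x36 and 0x38.
    (e_phoff,) = struct.unpack_from("<Q", image, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 0x36)
    return [
        ProgramHeader(*struct.unpack_from("<IIQQQQQQ", image, e_phoff + i * e_phentsize))
        for i in range(e_phnum)
    ]


def suspicious_loads(image: bytes) -> List[Tuple[int, ProgramHeader]]:
    """Return (index, header) for LOAD segments where p_vaddr != p_paddr."""
    return [
        (i, ph)
        for i, ph in enumerate(read_program_headers(image))
        if ph.p_type == PT_LOAD and ph.p_vaddr != ph.p_paddr
    ]


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            for index, ph in suspicious_loads(f.read()):
                print(f"{path}: LOAD #{index} vaddr={ph.p_vaddr:#x} paddr={ph.p_paddr:#x}")
```

Run against `date.sx`, a check like this would be expected to report the first LOAD segment, since its VirtAddr is 0x0 while its PhysAddr is still 0x400000.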

@JonathonReinhart
Owner

That program header change looks very suspicious and doesn't seem to be correct, given the changes that patchelf needed to make.

I decided to try and "fix" the problem, by changing it back:

# cp date.sx date.sx.hack
# echo -en '\x00\x00\x40\x00' | dd of=date.sx.hack conv=notrunc bs=1 seek=$((0x50))

Verifying:

# diff <(readelf -l $sxpath/assets/release/bootloader) <(readelf -l date.sx.hack) >/dev/null && echo 'identical' || echo 'different'
identical

Testing:

# ./date.sx.hack 
Thu Oct 14 05:58:08 UTC 2021

It worked! 🎉

But now to figure out why it's getting corrupted...
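
For reference, the magic offset in the `dd` command comes from the ELF64 layout: readelf reports the program headers starting 64 bytes into the file, and `p_vaddr` sits 16 bytes into each entry (after `p_type`, `p_flags` and `p_offset`), so the first VirtAddr is at 64 + 16 = 0x50. Below is a small Python sketch of the same hack that computes the offset from the header instead of hard-coding it. It assumes a little-endian ELF64 file; the function name `restore_first_vaddr` and the default address are illustrative only, and unlike the `dd` one-liner it rewrites the full 8-byte field.

```python
import struct
import sys


def restore_first_vaddr(image: bytes, vaddr: int = 0x400000) -> bytes:
    """Return a copy of `image` with p_vaddr of the first program header set."""
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    # e_phoff (file offset of the program header table) is stored at 0x20.
    (e_phoff,) = struct.unpack_from("<Q", image, 0x20)
    patched = bytearray(image)
    # Elf64_Phdr: p_type (4) + p_flags (4) + p_offset (8), then p_vaddr (8).
    struct.pack_into("<Q", patched, e_phoff + 16, vaddr)
    return bytes(patched)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): fix_vaddr.py date.sx date.sx.hack
    src, dst = sys.argv[1], sys.argv[2]
    with open(src, "rb") as f:
        fixed = restore_first_vaddr(f.read())
    with open(dst, "wb") as f:
        f.write(fixed)
```

Like the `dd` version, this is a diagnostic hack to confirm the cause, not a fix for whatever corrupted the header.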

@JonathonReinhart
Owner

JonathonReinhart commented Oct 14, 2021

I was mistaken in blaming this on patchelf. I actually use objcopy for elf_add_section.

compare_proghdrs.sh

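#!/bin/bash
# Usage: compare_proghdrs.sh ELF_A ELF_B
# Diff the program headers (readelf -Wl) of two ELF files.
diff <(readelf -Wl "$1") <(readelf -Wl "$2") \
    && echo "Program headers are identical"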

CentOS 7 Test 2

# objcopy --version
GNU objcopy version 2.27-44.base.el7
# cp $sxpath/assets/release/bootloader .
# cp bootloader bootloader.test2
# dd if=/dev/urandom of=dummy_128k bs=128k count=1
# objcopy --add-section '.dummy=dummy_128k' bootloader.test2 
# ./bootloader.test2
Segmentation fault
# ./compare_proghdrs.sh bootloader bootloader.test2 
8c8
<   LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0001c8 0x0001c8 R   0x1000
---
>   LOAD           0x000000 0x0000000000000000 0x0000000000400000 0x0001c8 0x0001c8 R   0x1000

Conclusion: Using objcopy --add-section with a 128k file on CentOS 7 will mangle the program headers. 🐛
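
On systems where readelf is not available, the before/after comparison can be approximated in Python. This is a sketch only, assuming little-endian ELF64 inputs; `phdrs` and `diff_phdrs` are hypothetical helper names, and the output format is invented for the example rather than matching readelf.

```python
import struct
from typing import List, Tuple

# Field order of Elf64_Phdr, matching the "<IIQQQQQQ" struct format below.
FIELDS = ("p_type", "p_flags", "p_offset", "p_vaddr",
          "p_paddr", "p_filesz", "p_memsz", "p_align")


def phdrs(image: bytes) -> List[Tuple[int, ...]]:
    """Return the raw program header tuples of a little-endian ELF64 image."""
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    (e_phoff,) = struct.unpack_from("<Q", image, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 0x36)
    return [
        struct.unpack_from("<IIQQQQQQ", image, e_phoff + i * e_phentsize)
        for i in range(e_phnum)
    ]


def diff_phdrs(before: bytes, after: bytes) -> List[Tuple[int, str, int, int]]:
    """List (header index, field name, old value, new value) for each change."""
    changes = []
    for index, (old, new) in enumerate(zip(phdrs(before), phdrs(after))):
        for name, old_value, new_value in zip(FIELDS, old, new):
            if old_value != new_value:
                changes.append((index, name, old_value, new_value))
    return changes


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 3:
        with open(sys.argv[1], "rb") as a, open(sys.argv[2], "rb") as b:
            result = diff_phdrs(a.read(), b.read())
        for index, name, old_value, new_value in result:
            print(f"phdr[{index}].{name}: {old_value:#x} -> {new_value:#x}")
        if not result:
            print("Program headers are identical")
```

For the `bootloader` / `bootloader.test2` pair above, the expected report would be a single change to `p_vaddr` of the first header, from 0x400000 to 0x0. Note that `zip` silently ignores extra headers if the two files have different counts, which is acceptable here because objcopy left the count at 7.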

@JonathonReinhart
Owner

Test 3

Same as Test 2, but on Debian bullseye:

$ objcopy --version
GNU objcopy (GNU Binutils for Debian) 2.35.2
$ cp bootloader bootloader.test3
$ objcopy --add-section '.dummy=dummy_128k' bootloader.test3
$ ./bootloader.test3 
bootloader.test3: Failed to find .staticx.archive section      <- expected: the bootloader started instead of segfaulting, so this is success
$ ../compare_proghdrs.sh bootloader bootloader.test3 
Program headers are identical

@JonathonReinhart
Owner

Test 4

I tried to eliminate musl as a factor by building the bootloader normally (scons) on my bullseye host.

  • Running it under CentOS 7 works fine
  • Installing and using staticx under CentOS 7 with this bootloader works fine:
$ rm -rf build dist scons_build staticx/assets/*
$ python3 setup.py bdist_wheel
...
$ cd dist/
$ docker run --rm -it -v $(pwd):/dist -w /dist centos:7-python3
[root@f80262881b53 dist]# pip3 install staticx-*.whl 
...
[root@f80262881b53 dist]# staticx $(which date) date.sx
[root@f80262881b53 dist]# ./date.sx 
Thu Oct 14 07:27:15 UTC 2021

Conclusion: It's the combination of old objcopy and a musl-built bootloader.
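
Since the objcopy version turned out to matter, a build script could at least record it. The sketch below, with the hypothetical helper name `binutils_version`, pulls the major and minor numbers out of the first line of `objcopy --version`, using the two banners seen in this thread as examples. The thread does not establish which binutils release changed the behaviour, so no cut-off version is assumed here.

```python
import re
from typing import Optional, Tuple


def binutils_version(banner: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from an `objcopy --version` banner line."""
    match = re.search(r"(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    # Banner lines copied from the CentOS 7 and Debian bullseye tests above.
    print(binutils_version("GNU objcopy version 2.27-44.base.el7"))          # (2, 27)
    print(binutils_version("GNU objcopy (GNU Binutils for Debian) 2.35.2"))  # (2, 35)
```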

@JonathonReinhart
Owner

@jay0lee I'm going to close this issue in favor of #205, where the issue is now better-defined. Thanks for your help so far.
