ELF loading may not be necessary and increases complexity and vulnerabilities #23620
Comments
@Lichtso and I were discussing this. @alessandrod @dmakarov, what do you think? If you are in favor, can you two pick it up? |
ELF is also a container for debug information. I guess we could still generate both ELF for debugging purposes and somewhat raw binary for loading by the runtime loader. |
Good point, the ELF and SEF would have to align vaddr-wise in order for the debug info to be useful. Maybe we can optionally add the DWARF to the SEF? |
I worked on a system where we loaded a raw binary into a simulator or on a physical board, but used a separate ELF file to load into lldb. We had a debug server that communicated with the simulator or the board, and it worked quite well. The debug server knew how to map the symbols to addresses when it interacted with the board. I think it shouldn't be necessary to complicate our custom binary format with debug info, as long as we can still generate an elf file with the same code. |
Sure, same code and same memory layout |
We're working on a security.txt-like standard that currently uses a custom ELF section in https://github.com/neodyme-labs/solana-security-txt . It still works without that section because the actual data is stored in .ro, so this proposal wouldn't break anything, but it would be awesome if this new format allowed for custom key-value attributes like name, project link, security policy and so on. |
Wouldn't we be able to achieve all this by keeping ELF but switching to emitting statically linked object files instead of shared objects like we do now? And as we've been discussing, we could write a very simple ELF parser and stop using goblin. I think the reason for keeping ELF would be ecosystem tooling. Right now you can use objdump, readelf, ghidra, r2, cargo-bloat, elfcat, etc. to debug, profile, and optimize binaries (and I regularly use all of those). With a new format we'd break all those. Also, while starting from scratch is nice, I'm worried that we'd start simple, then add one extra thing at a time (DWARF, also see @CherryWorm's comment already) and slowly end up with a frankenstein-ELF under another name. |
Yeah, good points, I do like the tooling support for ELFs. The question is whether, with ELF, we can do the following for executables (debug versions would have string/symbol/DWARF):
|
I share these concerns. The tooling for working with BPF programs isn't good now, but with a custom or raw binary format it will be next to non-existent. By adding more and more information to a raw binary, we'll end up with a bad substitute for ELF. |
I am also fine with sticking with a minimal subset of ELF. |
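To make the "minimal subset of ELF" idea concrete, here is a rough sketch of the kind of up-front header check a simplified loader might perform. The accepted subset (ELF64, little-endian, `ET_EXEC`, `EM_BPF`) is an assumption for illustration, not something agreed on in this thread; `check_minimal_elf` and `HeaderCheck` are hypothetical names. The field offsets and constants follow the generic ELF header layout.

```rust
/// Outcome of checking a header against the (assumed) minimal subset.
#[derive(Debug, PartialEq)]
pub enum HeaderCheck {
    Accepted,
    Rejected(&'static str),
}

const ELF64_HEADER_SIZE: usize = 64;
const ELFCLASS64: u8 = 2; // e_ident[EI_CLASS]
const ELFDATA2LSB: u8 = 1; // e_ident[EI_DATA], little-endian
const ET_EXEC: u16 = 2; // statically linked executable
const EM_BPF: u16 = 247; // e_machine value for BPF

/// Hypothetical check: accept only statically linked, little-endian
/// ELF64 BPF executables and reject everything else early.
pub fn check_minimal_elf(bytes: &[u8]) -> HeaderCheck {
    if bytes.len() < ELF64_HEADER_SIZE {
        return HeaderCheck::Rejected("shorter than an ELF64 header");
    }
    if !bytes.starts_with(b"\x7fELF") {
        return HeaderCheck::Rejected("bad magic");
    }
    if bytes[4] != ELFCLASS64 {
        return HeaderCheck::Rejected("not ELF64");
    }
    if bytes[5] != ELFDATA2LSB {
        return HeaderCheck::Rejected("not little-endian");
    }
    // e_type and e_machine are the two u16 fields right after e_ident.
    let e_type = u16::from_le_bytes([bytes[16], bytes[17]]);
    let e_machine = u16::from_le_bytes([bytes[18], bytes[19]]);
    if e_type != ET_EXEC {
        return HeaderCheck::Rejected("not a statically linked executable");
    }
    if e_machine != EM_BPF {
        return HeaderCheck::Rejected("not a BPF binary");
    }
    HeaderCheck::Accepted
}
```

A shared object (`ET_DYN`), which is what the toolchain emits today according to the comments above, would be rejected by this particular sketch.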
I have been working on generating statically linked executables with our BPF toolchain today.
linker_script.ld
main.rs
#![no_main]
#![no_std]
#[panic_handler]
fn panic(_panic: &core::panic::PanicInfo) -> ! {
    loop {}
}
static STRING: &'static str = "Hello World";
static mut HEAP: [u8; 64] = [0u8; 64];
#[no_mangle]
pub extern fn _start() -> *const u8 {
    STRING.as_ptr()
}
#[no_mangle]
pub extern fn get_heap() -> *const u8 {
    unsafe { HEAP.as_ptr() }
}
What works (using only rustc)
SDK=solana/sdk/bpf/dependencies/bpf-tools
$SDK/rust/bin/rustc --target bpfel-unknown-unknown --emit=obj -O -o main.o main.rs
$SDK/llvm/bin/ld.lld --discard-all --nmagic --script=linker_script.ld --format=elf -o main.elf main.o
$SDK/llvm/bin/llvm-objdump -xd main.elf
$SDK/llvm/bin/llvm-readelf --all main.elf
That results in a nice
What does not work (using cargo)
RUSTFLAGS="--crate-type=staticlib" rustup run bpf cargo build --target bpfel-unknown-unknown --release
The resulting
@alessandrod I think you were tinkering with the linker script and build process as well, any ideas? |
Looks like you are building the deps/stdlib separately into one or more static libraries. Will that allow the linker to strip unused code in the final ELF? What does this look like from a developer's perspective, do they need to do anything special besides update their
I was thinking that we could build/link similarly to what we do now, except that we force PIC without ANY dependency on relocation/hash/string/symbol. For syscalls, the linker replaces the call -1 with a known syscall index that we either fix up at load time or translate at runtime. |
Yep I was thinking something like that, which would require changes to the LLVM target to avoid relocs. Then somehow link in our own custom syscalls.o file which would allow us to implement syscalls with indexes, and either change the rustc target to always go through LTO, or somehow teach lld to (effectively) delete dead code. Except for the relocs part (kernel relocations are special), this is pretty much how the upstream rust bpf target works btw, where the equivalent of syscalls.o is provided by aya. |
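The syscall-index idea from the two comments above can be sketched roughly as follows. This is only an illustration under stated assumptions: the table contents and their ordering are hypothetical (no index assignment has been proposed here), and `patch_call` and `syscall_index` are made-up names. It relies on the standard 8-byte BPF instruction layout, where byte 0 is the opcode (`0x85` for `call`) and bytes 4..8 hold the little-endian 32-bit immediate.

```rust
const INSN_SIZE: usize = 8; // every BPF instruction slot is 8 bytes
const OP_CALL: u8 = 0x85; // BPF_JMP | BPF_CALL

/// Hypothetical fixed syscall table; a syscall's index is its position.
const SYSCALL_TABLE: [&str; 3] = ["abort", "sol_panic_", "sol_log_"];

/// Look up the (assumed) index of a syscall by name.
pub fn syscall_index(name: &str) -> Option<u32> {
    SYSCALL_TABLE
        .iter()
        .position(|entry| *entry == name)
        .map(|i| i as u32)
}

/// Replace the `call -1` placeholder at instruction `insn` with the
/// index of the named syscall, so no relocation or symbol lookup is
/// needed afterwards.
pub fn patch_call(text: &mut [u8], insn: usize, name: &str) -> Result<(), &'static str> {
    let start = insn
        .checked_mul(INSN_SIZE)
        .ok_or("instruction index overflow")?;
    let end = start
        .checked_add(INSN_SIZE)
        .ok_or("instruction index overflow")?;
    let slot = text
        .get_mut(start..end)
        .ok_or("instruction out of bounds")?;
    if slot[0] != OP_CALL {
        return Err("not a call instruction");
    }
    let imm = i32::from_le_bytes([slot[4], slot[5], slot[6], slot[7]]);
    if imm != -1 {
        return Err("call is not an unresolved placeholder");
    }
    let index = syscall_index(name).ok_or("unknown syscall")?;
    slot[4..8].copy_from_slice(&index.to_le_bytes());
    Ok(())
}
```

Whether this patching happens in the linker, at load time, or is replaced by a runtime translation is exactly the open question in the discussion; the sketch only shows that the rewrite itself needs neither symbol tables nor hashing.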
This would require a bit of work in rustc, but I don't think it's necessarily the best approach. I'd leave the crate type as cdylib or bin and then we're pretty much allowed to output anything we want in that case. |
Agreed, ideally it would be So far I have only been experimenting with different ways to put the compilation and linking together. |
As part of this, removing the debug headers before deploying would be good. That should also get rid of issue #23354. |
Assuming all relocations for syscalls and cross-refs are getting removed, the Simplifying the entrypoint definition would allow removing symbol hashing logic entirely. Another way to define an entrypoint is using a new We'd be left with only
Pseudo code of the init section object:
.section .init$__00
.global _start
_start:
j entrypoint |
There is actually a special field in the ELF header for the entrypoint. https://refspecs.linuxfoundation.org/elf/gabi4+/ch4.eheader.html
|
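The header field referred to above is `e_entry`. In a 64-bit ELF header it is an 8-byte value at byte offset 24 (after the 16-byte `e_ident`, `e_type`, `e_machine` and `e_version`). A minimal sketch of reading it without a full ELF parser, assuming a little-endian ELF64 file; `read_entry` is a hypothetical helper name:

```rust
const E_ENTRY_OFFSET: usize = 24; // offset of e_entry in an ELF64 header
const ELFCLASS64: u8 = 2; // e_ident[EI_CLASS]
const ELFDATA2LSB: u8 = 1; // e_ident[EI_DATA], little-endian

/// Return the entrypoint virtual address stored in the ELF header,
/// or None if the bytes are not a little-endian ELF64 header.
pub fn read_entry(bytes: &[u8]) -> Option<u64> {
    if bytes.len() < E_ENTRY_OFFSET + 8 || !bytes.starts_with(b"\x7fELF") {
        return None;
    }
    if bytes[4] != ELFCLASS64 || bytes[5] != ELFDATA2LSB {
        return None;
    }
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&bytes[E_ENTRY_OFFSET..E_ENTRY_OFFSET + 8]);
    Some(u64::from_le_bytes(raw))
}
```

If the loader took the entrypoint from this field, it would not need to look up an entrypoint symbol by name at all.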
Thanks for pointing this out. I'll see if I have any luck with the |
The linker is setting the entry field correctly (or pretty much nothing would work) |
Problem
ELF loading may not be necessary and increases complexity and vulnerabilities
Solana programs are linked as ELF files, which are loaded by the runtime. ELF was chosen because it is a well-known, common executable format and because there were plans for runtime linking with other support libraries, etc.
ELF loading consumes compute time and expands the attack surface considerably. These factors could be significantly reduced by using a use-case focused and simplified format.
Proposed Solution
Solana has a fairly constrained and simple runtime environment, and the following points lend themselves to a simpler executable format
Because of this, a Solana executable could instead be as simple as:
Doing so would eliminate:
Changing to the new executable format could be done independently of SBFv2 as it is a change to the executable format and not the bytecode format. A suitable off-the-shelf format could be used but since the new format is so simple it should probably be a new format, maybe Solana Executable Format (SEF).