Consider using libbpf #93

Closed
saschagrunert opened this issue Oct 15, 2021 · 27 comments

Comments

@saschagrunert
Member

If we utilize libbpf, then we can produce a smaller binary which also runs faster and minimizes the runtime dependencies. The overall architecture of the hook could be simplified as well. I created a syscall recorder project for demonstration purposes: https://github.com/saschagrunert/syscall-recorder

Building the main application (the syscall-recorder) requires bpftool, clang, llvm, libbpf, libelf, libz and libseccomp (for converting the syscall IDs to names). Static linking is now also possible.

For my demo, and to keep things simple, I decided not to fork within the recorder and instead built a small wrapper around systemd-run: https://github.com/saschagrunert/syscall-recorder/blob/main/hack/oci-hook/hook.go. Right now the recorder is not able to produce a full seccomp profile; it only writes a list of syscalls to the target location.

What are your thoughts on that?

@vrothberg
Member

I don't understand the proposal and problem statement. Can you elaborate? Is it to rewrite oci-seccomp-bpf-hook in C with libbpf?

@saschagrunert
Member Author

> I don't understand the proposal and problem statement. Can you elaborate? Is it to rewrite oci-seccomp-bpf-hook in C with libbpf?

The main purpose of the project was to find the dependencies required to build and ship eBPF applications for debugging Kubernetes clusters. The hook was more of a side experiment, because it has a real use case and would also work using libbpf.

My proposal is not to rewrite it completely in C; we could probably split up the binaries or still use cgo. Using libbpf has benefits from my point of view, for example not relying on the kernel headers.

@weirdwiz
Collaborator

+1. Porting the code to libbpf will make it easier to ship, with a smaller memory footprint, and possibly allow us to "Compile Once - Run Everywhere".

@vrothberg
Member

Can you outline the exact benefits of using libbpf?

Porting/rewriting is costly and I want to make sure there are sufficient technical benefits.

Heads up: I am generally opposed to creating new projects in C. Build dependencies IMHO are not worth a rewrite of a project. But runtime dependencies may be worth it.

@weirdwiz
Collaborator

weirdwiz commented Oct 15, 2021

In my opinion, the main benefit is at runtime. We wouldn't have to compile every time we run the tool, which reduces the startup time. We also wouldn't have to rely on kernel headers being present on the target system, which is sometimes a pain to deal with.

There are Go libraries that bind to libbpf and can help make the job easier [1].

One con is that the target system needs to support BTF to help remove the kernel headers dependency.

[1] : https://github.com/aquasecurity/libbpfgo

@vrothberg
Member

Thanks! Using Go bindings sounds compelling. Avoiding recompilation as well.

I'm on board 👍 Thanks, @saschagrunert & @weirdwiz

Any volunteers?

@weirdwiz
Collaborator

I'd love to work on it

@vrothberg
Member

@saschagrunert are you cool with @weirdwiz taking a shot at it?

@rhatdan
Member

rhatdan commented Oct 15, 2021

> One con is that the target system needs to support BTF to help remove the kernel headers dependency.

What versions of RHEL support this?

@weirdwiz
Collaborator

weirdwiz commented Oct 18, 2021

> What versions of RHEL support this?

RHEL 8.2+

@saschagrunert
Member Author

> @saschagrunert are you cool with @weirdwiz taking a shot at it?

Sure, I'm happy to review and support if requested. 👍

@vrothberg
Member

Thanks, @saschagrunert! Happy hacking, @weirdwiz !

@rafaeldtinoco

rafaeldtinoco commented Oct 21, 2021

Take a look at https://github.com/aquasecurity/btfhub/ to make this work with CO-RE and old kernels (specifically the recent work being done at https://github.com/aquasecurity/btfhub/tree/main/tools, which will be upstreamed shortly). By doing something like that you're able to generate a binary that will run on any kernel (including old ones) without depending on LLVM or runtime compilation. We're pursuing that as well. Hope it helps.

TL;DR: supporting 550 kernels that don't provide BTF files costs about 1.5MB added to an eBPF-based application. Recent kernels already provide BTF, so you don't have to do anything extra there to have a CO-RE capable eBPF app.

@saschagrunert
Member Author

saschagrunert commented Oct 26, 2021

@rafaeldtinoco thank you for the input. I had a look at the btfgen tool and think it looks promising. I'm now wondering, what would a build pipeline look like?

For example:

  1. We build the bpf object locally, using the vmlinux.h generated by
    bpftool btf dump file /sys/kernel/btf/vmlinux format c
  2. We generate the smaller btf files once via btfgen using the btfhub and put them into our repository
  3. Use the btf_custom_path option from libbpf for the relocation (see NewModuleFromBufferArgs in libbpfgo)

This would mean that the custom btf files need to be part of the local file system during execution.

@rafaeldtinoco

> @rafaeldtinoco thank you for the input. I had a look at the btfgen tool and think it looks promising. I'm now wondering, what would a build pipeline look like?
>
> For example:
>
>   1. We build the bpf object locally, using the vmlinux.h generated by
>     bpftool btf dump file /sys/kernel/btf/vmlinux format c
>   2. We generate the smaller btf files once via btfgen using the btfhub and put them into our repository
>   3. Use the btf_custom_path option from libbpf for the relocation (see NewModuleFromBufferArgs in libbpfgo)
>
> This would mean that the custom btf files need to be part of the local file system during execution.

Our current thoughts are:

  1. Create a small REST API that uses BTFHUB. You would provide the kernel version and distro, and it would return the entire BTF file for that kernel (or for multiple kernels, depending on the range you provide). You could also provide your BPF object to the API, and it would generate smaller BTF files for the one or more kernel versions you specify.

     For this case, caching the downloaded files locally would be smart (tracee does that, for example).

  2. Use btfgen.sh as it is now and include all generated BTF files in your project. This requires that you rebuild all the smaller BTFs every time your .bpf.c source files change (adding or removing kernel types). Something like a GitHub action during release could take care of this (downloading BTFHUB, generating specific BTFs using your compiled object, and including the BTF files in the filesystem).

WDYT ?

@saschagrunert
Member Author

saschagrunert commented Oct 27, 2021

@rafaeldtinoco thank you for the input, I'm working on a syscall recorder PoC in kubernetes-sigs/security-profiles-operator#618.

I decided to go through the following build steps:

  1. Build the bpf.o once on the local build system
  2. Using the object as input to build the smaller btfs from btfhub once and commit them into the repo
  3. Using go generate to move the btfs into the binary
  4. Depending on the system where the operator runs, find the right incremental btf at runtime, write it to disk and load it via btf_custom_path. (There seems to be no way to load the btf from memory, aka []byte.)

Steps 2 and 3 are only necessary when the bpf code changes; we will verify that later on by running them in CI and comparing against the committed code.

The question is now: What if a kernel and architecture are not supported by the recorder? Should we leave btf_custom_path empty, which would mean we fall back to /sys/kernel/btf/vmlinux? Is this safe?

@rafaeldtinoco

> @rafaeldtinoco thank you for the input, I'm working on a syscall recorder PoC in kubernetes-sigs/security-profiles-operator#618.

Yep, that work brought me here IIRC.

> I decided to go through the following build steps:
>
>   1. Build the bpf.o once on the local build system
>   2. Using the object as input to build the smaller btfs from btfhub once and commit them into the repo
>   3. Using go generate to move the btfs into the binary
>   4. Depending on the system where the operator runs, find the right incremental btf at runtime, write it to disk and load it via btf_custom_path. (There seems to be no way to load the btf from memory, aka []byte.)
>
> Steps 2 and 3 are only necessary when the bpf code changes; we will verify that later on by running them in CI and comparing against the committed code.

This all makes sense to me, and goes in the same direction our tracee project is heading (together with libbpfgo's intent). If you ever think that libbpfgo can help in any way (for example by gluing BTFs together and adding them automatically through 'go generate', or something similar), we can discuss this on that project's discussions page.

> The question is now: What if a kernel and architecture are not supported by the recorder? Should we leave btf_custom_path empty, which would mean we fall back to /sys/kernel/btf/vmlinux? Is this safe?

The default should be to always use "/sys/kernel/btf/vmlinux". If it does not exist, then your code should identify the OS and kernel and pick the right BTF file for it.

Check how we do this in libbpfgo, here and here.

We then started this approach by making tracee download the BTF file for the environment it is running in. This will become the API I mentioned to you. HTTP REST > GIMME OS X KERNEL Y BTF FILE.

Now, let's suppose you're offline... you should try to use the BTF for the kernel version closest to the one you're running. Let's say you have a BTF file for 5.4.0-87 and you're running a 5.4.0-89 kernel that does not have a prepared BTF (embedded into your Go binary). Then you can try to load using 5.4.0-84 for example and see if it works. Of course, the best option would be to download the missing BTF from the API, but that won't always be possible (thus the idea of trying to use the latest one you have, which will very likely fit).
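One way to pick the "closest" prepared BTF when offline is sketched below. It assumes Ubuntu-style release strings of the form `<version>-<abi>[-<flavour>]` and only considers candidates with the same base version; both `splitRelease` and `closestBTF` are hypothetical helpers, and whether a neighbouring BTF actually loads still has to be tried at runtime.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitRelease splits a release such as "5.4.0-89-generic" into its base
// version ("5.4.0") and numeric ABI part (89). This format is an assumption.
func splitRelease(release string) (base string, abi int, ok bool) {
	parts := strings.SplitN(release, "-", 3)
	if len(parts) < 2 {
		return "", 0, false
	}
	n, err := strconv.Atoi(parts[1])
	if err != nil {
		return "", 0, false
	}
	return parts[0], n, true
}

// closestBTF returns the available release with the same base version whose
// ABI number is nearest to the running kernel's. On a tie, the candidate that
// appears first in the slice wins.
func closestBTF(running string, available []string) (string, bool) {
	base, abi, ok := splitRelease(running)
	if !ok {
		return "", false
	}
	best, bestDist := "", -1
	for _, cand := range available {
		cBase, cABI, ok := splitRelease(cand)
		if !ok || cBase != base {
			continue
		}
		dist := cABI - abi
		if dist < 0 {
			dist = -dist
		}
		if bestDist < 0 || dist < bestDist {
			best, bestDist = cand, dist
		}
	}
	return best, bestDist >= 0
}

func main() {
	available := []string{"5.4.0-84-generic", "5.4.0-87-generic", "5.8.0-50-generic"}
	if rel, ok := closestBTF("5.4.0-89-generic", available); ok {
		fmt.Println("trying BTF for", rel)
	} else {
		fmt.Println("no candidate BTF found")
	}
}
```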

@weirdwiz
Collaborator

weirdwiz commented Oct 28, 2021

I'm having a little trouble figuring out how we would set the $PARENT_PID from userspace. If we were using C, we could change the value in the bss section through the structure generated in the skeleton. But for libbpfgo, I couldn't find a way to do that.

One way could be adding a uprobe to get the PID from userspace, but it would be hard to figure out which PID belongs to which container if a lot of binaries are run at the same time.

@saschagrunert
Member Author

> I'm having a little trouble figuring out how we would set the $PARENT_PID from userspace. If we were using C, we could change the value in the bss section through the structure generated in the skeleton. But for libbpfgo, I couldn't find a way to do that.

We could use a map for now, but I think setting the rodata should be a feature of libbpfgo.

Ref aquasecurity/libbpfgo#2, aquasecurity/libbpfgo#27

@vrothberg
Member

Any updates, @weirdwiz? I want to make sure that @saschagrunert's request is not falling off our radar.

@vrothberg
Member

OK, let's unblock the issue. @weirdwiz is busy with his internship at Red Hat's storage team. If others want to give it a shot, feel free to self-assign or drop a comment.

@saschagrunert
Member Author

In the meantime we released a first integration within the security-profiles-operator: https://github.com/kubernetes-sigs/security-profiles-operator/tree/main/internal/pkg/daemon/bpfrecorder

Packaging a generic libbpf-based application seems to be the trickiest part here. On the other hand, we probably do not have to support a custom BTF if we focus on Fedora/RHEL packaging in the first place.

@vrothberg I can put it in our Node Observability backlog if you don't mind, because we plan to work on eBPF applications in the mid-term in any case.

@vrothberg
Member

SGTM, thanks!

@vrothberg
Member

@saschagrunert did you find time to look into it?

@saschagrunert
Member Author

@vrothberg unfortunately not directly, because I think we should clarify how to package the application before moving forward. I'm working with other teams on solving that issue right now, but I think it will take some time (months).

By the way, not all libbpf features seem to be supported on all kernels. For example, ring buffer maps are not supported on Linux < 5.8. I'm not sure whether that is a problem we would run into.
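A simple guard for the ring buffer case might compare the running kernel's release against 5.8, as sketched below. `kernelAtLeast` is a hypothetical helper, and a version check is only a heuristic: distribution kernels can backport features, so probing for the feature directly is likely to be more reliable.

```go
package main

import "fmt"

// kernelAtLeast (hypothetical helper) reports whether a kernel release string
// such as "5.4.0-87-generic" is at least major.minor. Only the leading
// "<major>.<minor>" part of the string is inspected.
func kernelAtLeast(release string, major, minor int) (bool, error) {
	var gotMajor, gotMinor int
	if _, err := fmt.Sscanf(release, "%d.%d", &gotMajor, &gotMinor); err != nil {
		return false, fmt.Errorf("cannot parse kernel release %q: %w", release, err)
	}
	return gotMajor > major || (gotMajor == major && gotMinor >= minor), nil
}

func main() {
	// Example release strings; a real program would read the running
	// kernel's release, for instance from /proc/sys/kernel/osrelease.
	for _, rel := range []string{"5.4.0-87-generic", "5.8.0-50-generic"} {
		ok, err := kernelAtLeast(rel, 5, 8)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: ring buffer maps expected to be available: %v\n", rel, ok)
	}
}
```

On older kernels, a fallback to a different map type would be needed; which one fits depends on the recorder's design.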

@vrothberg
Member

@saschagrunert what's your current take on the issue? Shall we leave it open or close it?

@saschagrunert
Member Author

Let's close it for now; it does not have enough priority for me to work on it in the near future.
