enabling eBPF Recorder on AKS crashes SPOD containers #795
Comments
I can reproduce it and we probably should update libbpf and the vendored btf to see if that fixes the issue. |
Did a test with #796 and it does not work, because:
That's odd. I'm not sure whether the kernel configuration of the Azure nodes is correct to support our BPF application. |
Great, I'm grateful for the time you invested in that.
LMK if more checks / changes are needed from my end.
Tomer
|
@tshaiman can you share the configuration flags the kernel was built with? Ubuntu 18.04 does not expose |
When trying the libbpf-bootstrap demo application (https://github.com/libbpf/libbpf-bootstrap/blob/master/examples/c/bootstrap.c), I get the same error on an Azure node (I disabled the failure on
|
@saschagrunert : I don't have insights on how the kernel was built as I'm not part of the AKS team. |
Maybe we can open an issue in their tracker to describe the problem there? |
@saschagrunert : done |
Still the same with the latest Azure deployment. |
Correct, the bug tracked in Azure/AKS#2768 is still open. I sent a ping/reminder on the ticket. |
Still pending on AKS; I have reminded them many times. |
Can reproduce this issue on GKE cos nodes too. Error logs:
|
Not related to GKE, but maybe BTF Hub can help with the AKS case? |
AKS already exposes |
I have the same issue with SPO deployed in my local cluster. Is this considered a kernel problem? |
AKS doesn't do anything special for our kernels. They are based on Azure marketplace Ubuntu 18.04 images. Have you tried to reproduce this on a vanilla Azure (non-AKS) VM? FWIW, I tried this once like a year or so ago and hit similar issues I wasn't able to resolve. I ended up not using BTF sadly. My suspicion is it could be something to do with 18.04 or how they backported kernel fixes and a version like 20.04 originally based on 5.x+ could work out of the box (i.e., something is wonky between 4.15 + 18.04 and 5.4 + 18.04 because 4.15 didn't support BTF, but later kernels did). But TBH, I am not an expert in BTF, and I don't think we are doing anything special here, so I'm not sure where to investigate. |
@saschagrunert in case it's helpful, attached the kconfig from a running AKS node. Notably I do see here's a snippet of the config grepping for bpf/btf flags to save you some time (admittedly 1067 vs 1074 but they are basically the same)
|
@alexeldeib: thanks a lot for helping get those configs. My 2 cents: here is the latest kernel config, 5.4.0.1074, from my running AKS node. |
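For readers who want to repeat the "grep for bpf/btf flags" step on their own node, here is a minimal sketch in Python. The function name `bpf_related_options` and the `SAMPLE` text are hypothetical and only for illustration; the sample values are not the AKS node's actual configuration, which is not reproduced in this thread.

```python
import re
from typing import Dict


def bpf_related_options(kconfig_text: str) -> Dict[str, str]:
    """Return the BPF/BTF-related options found in a kernel config dump.

    Lines of the form '# CONFIG_FOO is not set' are reported with value 'n'.
    """
    options: Dict[str, str] = {}
    for raw_line in kconfig_text.splitlines():
        line = raw_line.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            name, value = unset.group(1), "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
        else:
            continue  # blank lines and ordinary comments
        if re.search(r"BPF|BTF", name):
            options[name] = value
    return options


# Hypothetical input for illustration only, NOT taken from an AKS node.
SAMPLE = """\
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
# CONFIG_DEBUG_INFO_BTF is not set
CONFIG_EXT4_FS=y
"""

print(bpf_related_options(SAMPLE))
```

On a real node the input would typically come from a file such as /boot/config-<release>, assuming the distribution ships the config there.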
Ah, I suspect you need kernel 5.8 (not available on Ubuntu 18.04 or AKS yet).
https://github.com/iovisor/bcc/blob/master/docs/kernel-versions.md Ring buffer maps are only available from kernel 5.8 onward. See also libbpf/libbpf-bootstrap#42 |
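The version requirement above can be checked before enabling the recorder. The following is a rough sketch, assuming the only criterion is the 5.8 minimum mentioned in this thread; the helper names `parse_kernel_release` and `supports_ring_buffer` are made up for illustration and are not part of the operator.

```python
import re
from typing import Tuple

# Assumption taken from the thread: BPF ring buffer maps need kernel 5.8+.
MIN_KERNEL = (5, 8)


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string like '5.4.0-1067-azure'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def supports_ring_buffer(release: str) -> bool:
    """True if the release is at least MIN_KERNEL (tuple comparison)."""
    return parse_kernel_release(release) >= MIN_KERNEL


# Release strings reported in this thread.
print(supports_ring_buffer("5.4.0-1067-azure"))  # False
print(supports_ring_buffer("5.13.0-41"))         # True
```

Comparing (major, minor) tuples avoids the string-comparison trap where "5.13" would sort before "5.8". To check the local machine, the output of `platform.release()` could be passed in instead of a literal.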
That indeed seems to be the root cause, well done @alexeldeib! |
@tshaiman maybe, I'll see what we can do here. |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
@saschagrunert I'm facing the same issue on a k3s cluster
Environment:
uname -a:
|
@B3ns44d I think we require kernel 5.8 for that to work :-/ |
@saschagrunert ohhh didn't know that, it now functions properly after upgrading to 5.13.0-41. |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
This seems to have been answered with the kernel version comment. Closing. |
Following @saschagrunert's excellent tutorial here, I ran:
kubectl patch spod spod --type=merge -p '{"spec":{"enableBpfRecorder":true}}'
which eventually led to the following output on the bpf-recorder container:
Kernel version (uname -a): 5.4.0-1067-azure
kubectl get nodes -o wide:
❯ k get nodes -o wide