-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpf,test: Fix verifier issues in IPv6 BPF tests when running locally #25191
bpf,test: Fix verifier issues in IPv6 BPF tests when running locally #25191
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was able to fix this by replacing
ctx_data_valid
with a normal check
and adding a bounds check toipv6_with_hop_auth_tcp_pktgen
.
It still failed the verifier in the CI...
bounds checks on one copy somehow do not propagate to the other copies.
That would be a kernel bug. Do you plan to do some research on that?
Well, shit. More fun, will take a look.
It is even more interesting.
So this could indeed be something that we could address in the verifier, if this has not been fixed already. |
When running the `TestBPF/ipv6_test.o` test locally with clang v13 and kernel v5.17 it resulted in the following verifier error: ``` 171: (b7) r2 = 1 ; R2_w=inv1 ; authhdr->seq = 1; 172: (63) *(u32 *)(r1 +8) = r2 invalid access to packet, off=8 size=4, R1(id=6,off=8,r=2) R1 offset is outside of the packet ``` Tracked this down to `ipv6_with_hop_auth_tcp_pktgen`. It gets a pointer from `pktgen__append_ipv6_extension_header`. All expected bounds checks were in place. It even has a function called `ctx_data_valid` with some inline assembly which is supposed to perform additional bounds checks. Was able to fix this by replacing `ctx_data_valid` with a normal check and adding a bounds check to `ipv6_with_hop_auth_tcp_pktgen`. This bounds check seems to be necessary because bounds checks do not seem to apply to copies. The verifier log reveals the following: ``` 85: (69) r2 = *(u16 *)(r10 -40) ; R2_w=invP14 R10=fp0 ; builder->layer_offsets[layer_idx] = builder->cur_off; 86: (6b) *(u16 *)(r4 +10) = r2 ; R2_w=invP14 R4_w=fp-44 fp-40=mmmmmmmm ; layer = ctx_data(ctx) + builder->cur_off; 87: (0f) r1 += r7 ; R1_w=pkt(id=0,off=14,r=54,imm=0) R7=invP14 ; builder->cur_off += hdrsize; 88: (69) r2 = *(u16 *)(r10 -40) ; R2_w=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff)) R10=fp0 ``` At instruction 85 we know the offset is `14` exactly. It is then written back to the stack. And at 88 we have lost bounds data. Since this is an offset we use to construct the pointers, this uncertainty becomes part of all pointers causing the need for additional checks. To mitigate the result of the above apparent verifier bug/limitation, I have also changes our offset value from 16 bit to 64 bit which tracks better. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
0ba208a
to
686b2c5
Compare
This issue looks like it's already in my TODO list. If the spill was 64-bit, the subsequent 32-bit or 16-bit fill will lose boundaries: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/verifier.c?id=6715df8d5d24655b9fd368e904028112b54c7de1#n3835 |
I fixed CI by using |
/test Job 'Cilium-PR-K8s-1.25-kernel-4.19' failed: Click to show.Test Name
Failure Output
Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.25-kernel-4.19/1909/ If it is a flake and a GitHub issue doesn't already exist to track it, comment Then please upload the Jenkins artifacts to that issue. |
We appear to be hitting flake #24961 |
/test-1.25-4.19 |
Marking for backports to v1.12 and v1.13 as otherwise #25699 fails there with the above verifier error. EDIT: Removing v1.12 as the BPF unit tests are too different there to be able to easily backport. |
When running the
TestBPF/ipv6_test.o
test locally with clang v13 and kernel v5.17 it resulted in the following verifier error:Tracked this down to
ipv6_with_hop_auth_tcp_pktgen
. It gets a pointerfrom
pktgen__append_ipv6_extension_header
. All expected bounds checkswere in place. It even has a function called
ctx_data_valid
with someinline assembly which is supposed to perform additional bounds checks.
Was able to fix this by replacing
ctx_data_valid
with a normal checkand adding a bounds check to
ipv6_with_hop_auth_tcp_pktgen
. Thisbounds check seems to be necessary because bounds checks do not seem
to apply to copies.
The verifier log reveals the following:
At instruction 85 we know the offset is
14
exactly. It is then writtenback to the stack. And at 88 we have lost bounds data. Since this is
an offset we use to construct the pointers, this uncertainty becomes
part of all pointers causing the need for additional checks.
To mitigate the result of the above apparent verifier bug/limitation,
I have also changes our offset value from 16 bit to 64 bit which tracks
better.