-
Notifications
You must be signed in to change notification settings - Fork 143
selftests/bpf: Fix selftest verif_scale_strobemeta failure with llvm22 #9981
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: bpf-next_base
Are you sure you want to change the base?
Conversation
Upstream branch: 39e9d5f |
4038f13
to
5ffad15
Compare
Upstream branch: 39e9d5f |
2291034
to
c691a0a
Compare
5ffad15
to
bfffeb1
Compare
Upstream branch: 48a97ff |
c691a0a
to
ba011ab
Compare
bfffeb1
to
58474bc
Compare
Upstream branch: 27aab47 |
ba011ab
to
6c68974
Compare
58474bc
to
bd61720
Compare
Upstream branch: 48a97ff |
6c68974
to
6e09734
Compare
bd61720
to
ca16241
Compare
Upstream branch: 48a97ff |
6e09734
to
07e9d77
Compare
ca16241
to
2226789
Compare
Upstream branch: 48a97ff |
07e9d77
to
6d4994d
Compare
2226789
to
3763655
Compare
Upstream branch: 48a97ff |
With latest llvm22, I hit the verif_scale_strobemeta selftest failure below: $ ./test_progs -n 618 libbpf: prog 'on_event': BPF program load failed: -E2BIG libbpf: prog 'on_event': -- BEGIN PROG LOAD LOG -- BPF program is too large. Processed 1000001 insn verification time 7019091 usec stack depth 488 processed 1000001 insns (limit 1000000) max_states_per_insn 28 total_states 33927 peak_states 12813 mark_read 0 -- END PROG LOAD LOG -- libbpf: prog 'on_event': failed to load: -E2BIG libbpf: failed to load object 'strobemeta.bpf.o' scale_test:FAIL:expect_success unexpected error: -7 (errno 7) #618 verif_scale_strobemeta:FAIL But if I increase the verificaiton insn limit from 1M to 10M, the above test_progs run actually will succeed. The below is the result from veristat: $ ./veristat strobemeta.bpf.o Processing 'strobemeta.bpf.o'... File Program Verdict Duration (us) Insns States Program size Jited size ---------------- -------- ------- ------------- ------- ------ ------------ ---------- strobemeta.bpf.o on_event success 90250893 9777685 358230 15954 80794 ---------------- -------- ------- ------------- ------- ------ ------------ ---------- Done. Processed 1 files, 0 programs. Skipped 1 files, 0 programs. Further debugging shows the llvm commit [1] is responsible for the verificaiton failure as it tries to convert certain switch statement to if-condition. Such change may cause different transformation compared to original switch statement. In bpf program strobemeta.c case, the initial llvm ir for read_int_var() function is define internal void @read_int_var(ptr noundef %0, i64 noundef %1, ptr noundef %2, ptr noundef %3, ptr noundef %4) #2 !dbg !535 { %6 = alloca ptr, align 8 %7 = alloca i64, align 8 %8 = alloca ptr, align 8 %9 = alloca ptr, align 8 %10 = alloca ptr, align 8 %11 = alloca ptr, align 8 %12 = alloca i32, align 4 ... %20 = icmp ne ptr %19, null, !dbg !561 br i1 %20, label %22, label %21, !dbg !562 21: ; preds = %5 store i32 1, ptr %12, align 4 br label %48, !dbg !563 22: %23 = load ptr, ptr %9, align 8, !dbg !564 ... 47: ; preds = %38, %22 store i32 0, ptr %12, align 4, !dbg !588 br label %48, !dbg !588 48: ; preds = %47, %21 call void @llvm.lifetime.end.p0(ptr %11) #4, !dbg !588 %49 = load i32, ptr %12, align 4 switch i32 %49, label %51 [ i32 0, label %50 i32 1, label %50 ] 50: ; preds = %48, %48 ret void, !dbg !589 51: ; preds = %48 unreachable } Note that the above 'switch' statement is added by clang frontend. Without [1], the switch statement will survive until SelectionDag, so the switch statement acts like a 'barrier' and prevents some transformation involved with both 'before' and 'after' the switch statement. But with [1], the switch statement will be removed during middle end optimization and later middle end passes (esp. after inlining) have more freedom to reorder the code. The following is the related source code: static void *calc_location(struct strobe_value_loc *loc, void *tls_base): bpf_probe_read_user(&tls_ptr, sizeof(void *), dtv); /* if pointer has (void *)-1 value, then TLS wasn't initialized yet */ return tls_ptr && tls_ptr != (void *)-1 ? tls_ptr + tls_index.offset : NULL; In read_int_var() func, we have: void *location = calc_location(&cfg->int_locs[idx], tls_base); if (!location) return; bpf_probe_read_user(value, sizeof(struct strobe_value_generic), location); ... The static func calc_location() is called inside read_int_var(). The asm code without [1]: 77: .123....89 (85) call bpf_probe_read_user#112 78: ........89 (79) r1 = *(u64 *)(r10 -368) 79: .1......89 (79) r2 = *(u64 *)(r10 -8) 80: .12.....89 (bf) r3 = r2 81: .123....89 (0f) r3 += r1 82: ..23....89 (07) r2 += 1 83: ..23....89 (79) r4 = *(u64 *)(r10 -464) 84: ..234...89 (a5) if r2 < 0x2 goto pc+13 85: ...34...89 (15) if r3 == 0x0 goto pc+12 86: ...3....89 (bf) r1 = r10 87: .1.3....89 (07) r1 += -400 88: .1.3....89 (b4) w2 = 16 In this case, 'r2 < 0x2' and 'r3 == 0x0' go to null 'locaiton' place, so the verifier actually prefers to do verification first at 'r1 = r10' etc. The asm code with [1]: 119: .123....89 (85) call bpf_probe_read_user#112 120: ........89 (79) r1 = *(u64 *)(r10 -368) 121: .1......89 (79) r2 = *(u64 *)(r10 -8) 122: .12.....89 (bf) r3 = r2 123: .123....89 (0f) r3 += r1 124: ..23....89 (07) r2 += -1 125: ..23....89 (a5) if r2 < 0xfffffffe goto pc+6 126: ........89 (05) goto pc+17 ... 144: ........89 (b4) w1 = 0 145: .1......89 (6b) *(u16 *)(r8 +80) = r1 In this case, if 'r2 < 0xfffffffe' is true, the control will go to non-null 'location' branch, so 'goto pc+17' will actually go to null 'location' branch. This seems causing tremendous amount of verificaiton state. To fix the issue, rewrite the following code return tls_ptr && tls_ptr != (void *)-1 ? tls_ptr + tls_index.offset : NULL; to if/then statement and hopefully these explicit if/then statements are sticky during middle-end optimizations. Test with llvm20 and llvm21 as well and all strobemeta related selftests are passed. [1] llvm/llvm-project#161000 Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
6d4994d
to
49ddfe2
Compare
Pull request for series with
subject: selftests/bpf: Fix selftest verif_scale_strobemeta failure with llvm22
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1011081