s390/bpf: Fix multiple tail calls #31
|
Master branch: 8081ede patch https://patchwork.ozlabs.org/project/netdev/patch/20200909232141.3099367-1-iii@linux.ibm.com/ applied successfully |
|
Master branch: 2f7de98 patch https://patchwork.ozlabs.org/project/netdev/patch/20200909232141.3099367-1-iii@linux.ibm.com/ applied successfully |
de95b8d to 11481c8
|
Master branch: e3b9626 patch https://patchwork.ozlabs.org/project/netdev/patch/20200909232141.3099367-1-iii@linux.ibm.com/ applied successfully |
11481c8 to 59609ed
|
Master branch: d66423f patch https://patchwork.ozlabs.org/project/netdev/patch/20200909232141.3099367-1-iii@linux.ibm.com/ applied successfully |
59609ed to b83943f
|
Master branch: 90a1ded patch https://patchwork.ozlabs.org/project/netdev/patch/20200909232141.3099367-1-iii@linux.ibm.com/ applied successfully |
b83943f to 36ed747
|
Master branch: 18841da patch https://patchwork.ozlabs.org/project/netdev/patch/20200909232141.3099367-1-iii@linux.ibm.com/ applied successfully |
36ed747 to db9224f
|
Master branch: 2bab48c patch https://patchwork.ozlabs.org/project/netdev/patch/20200909232141.3099367-1-iii@linux.ibm.com/ applied successfully |
db9224f to da00798
exceeding tail call count or missing tail call target), JIT uses label[0] field, which contains the address of the instruction following the tail call. When there are multiple tail calls, label[0] value comes from handling of a previous tail call, which is incorrect.

Fix by getting rid of label array and resolving the label address locally: for all 3 branches that jump to it, emit 0 offsets at the beginning, and then backpatch them with the correct value.

Also, do not use the long jump infrastructure: the tail call sequence is known to be short, so make all 3 jumps short.

Fixes: 6651ee0 ("s390/bpf: implement bpf_tail_call() helper")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 arch/s390/net/bpf_jit_comp.c | 61 ++++++++++++++++--------------------
 1 file changed, 27 insertions(+), 34 deletions(-)
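The backpatching approach described above can be illustrated with a small userspace sketch. It is not the s390 JIT code: the `struct emitter` type, the 4-byte jump layout, the opcode values and all function names are hypothetical, chosen only to show how three jumps are emitted with zero offsets and then patched once the end of the sequence is known.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical code buffer; a real JIT would size and bounds-check this. */
struct emitter {
    uint8_t buf[64];
    size_t len;
};

/* Emit a made-up 4-byte short jump: 2 opcode bytes followed by a 16-bit
 * big-endian offset that is left as 0 for now. Returns the jump position. */
static size_t emit_jump_placeholder(struct emitter *e, uint16_t opcode)
{
    size_t pos = e->len;

    e->buf[e->len++] = (uint8_t)(opcode >> 8);
    e->buf[e->len++] = (uint8_t)(opcode & 0xff);
    e->buf[e->len++] = 0;
    e->buf[e->len++] = 0;
    return pos;
}

/* Emit n filler bytes standing in for the non-jump instructions. */
static void emit_filler(struct emitter *e, size_t n)
{
    while (n--)
        e->buf[e->len++] = 0x07;
}

/* Backpatch: store the distance from the jump to its target. */
static void patch_jump(struct emitter *e, size_t jump_pos, size_t target)
{
    uint16_t off = (uint16_t)(target - jump_pos);

    e->buf[jump_pos + 2] = (uint8_t)(off >> 8);
    e->buf[jump_pos + 3] = (uint8_t)(off & 0xff);
}

/* Read back the offset stored in a jump. */
static int16_t read_jump_offset(const struct emitter *e, size_t jump_pos)
{
    uint16_t off = (uint16_t)((e->buf[jump_pos + 2] << 8) |
                              e->buf[jump_pos + 3]);
    return (int16_t)off;
}

/* Emit one sequence with three early exits. The target is resolved locally,
 * so nothing is carried over from a previously emitted sequence. */
static size_t emit_sequence(struct emitter *e, size_t jumps[3])
{
    size_t end;
    int i;

    jumps[0] = emit_jump_placeholder(e, 0xa701);
    emit_filler(e, 2);
    jumps[1] = emit_jump_placeholder(e, 0xa702);
    emit_filler(e, 4);
    jumps[2] = emit_jump_placeholder(e, 0xa703);
    emit_filler(e, 6);

    end = e->len; /* address of the instruction following the sequence */
    for (i = 0; i < 3; i++)
        patch_jump(e, jumps[i], end);
    return end;
}
```

Calling `emit_sequence()` twice on the same emitter yields jumps that each land just past their own sequence, which is the property the shared label lost when more than one tail call was present.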
|
Master branch: 2bab48c patch https://patchwork.ozlabs.org/project/netdev/patch/20200909232141.3099367-1-iii@linux.ibm.com/ applied successfully |
da00798 to
a3013de
Compare
|
At least one diff in series https://patchwork.ozlabs.org/project/netdev/list/?series=200680 irrelevant now. Closing PR. |
Fix BPF_CORE_READ_BITFIELD() macro used for reading CO-RE-relocatable bitfields. Missing breaks in a switch caused 8-byte reads always. This can confuse libbpf because it does strict checks that memory load size corresponds to the original size of the field, which in this case quite often would be wrong.

After fixing that, we run into another problem, which is quite subtle, so worth documenting here. The issue is in Clang optimization and CO-RE relocation interactions. Without the asm volatile construct (also known as barrier_var()), Clang will re-order BYTE_OFFSET and BYTE_SIZE relocations and will apply BYTE_OFFSET 4 times, once for each switch case arm. This will result in the same error from libbpf about mismatch of memory load size and original field size. I.e., if we were reading u32, we'd still have *(u8 *), *(u16 *), *(u32 *), and *(u64 *) memory loads, three of which will fail. Using barrier_var() forces Clang to apply BYTE_OFFSET relocation first (and once) to calculate p, after which the value of p is used without relocation in each of the switch case arms, doing an appropriately-sized memory load.

Here's the list of relevant relocations and pieces of generated BPF code before and after this patch for the test_core_reloc_bitfields_direct selftest.
BEFORE
=====
  #45: core_reloc: insn #160 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32
  #46: core_reloc: insn #167 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
  #47: core_reloc: insn #174 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
  #48: core_reloc: insn #178 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
  #49: core_reloc: insn #182 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32

     157:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll
     159:       7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1
     160:       b7 02 00 00 04 00 00 00 r2 = 4
               ; BYTE_SIZE relocation here ^^^
     161:       66 02 07 00 03 00 00 00 if w2 s> 3 goto +7 <LBB0_63>
     162:       16 02 0d 00 01 00 00 00 if w2 == 1 goto +13 <LBB0_65>
     163:       16 02 01 00 02 00 00 00 if w2 == 2 goto +1 <LBB0_66>
     164:       05 00 12 00 00 00 00 00 goto +18 <LBB0_69>

0000000000000528 <LBB0_66>:
     165:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     167:       69 11 08 00 00 00 00 00 r1 = *(u16 *)(r1 + 8)
       ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^
     168:       05 00 0e 00 00 00 00 00 goto +14 <LBB0_69>

0000000000000548 <LBB0_63>:
     169:       16 02 0a 00 04 00 00 00 if w2 == 4 goto +10 <LBB0_67>
     170:       16 02 01 00 08 00 00 00 if w2 == 8 goto +1 <LBB0_68>
     171:       05 00 0b 00 00 00 00 00 goto +11 <LBB0_69>

0000000000000560 <LBB0_68>:
     172:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     174:       79 11 08 00 00 00 00 00 r1 = *(u64 *)(r1 + 8)
       ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^
     175:       05 00 07 00 00 00 00 00 goto +7 <LBB0_69>

0000000000000580 <LBB0_65>:
     176:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     178:       71 11 08 00 00 00 00 00 r1 = *(u8 *)(r1 + 8)
       ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^
     179:       05 00 03 00 00 00 00 00 goto +3 <LBB0_69>

00000000000005a0 <LBB0_67>:
     180:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     182:       61 11 08 00 00 00 00 00 r1 = *(u32 *)(r1 + 8)
       ; BYTE_OFFSET relo here w/ RIGHT size ^^^^^^^^^^^^^^^^

00000000000005b8 <LBB0_69>:
     183:       67 01 00 00 20 00 00 00 r1 <<= 32
     184:       b7 02 00 00 00 00 00 00 r2 = 0
     185:       16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71>
     186:       c7 01 00 00 20 00 00 00 r1 s>>= 32
     187:       05 00 01 00 00 00 00 00 goto +1 <LBB0_72>

00000000000005e0 <LBB0_71>:
     188:       77 01 00 00 20 00 00 00 r1 >>= 32

AFTER
=====
  #30: core_reloc: insn #132 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
  #31: core_reloc: insn #134 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32

     129:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll
     131:       7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1
     132:       b7 01 00 00 08 00 00 00 r1 = 8
                   ; BYTE_OFFSET relo here ^^^
                   ; no size check for non-memory dereferencing instructions
     133:       0f 12 00 00 00 00 00 00 r2 += r1
     134:       b7 03 00 00 04 00 00 00 r3 = 4
               ; BYTE_SIZE relocation here ^^^
     135:       66 03 05 00 03 00 00 00 if w3 s> 3 goto +5 <LBB0_63>
     136:       16 03 09 00 01 00 00 00 if w3 == 1 goto +9 <LBB0_65>
     137:       16 03 01 00 02 00 00 00 if w3 == 2 goto +1 <LBB0_66>
     138:       05 00 0a 00 00 00 00 00 goto +10 <LBB0_69>

0000000000000458 <LBB0_66>:
     139:       69 21 00 00 00 00 00 00 r1 = *(u16 *)(r2 + 0)
                  ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^
     140:       05 00 08 00 00 00 00 00 goto +8 <LBB0_69>

0000000000000468 <LBB0_63>:
     141:       16 03 06 00 04 00 00 00 if w3 == 4 goto +6 <LBB0_67>
     142:       16 03 01 00 08 00 00 00 if w3 == 8 goto +1 <LBB0_68>
     143:       05 00 05 00 00 00 00 00 goto +5 <LBB0_69>

0000000000000480 <LBB0_68>:
     144:       79 21 00 00 00 00 00 00 r1 = *(u64 *)(r2 + 0)
                  ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^
     145:       05 00 03 00 00 00 00 00 goto +3 <LBB0_69>

0000000000000490 <LBB0_65>:
     146:       71 21 00 00 00 00 00 00 r1 = *(u8 *)(r2 + 0)
                  ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^
     147:       05 00 01 00 00 00 00 00 goto +1 <LBB0_69>

00000000000004a0 <LBB0_67>:
     148:       61 21 00 00 00 00 00 00 r1 = *(u32 *)(r2 + 0)
                  ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^

00000000000004a8 <LBB0_69>:
     149:       67 01 00 00 20 00 00 00 r1 <<= 32
     150:       b7 02 00 00 00 00 00 00 r2 = 0
     151:       16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71>
     152:       c7 01 00 00 20 00 00 00 r1 s>>= 32
     153:       05 00 01 00 00 00 00 00 goto +1 <LBB0_72>

00000000000004d0 <LBB0_71>:
     154:       77 01 00 00 20 00 00 00 r1 >>= 32

Acked-by: Lorenz Bauer <lmb@cloudflare.com>
Fixes: ee26dad ("libbpf: Add support for relocatable bitfields")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
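The missing-break half of the fix described in this commit message can be sketched in plain userspace C. This is not the BPF_CORE_READ_BITFIELD() macro itself: `read_field_buggy()` and `read_field_fixed()` are hypothetical names, and the `load_size` out-parameter exists only to make the width of the last load observable. The Clang relocation re-ordering that barrier_var() addresses is not reproduced here.

```c
#include <stdint.h>
#include <string.h>

/* Buggy variant: without break statements every case falls through, so the
 * final 8-byte load always wins. The caller must provide at least 8 readable
 * bytes, otherwise this over-reads. */
static uint64_t read_field_buggy(const void *p, unsigned int byte_size,
                                 unsigned int *load_size)
{
    uint64_t val = 0;
    uint8_t v8;
    uint16_t v16;
    uint32_t v32;

    *load_size = 0;
    switch (byte_size) {
    case 1: memcpy(&v8, p, 1); val = v8; *load_size = 1;
        /* fall through */
    case 2: memcpy(&v16, p, 2); val = v16; *load_size = 2;
        /* fall through */
    case 4: memcpy(&v32, p, 4); val = v32; *load_size = 4;
        /* fall through */
    case 8: memcpy(&val, p, 8); *load_size = 8;
    }
    return val;
}

/* Fixed variant: each arm performs exactly one load of the requested size. */
static uint64_t read_field_fixed(const void *p, unsigned int byte_size,
                                 unsigned int *load_size)
{
    uint64_t val = 0;
    uint8_t v8;
    uint16_t v16;
    uint32_t v32;

    *load_size = 0;
    switch (byte_size) {
    case 1: memcpy(&v8, p, 1); val = v8; *load_size = 1; break;
    case 2: memcpy(&v16, p, 2); val = v16; *load_size = 2; break;
    case 4: memcpy(&v32, p, 4); val = v32; *load_size = 4; break;
    case 8: memcpy(&val, p, 8); *load_size = 8; break;
    }
    return val;
}
```

For a 4-byte field, the buggy variant reports a final load size of 8 while the fixed one reports 4, which corresponds to the size mismatch that libbpf's strict check rejects.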
Since commit 1c123c5 ("bpf: Resolve fext program type when checking map compatibility"), an freplace prog can be used as a tail-callee. However, when an freplace prog has been attached and is then used to update a PROG_ARRAY map, the kernel panics: the update checks the prog type of the freplace prog through 'prog->aux->dst_prog->type', and 'prog->aux->dst_prog' of an attached freplace prog is NULL.

[309049.036402] BUG: kernel NULL pointer dereference, address: 0000000000000004
[309049.036419] #PF: supervisor read access in kernel mode
[309049.036426] #PF: error_code(0x0000) - not-present page
[309049.036432] PGD 0 P4D 0
[309049.036437] Oops: 0000 [#1] PREEMPT SMP NOPTI
[309049.036444] CPU: 2 PID: 788148 Comm: test_progs Not tainted 6.8.0-31-generic #31-Ubuntu
[309049.036465] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023
[309049.036477] RIP: 0010:bpf_prog_map_compatible+0x2a/0x140
[309049.036488] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41 55 41 54 53 44 8b 6e 04 48 89 f3 41 83 fd 1c 75 0c 48 8b 46 38 48 8b 40 70 <44> 8b 68 04 f6 43 03 01 75 1c 48 8b 43 38 44 0f b6 a0 89 00 00 00
[309049.036505] RSP: 0018:ffffb2e080fd7ce0 EFLAGS: 00010246
[309049.036513] RAX: 0000000000000000 RBX: ffffb2e0807c1000 RCX: 0000000000000000
[309049.036521] RDX: 0000000000000000 RSI: ffffb2e0807c1000 RDI: ffff990290259e00
[309049.036528] RBP: ffffb2e080fd7d08 R08: 0000000000000000 R09: 0000000000000000
[309049.036536] R10: 0000000000000000 R11: 0000000000000000 R12: ffff990290259e00
[309049.036543] R13: 000000000000001c R14: ffff990290259e00 R15: ffff99028e29c400
[309049.036551] FS: 00007b82cbc28140(0000) GS:ffff9903b3f00000(0000) knlGS:0000000000000000
[309049.036559] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[309049.036566] CR2: 0000000000000004 CR3: 0000000101286002 CR4: 00000000003706f0
[309049.036573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[309049.036581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[309049.036588] Call Trace:
[309049.036592] <TASK>
[309049.036597] ? show_regs+0x6d/0x80
[309049.036604] ? __die+0x24/0x80
[309049.036619] ? page_fault_oops+0x99/0x1b0
[309049.036628] ? do_user_addr_fault+0x2ee/0x6b0
[309049.036634] ? exc_page_fault+0x83/0x1b0
[309049.036641] ? asm_exc_page_fault+0x27/0x30
[309049.036649] ? bpf_prog_map_compatible+0x2a/0x140
[309049.036656] prog_fd_array_get_ptr+0x2c/0x70
[309049.036664] bpf_fd_array_map_update_elem+0x37/0x130
[309049.036671] bpf_map_update_value+0x1d3/0x260
[309049.036677] map_update_elem+0x1fa/0x360
[309049.036683] __sys_bpf+0x54c/0xa10
[309049.036689] __x64_sys_bpf+0x1a/0x30
[309049.036694] x64_sys_call+0x1936/0x25c0
[309049.036700] do_syscall_64+0x7f/0x180
[309049.036706] ? do_syscall_64+0x8c/0x180
[309049.036712] ? do_syscall_64+0x8c/0x180
[309049.036717] ? irqentry_exit+0x43/0x50
[309049.036723] ? common_interrupt+0x54/0xb0
[309049.036729] entry_SYSCALL_64_after_hwframe+0x73/0x7b

Why is 'prog->aux->dst_prog' of the freplace prog NULL? It is caused by commit 3aac1ea ("bpf: Move prog->aux->linked_prog and trampoline into bpf_link on attach"). Because 'prog->aux->dst_prog' is set to NULL on attach, an attached freplace prog no longer has a stable dst_prog type. Updating a PROG_ARRAY map with the freplace prog, however, requires checking its prog type, so the two requirements conflict.

This patch resolves the prog type of an freplace prog through 'prog->aux->saved_dst_prog_type' to avoid the panic.

Fixes: 1c123c5 ("bpf: Resolve fext program type when checking map compatibility")
Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
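The resolution through 'prog->aux->saved_dst_prog_type' can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel code: the structs are cut down to the fields named in the commit message, attach_model(), resolve_prog_type_model() and example_attached_freplace() are hypothetical names, and falling back to prog->type when no saved type is recorded is an assumption. The enum values are intended to mirror the UAPI enum bpf_prog_type (0x1c for BPF_PROG_TYPE_EXT is the value visible in R13 in the oops).

```c
#include <assert.h>
#include <stddef.h>

/* Values intended to mirror the UAPI enum bpf_prog_type. */
enum bpf_prog_type {
	BPF_PROG_TYPE_UNSPEC = 0,
	BPF_PROG_TYPE_SCHED_CLS = 3,
	BPF_PROG_TYPE_EXT = 28,
};

struct bpf_prog;

/* Cut-down model: only the two fields the commit message talks about. */
struct bpf_prog_aux {
	struct bpf_prog *dst_prog;              /* cleared on attach */
	enum bpf_prog_type saved_dst_prog_type; /* assumed to survive attach */
};

struct bpf_prog {
	enum bpf_prog_type type;
	struct bpf_prog_aux *aux;
};

/* Hypothetical model of attach: the target pointer is handed to the link. */
static void attach_model(struct bpf_prog *prog)
{
	prog->aux->dst_prog = NULL;
}

/*
 * Resolve the effective type without touching dst_prog, so a NULL
 * dst_prog can never be dereferenced. The fallback to prog->type is
 * an assumption of this sketch.
 */
static enum bpf_prog_type resolve_prog_type_model(const struct bpf_prog *prog)
{
	if (prog->type == BPF_PROG_TYPE_EXT && prog->aux->saved_dst_prog_type)
		return prog->aux->saved_dst_prog_type;
	return prog->type;
}

/* Usage: an freplace prog targeting a tc prog, resolved after attach. */
static void example_attached_freplace(void)
{
	struct bpf_prog tc = { .type = BPF_PROG_TYPE_SCHED_CLS, .aux = NULL };
	struct bpf_prog_aux aux = {
		.dst_prog = &tc,
		.saved_dst_prog_type = BPF_PROG_TYPE_SCHED_CLS,
	};
	struct bpf_prog ext = { .type = BPF_PROG_TYPE_EXT, .aux = &aux };

	attach_model(&ext);
	assert(ext.aux->dst_prog == NULL);
	assert(resolve_prog_type_model(&ext) == BPF_PROG_TYPE_SCHED_CLS);
}
```

In this model the pre-fix behaviour would correspond to reading ext.aux->dst_prog->type after attach_model(), which is the NULL dereference at offset 4 reported in the oops.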
Commit f7866c3 ("bpf: Fix null pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT") fixed the following panic, which was triggered by updating a PROG_ARRAY map with an attached freplace prog. However, it does not yet support updating a PROG_ARRAY map with an attached freplace prog.

[309049.036402] BUG: kernel NULL pointer dereference, address: 0000000000000004
[309049.036419] #PF: supervisor read access in kernel mode
[309049.036426] #PF: error_code(0x0000) - not-present page
[309049.036432] PGD 0 P4D 0
[309049.036437] Oops: 0000 [#1] PREEMPT SMP NOPTI
[309049.036444] CPU: 2 PID: 788148 Comm: test_progs Not tainted 6.8.0-31-generic #31-Ubuntu
[309049.036465] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023
[309049.036477] RIP: 0010:bpf_prog_map_compatible+0x2a/0x140
[309049.036488] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41 55 41 54 53 44 8b 6e 04 48 89 f3 41 83 fd 1c 75 0c 48 8b 46 38 48 8b 40 70 <44> 8b 68 04 f6 43 03 01 75 1c 48 8b 43 38 44 0f b6 a0 89 00 00 00
[309049.036505] RSP: 0018:ffffb2e080fd7ce0 EFLAGS: 00010246
[309049.036513] RAX: 0000000000000000 RBX: ffffb2e0807c1000 RCX: 0000000000000000
[309049.036521] RDX: 0000000000000000 RSI: ffffb2e0807c1000 RDI: ffff990290259e00
[309049.036528] RBP: ffffb2e080fd7d08 R08: 0000000000000000 R09: 0000000000000000
[309049.036536] R10: 0000000000000000 R11: 0000000000000000 R12: ffff990290259e00
[309049.036543] R13: 000000000000001c R14: ffff990290259e00 R15: ffff99028e29c400
[309049.036551] FS: 00007b82cbc28140(0000) GS:ffff9903b3f00000(0000) knlGS:0000000000000000
[309049.036559] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[309049.036566] CR2: 0000000000000004 CR3: 0000000101286002 CR4: 00000000003706f0
[309049.036573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[309049.036581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[309049.036588] Call Trace:
[309049.036592] <TASK>
[309049.036597] ? show_regs+0x6d/0x80
[309049.036604] ? __die+0x24/0x80
[309049.036619] ? page_fault_oops+0x99/0x1b0
[309049.036628] ? do_user_addr_fault+0x2ee/0x6b0
[309049.036634] ? exc_page_fault+0x83/0x1b0
[309049.036641] ? asm_exc_page_fault+0x27/0x30
[309049.036649] ? bpf_prog_map_compatible+0x2a/0x140
[309049.036656] prog_fd_array_get_ptr+0x2c/0x70
[309049.036664] bpf_fd_array_map_update_elem+0x37/0x130
[309049.036671] bpf_map_update_value+0x1d3/0x260
[309049.036677] map_update_elem+0x1fa/0x360
[309049.036683] __sys_bpf+0x54c/0xa10
[309049.036689] __x64_sys_bpf+0x1a/0x30
[309049.036694] x64_sys_call+0x1936/0x25c0
[309049.036700] do_syscall_64+0x7f/0x180
[309049.036706] ? do_syscall_64+0x8c/0x180
[309049.036712] ? do_syscall_64+0x8c/0x180
[309049.036717] ? irqentry_exit+0x43/0x50
[309049.036723] ? common_interrupt+0x54/0xb0
[309049.036729] entry_SYSCALL_64_after_hwframe+0x73/0x7b

Since commit 1c123c5 ("bpf: Resolve fext program type when checking map compatibility"), an freplace prog can be used as a tail-callee of its target prog. Commit 3aac1ea ("bpf: Move prog->aux->linked_prog and trampoline into bpf_link on attach") sets prog->aux->dst_prog to NULL when the freplace prog is attached to its target.
Consider the following example.

tailcall_freplace.c:

    // SPDX-License-Identifier: GPL-2.0
    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>
    #include "bpf_legacy.h"

    struct {
        __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
        __uint(max_entries, 1);
        __uint(key_size, sizeof(__u32));
        __uint(value_size, sizeof(__u32));
    } jmp_table SEC(".maps");

    int count = 0;

    __noinline
    int subprog(struct __sk_buff *skb)
    {
        volatile int ret = 1;

        count++;
        bpf_tail_call_static(skb, &jmp_table, 0);
        return ret;
    }

    SEC("freplace")
    int entry(struct __sk_buff *skb)
    {
        return subprog(skb);
    }

    char __license[] SEC("license") = "GPL";

tc_bpf2bpf.c:

    // SPDX-License-Identifier: GPL-2.0
    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>
    #include "bpf_legacy.h"

    __noinline
    int subprog(struct __sk_buff *skb)
    {
        volatile int ret = 1;

        return ret;
    }

    SEC("tc")
    int entry(struct __sk_buff *skb)
    {
        return subprog(skb);
    }

    char __license[] SEC("license") = "GPL";

The freplace entry prog's target is the tc subprog. After loading, the owner type of the freplace prog's jmp_table is BPF_PROG_TYPE_SCHED_CLS. Next, after the freplace prog is attached to the tc subprog, its prog->aux->dst_prog is NULL. Then, when jmp_table is updated with the freplace prog, bpf_prog_map_compatible() returns false because resolve_prog_type() returns BPF_PROG_TYPE_EXT instead of BPF_PROG_TYPE_SCHED_CLS.

With this patch, resolve_prog_type() returns BPF_PROG_TYPE_SCHED_CLS, so updating a PROG_ARRAY map with an attached freplace prog is supported for this example.

Fixes: f7866c3 ("bpf: Fix null pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT")
Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
Commit f7866c3 ("bpf: Fix null pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT") fixed the following panic, which was triggered by updating a PROG_ARRAY map with an attached freplace prog. However, it does not yet support updating a PROG_ARRAY map with an attached freplace prog.

[309049.036402] BUG: kernel NULL pointer dereference, address: 0000000000000004
[309049.036419] #PF: supervisor read access in kernel mode
[309049.036426] #PF: error_code(0x0000) - not-present page
[309049.036432] PGD 0 P4D 0
[309049.036437] Oops: 0000 [#1] PREEMPT SMP NOPTI
[309049.036444] CPU: 2 PID: 788148 Comm: test_progs Not tainted 6.8.0-31-generic #31-Ubuntu
[309049.036465] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023
[309049.036477] RIP: 0010:bpf_prog_map_compatible+0x2a/0x140
[309049.036488] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41 55 41 54 53 44 8b 6e 04 48 89 f3 41 83 fd 1c 75 0c 48 8b 46 38 48 8b 40 70 <44> 8b 68 04 f6 43 03 01 75 1c 48 8b 43 38 44 0f b6 a0 89 00 00 00
[309049.036505] RSP: 0018:ffffb2e080fd7ce0 EFLAGS: 00010246
[309049.036513] RAX: 0000000000000000 RBX: ffffb2e0807c1000 RCX: 0000000000000000
[309049.036521] RDX: 0000000000000000 RSI: ffffb2e0807c1000 RDI: ffff990290259e00
[309049.036528] RBP: ffffb2e080fd7d08 R08: 0000000000000000 R09: 0000000000000000
[309049.036536] R10: 0000000000000000 R11: 0000000000000000 R12: ffff990290259e00
[309049.036543] R13: 000000000000001c R14: ffff990290259e00 R15: ffff99028e29c400
[309049.036551] FS: 00007b82cbc28140(0000) GS:ffff9903b3f00000(0000) knlGS:0000000000000000
[309049.036559] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[309049.036566] CR2: 0000000000000004 CR3: 0000000101286002 CR4: 00000000003706f0
[309049.036573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[309049.036581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[309049.036588] Call Trace:
[309049.036592] <TASK>
[309049.036597] ? show_regs+0x6d/0x80
[309049.036604] ? __die+0x24/0x80
[309049.036619] ? page_fault_oops+0x99/0x1b0
[309049.036628] ? do_user_addr_fault+0x2ee/0x6b0
[309049.036634] ? exc_page_fault+0x83/0x1b0
[309049.036641] ? asm_exc_page_fault+0x27/0x30
[309049.036649] ? bpf_prog_map_compatible+0x2a/0x140
[309049.036656] prog_fd_array_get_ptr+0x2c/0x70
[309049.036664] bpf_fd_array_map_update_elem+0x37/0x130
[309049.036671] bpf_map_update_value+0x1d3/0x260
[309049.036677] map_update_elem+0x1fa/0x360
[309049.036683] __sys_bpf+0x54c/0xa10
[309049.036689] __x64_sys_bpf+0x1a/0x30
[309049.036694] x64_sys_call+0x1936/0x25c0
[309049.036700] do_syscall_64+0x7f/0x180
[309049.036706] ? do_syscall_64+0x8c/0x180
[309049.036712] ? do_syscall_64+0x8c/0x180
[309049.036717] ? irqentry_exit+0x43/0x50
[309049.036723] ? common_interrupt+0x54/0xb0
[309049.036729] entry_SYSCALL_64_after_hwframe+0x73/0x7b

Since commit 1c123c5 ("bpf: Resolve fext program type when checking map compatibility"), an freplace prog can be used as a tail-callee of its target prog. Commit 3aac1ea ("bpf: Move prog->aux->linked_prog and trampoline into bpf_link on attach") sets prog->aux->dst_prog to NULL when the freplace prog is attached to its target.
Consider the following example.

tailcall_freplace.c:

    // SPDX-License-Identifier: GPL-2.0
    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>
    #include "bpf_legacy.h"

    struct {
        __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
        __uint(max_entries, 1);
        __uint(key_size, sizeof(__u32));
        __uint(value_size, sizeof(__u32));
    } jmp_table SEC(".maps");

    int count = 0;

    __noinline
    int subprog(struct __sk_buff *skb)
    {
        volatile int ret = 1;

        count++;
        bpf_tail_call_static(skb, &jmp_table, 0);
        return ret;
    }

    SEC("freplace")
    int entry(struct __sk_buff *skb)
    {
        return subprog(skb);
    }

    char __license[] SEC("license") = "GPL";

tc_bpf2bpf.c:

    // SPDX-License-Identifier: GPL-2.0
    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>
    #include "bpf_legacy.h"

    __noinline
    int subprog(struct __sk_buff *skb)
    {
        volatile int ret = 1;

        return ret;
    }

    SEC("tc")
    int entry(struct __sk_buff *skb)
    {
        return subprog(skb);
    }

    char __license[] SEC("license") = "GPL";

The freplace entry prog's target is the tc subprog. After loading, the owner type of the freplace prog's jmp_table is BPF_PROG_TYPE_SCHED_CLS. Next, after the freplace prog is attached to the tc subprog, its prog->aux->dst_prog is NULL. Then, when jmp_table is updated with the freplace prog, bpf_prog_map_compatible() returns false because resolve_prog_type() returns BPF_PROG_TYPE_EXT instead of BPF_PROG_TYPE_SCHED_CLS.

With this patch, resolve_prog_type() returns BPF_PROG_TYPE_SCHED_CLS, so updating a PROG_ARRAY map with an attached freplace prog is supported for this example.

Fixes: f7866c3 ("bpf: Fix null pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT")
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
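The owner-type mismatch described in this commit message can be reproduced in a small userspace model. This is only a sketch, assuming the behaviour the message describes: struct prog_model, struct prog_array_model, resolve_before(), resolve_after() and map_compatible_model() are hypothetical names, resolve_before() encodes my reading of "returns BPF_PROG_TYPE_EXT once dst_prog is NULL", and only the prog type is compared, whereas the real bpf_prog_map_compatible() checks more than that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values intended to mirror the UAPI enum bpf_prog_type. */
enum bpf_prog_type {
	BPF_PROG_TYPE_UNSPEC = 0,
	BPF_PROG_TYPE_SCHED_CLS = 3,
	BPF_PROG_TYPE_EXT = 28,
};

/* Hypothetical flattened model of a prog and its aux fields. */
struct prog_model {
	enum bpf_prog_type type;
	const struct prog_model *dst_prog;      /* NULL after attach */
	enum bpf_prog_type saved_dst_prog_type; /* recorded before attach */
};

/* Hypothetical model of a PROG_ARRAY map: only the owner type. */
struct prog_array_model {
	enum bpf_prog_type owner_type;
};

typedef enum bpf_prog_type (*resolve_fn)(const struct prog_model *);

/* Described pre-patch behaviour: NULL-safe, but yields EXT after attach. */
static enum bpf_prog_type resolve_before(const struct prog_model *p)
{
	return (p->type == BPF_PROG_TYPE_EXT && p->dst_prog) ?
		p->dst_prog->type : p->type;
}

/* Described post-patch behaviour: use the type saved before attach. */
static enum bpf_prog_type resolve_after(const struct prog_model *p)
{
	return (p->type == BPF_PROG_TYPE_EXT && p->saved_dst_prog_type) ?
		p->saved_dst_prog_type : p->type;
}

/* Type-only stand-in for the map compatibility check. */
static bool map_compatible_model(const struct prog_array_model *map,
				 const struct prog_model *p, resolve_fn resolve)
{
	return map->owner_type == resolve(p);
}

/* Usage: an attached freplace prog checked against a tc-owned jmp_table. */
static void example_update_jmp_table(void)
{
	struct prog_array_model jmp_table = { .owner_type = BPF_PROG_TYPE_SCHED_CLS };
	struct prog_model ext = {
		.type = BPF_PROG_TYPE_EXT,
		.dst_prog = NULL, /* already attached */
		.saved_dst_prog_type = BPF_PROG_TYPE_SCHED_CLS,
	};

	assert(!map_compatible_model(&jmp_table, &ext, resolve_before));
	assert(map_compatible_model(&jmp_table, &ext, resolve_after));
}
```

Passing the resolver as a function pointer keeps the two behaviours side by side, so the same jmp_table and prog values show the rejected update before the patch and the accepted update after it.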
…er dereference
A malicious HID device with quirk APPLE_MAGIC_BACKLIGHT can trigger a NULL
pointer dereference whilst the power feature-report is toggled and sent to
the device in apple_magic_backlight_report_set(). The power feature-report
is expected to have two data fields, but if the descriptor declares one
field then accessing field[1] and dereferencing it in
apple_magic_backlight_report_set() becomes invalid
since field[1] will be NULL.
An example of a minimal descriptor which can cause the crash is something
like the following where the report with ID 3 (power report) only
references a single 1-byte field. When hid core parses the descriptor it
will encounter the final feature tag, allocate a hid_report (all members
of field[] will be zeroed out), create field structure and populate it,
increasing the maxfield to 1. The subsequent field[1] access and
dereference causes the crash.
Usage Page (Vendor Defined 0xFF00)
Usage (0x0F)
Collection (Application)
Report ID (1)
Usage (0x01)
Logical Minimum (0)
Logical Maximum (255)
Report Size (8)
Report Count (1)
Feature (Data,Var,Abs)
Usage (0x02)
Logical Maximum (32767)
Report Size (16)
Report Count (1)
Feature (Data,Var,Abs)
Report ID (3)
Usage (0x03)
Logical Minimum (0)
Logical Maximum (1)
Report Size (8)
Report Count (1)
Feature (Data,Var,Abs)
End Collection
Here we see the KASAN splat when the kernel dereferences the
NULL pointer and crashes:
[ 15.164723] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] SMP KASAN NOPTI
[ 15.165691] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
[ 15.165691] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Not tainted 6.15.0 #31 PREEMPT(voluntary)
[ 15.165691] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 15.165691] RIP: 0010:apple_magic_backlight_report_set+0xbf/0x210
[ 15.165691] Call Trace:
[ 15.165691] <TASK>
[ 15.165691] apple_probe+0x571/0xa20
[ 15.165691] hid_device_probe+0x2e2/0x6f0
[ 15.165691] really_probe+0x1ca/0x5c0
[ 15.165691] __driver_probe_device+0x24f/0x310
[ 15.165691] driver_probe_device+0x4a/0xd0
[ 15.165691] __device_attach_driver+0x169/0x220
[ 15.165691] bus_for_each_drv+0x118/0x1b0
[ 15.165691] __device_attach+0x1d5/0x380
[ 15.165691] device_initial_probe+0x12/0x20
[ 15.165691] bus_probe_device+0x13d/0x180
[ 15.165691] device_add+0xd87/0x1510
[...]
To fix this issue we should validate the number of fields that the
backlight and power reports have and if they do not have the required
number of fields then bail.
Fixes: 394ba61 ("HID: apple: Add support for magic keyboard backlight on T2 Macs")
Cc: stable@vger.kernel.org
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Reviewed-by: Orlando Chamberlain <orlandoch.dev@gmail.com>
Tested-by: Aditya Garg <gargaditya08@live.com>
Link: https://patch.msgid.link/20250713233008.15131-1-qasdev00@gmail.com
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
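The field-count validation described in the HID commit above can be illustrated with a small user-space sketch. This is not the driver code: `struct sketch_report`, `struct sketch_field`, `report_has_fields()` and `power_report_set()` are hypothetical stand-ins that only mirror the idea of checking `maxfield` before touching `field[1]`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the HID core report structures. */
struct sketch_field {
	int value;
};

struct sketch_report {
	unsigned int maxfield;          /* number of populated field[] entries */
	struct sketch_field *field[4];  /* unpopulated entries stay NULL */
};

/* True only if the report exists and declares at least `needed` fields. */
static bool report_has_fields(const struct sketch_report *rep,
			      unsigned int needed)
{
	return rep != NULL && rep->maxfield >= needed;
}

/*
 * Store the power state in field[1]. Returns -1 without touching the report
 * when fewer than two fields were declared, instead of dereferencing NULL.
 */
static int power_report_set(struct sketch_report *rep, int on)
{
	if (!report_has_fields(rep, 2))
		return -1;
	rep->field[1]->value = on;
	return 0;
}

/* Usage check: a two-field report is accepted, a one-field one is refused. */
static void hid_self_check(void)
{
	struct sketch_field f0 = { 0 }, f1 = { 0 };
	struct sketch_report good = { .maxfield = 2, .field = { &f0, &f1 } };
	struct sketch_report bad = { .maxfield = 1, .field = { &f0 } };

	assert(power_report_set(&good, 1) == 0);
	assert(f1.value == 1);
	assert(power_report_set(&bad, 1) == -1);
}
```

In this sketch the one-field report corresponds to the malicious descriptor shown above, where report ID 3 declares a single field and `field[1]` is left NULL.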
Without the change `perf` hangs up on character devices. On my system
it's enough to run a system-wide sampler for a few seconds to get the
hangup:
$ perf record -a -g --call-graph=dwarf
$ perf report
# hung
`strace` shows that the hangup happens while reading a character device,
`/dev/dri/renderD128`:
$ strace -y -f -p 2780484
strace: Process 2780484 attached
pread64(101</dev/dri/renderD128>, strace: Process 2780484 detached
Its call trace descends into `elfutils`:
$ gdb -p 2780484
(gdb) bt
#0 0x00007f5e508f04b7 in __libc_pread64 (fd=101, buf=0x7fff9df7edb0, count=0, offset=0)
at ../sysdeps/unix/sysv/linux/pread64.c:25
#1 0x00007f5e52b79515 in read_file () from /<<NIX>>/elfutils-0.192/lib/libelf.so.1
#2 0x00007f5e52b25666 in libdw_open_elf () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#3 0x00007f5e52b25907 in __libdw_open_file () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#4 0x00007f5e52b120a9 in dwfl_report_elf@@ELFUTILS_0.156 ()
from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#5 0x000000000068bf20 in __report_module (al=al@entry=0x7fff9df80010, ip=ip@entry=139803237033216, ui=ui@entry=0x5369b5e0)
at util/dso.h:537
#6 0x000000000068c3d1 in report_module (ip=139803237033216, ui=0x5369b5e0) at util/unwind-libdw.c:114
#7 frame_callback (state=0x535aef10, arg=0x5369b5e0) at util/unwind-libdw.c:242
#8 0x00007f5e52b261d3 in dwfl_thread_getframes () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#9 0x00007f5e52b25bdb in get_one_thread_cb () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#10 0x00007f5e52b25faa in dwfl_getthreads () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#11 0x00007f5e52b26514 in dwfl_getthread_frames () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#12 0x000000000068c6ce in unwind__get_entries (cb=cb@entry=0x5d4620 <unwind_entry>, arg=arg@entry=0x10cd5fa0,
thread=thread@entry=0x1076a290, data=data@entry=0x7fff9df80540, max_stack=max_stack@entry=127,
best_effort=best_effort@entry=false) at util/thread.h:152
#13 0x00000000005dae95 in thread__resolve_callchain_unwind (evsel=0x106006d0, thread=0x1076a290, cursor=0x10cd5fa0,
sample=0x7fff9df80540, max_stack=127, symbols=true) at util/machine.c:2939
#14 thread__resolve_callchain_unwind (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, sample=0x7fff9df80540,
max_stack=127, symbols=true) at util/machine.c:2920
#15 __thread__resolve_callchain (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, evsel@entry=0x7fff9df80440,
sample=0x7fff9df80540, parent=parent@entry=0x7fff9df804a0, root_al=root_al@entry=0x7fff9df80440, max_stack=127, symbols=true)
at util/machine.c:2970
#16 0x00000000005d0cb2 in thread__resolve_callchain (thread=<optimized out>, cursor=<optimized out>, evsel=0x7fff9df80440,
sample=<optimized out>, parent=0x7fff9df804a0, root_al=0x7fff9df80440, max_stack=127) at util/machine.h:198
#17 sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fff9df804a0,
evsel=evsel@entry=0x106006d0, al=al@entry=0x7fff9df80440, max_stack=max_stack@entry=127) at util/callchain.c:1127
#18 0x0000000000617e08 in hist_entry_iter__add (iter=iter@entry=0x7fff9df80480, al=al@entry=0x7fff9df80440, max_stack_depth=127,
arg=arg@entry=0x7fff9df81ae0) at util/hist.c:1255
#19 0x000000000045d2d0 in process_sample_event (tool=0x7fff9df81ae0, event=<optimized out>, sample=0x7fff9df80540,
evsel=0x106006d0, machine=<optimized out>) at builtin-report.c:334
#20 0x00000000005e3bb1 in perf_session__deliver_event (session=0x105ff2c0, event=0x7f5c7d735ca0, tool=0x7fff9df81ae0,
file_offset=2914716832, file_path=0x105ffbf0 "perf.data") at util/session.c:1367
#21 0x00000000005e8d93 in do_flush (oe=0x105ffa50, show_progress=false) at util/ordered-events.c:245
#22 __ordered_events__flush (oe=0x105ffa50, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:324
#23 0x00000000005e1f64 in perf_session__process_user_event (session=0x105ff2c0, event=0x7f5c7d752b18, file_offset=2914835224,
file_path=0x105ffbf0 "perf.data") at util/session.c:1419
#24 0x00000000005e47c7 in reader__read_event (rd=rd@entry=0x7fff9df81260, session=session@entry=0x105ff2c0,
prog=prog@entry=0x7fff9df81220) at util/session.c:2132
#25 0x00000000005e4b37 in reader__process_events (rd=0x7fff9df81260, session=0x105ff2c0, prog=0x7fff9df81220)
at util/session.c:2181
#26 __perf_session__process_events (session=0x105ff2c0) at util/session.c:2226
#27 perf_session__process_events (session=session@entry=0x105ff2c0) at util/session.c:2390
#28 0x0000000000460add in __cmd_report (rep=0x7fff9df81ae0) at builtin-report.c:1076
#29 cmd_report (argc=<optimized out>, argv=<optimized out>) at builtin-report.c:1827
#30 0x00000000004c5a40 in run_builtin (p=p@entry=0xd8f7f8 <commands+312>, argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0)
at perf.c:351
#31 0x00000000004c5d63 in handle_internal_command (argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0) at perf.c:404
#32 0x0000000000442de3 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:448
#33 main (argc=<optimized out>, argv=0x7fff9df844b0) at perf.c:556
The hangup happens because nothing in `perf` or `elfutils` checks if a
mapped file is easily readable.
The change conservatively skips all non-regular files.
Signed-off-by: Sergei Trofimovich <slyich@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250505174419.2814857-1-slyich@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
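A check of the kind the perf commit above describes, skipping anything that is not a regular file, could look roughly like the following. This is a minimal sketch built on POSIX `stat()` and `S_ISREG()`. The helper name `is_regular_file()` is an assumption for illustration and is not taken from the perf sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Return true only for regular files. Character devices, FIFOs, sockets,
 * directories and paths that cannot be stat()ed are all rejected.
 */
static bool is_regular_file(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return false;
	return S_ISREG(st.st_mode);
}

/* Usage check: a character device and a directory must both be skipped. */
static void regular_file_self_check(void)
{
	assert(!is_regular_file("/dev/null"));
	assert(!is_regular_file("/"));
}
```

Rejecting every non-regular file is deliberately conservative: it avoids having to decide which device nodes are safe to read.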
Test tries to build:
struct s {
void __attribute__((btf_type_tag("void_tag"))) *p1;
void __attribute__((void_attr)) *p2;
};
but hard-codes struct size to 8 while using default host-sized pointers
and overlaying both pointers at struct start.
Above is broken but only fails on 32-bit hosts:
[...]
test_ctx__dump_and_compare:FAIL:dump_and_compare unexpected dump_and_compare: actual 'struct s {
void __attribute__((btf_type_tag("void_tag"))) *p1;
void __attribute__((void_attr)) *p2;
long: 32;
};
' != expected 'struct s {
void __attribute__((btf_type_tag("void_tag"))) *p1;
void __attribute__((void_attr)) *p2;
};
'
[...]
#31/9 btf_dump/btf_dump: type_tags:FAIL
Resolve by sizing struct to fit 2 default-sized pointers which are laid
out sequentially.
Fixes: 6c2d2a0 ("selftests/bpf: Add a btf_dump test for type_tags")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
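The sizing rule from the type_tags fix above (two default-sized pointers laid out sequentially) can be sketched as follows. The helper `two_ptr_layout_for()` and its result type are hypothetical. Offsets are given in bits, as BTF member offsets are. With 4-byte pointers, a struct size of 8 is only correct if `p2` starts at bit 32. Overlaying both pointers at offset 0 leaves 4 unused bytes, which is presumably why the failing dump shows `long: 32;`.

```c
#include <assert.h>
#include <stddef.h>

struct s {
	void *p1;
	void *p2;
};

/* Hypothetical helper type: byte size plus member offsets in bits. */
struct two_ptr_layout {
	size_t size;
	size_t p1_bit_off;
	size_t p2_bit_off;
};

/* Layout of two sequential pointers for a given pointer size in bytes. */
static struct two_ptr_layout two_ptr_layout_for(size_t ptr_size)
{
	struct two_ptr_layout l = {
		.size = 2 * ptr_size,
		.p1_bit_off = 0,
		.p2_bit_off = 8 * ptr_size,
	};
	return l;
}

/* Usage check: the computed layout matches what the host compiler does. */
static void layout_self_check(void)
{
	struct two_ptr_layout host = two_ptr_layout_for(sizeof(void *));

	assert(host.size == sizeof(struct s));
	assert(host.p2_bit_off == 8 * offsetof(struct s, p2));
	assert(two_ptr_layout_for(4).size == 8);   /* 32-bit pointers */
	assert(two_ptr_layout_for(8).size == 16);  /* 64-bit pointers */
}
```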
In btf_dump_enum_data() an enum value is declared as 64-bit but passed for
dumping output using an 'int' format specifier. After a few macro layers
with var args, vsnprintf() then reads only part of the 64-bit value from
its va_list and misreads the strings that follow, resulting in garbage
output and failed tests in test_progs:
[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(enum bpf_cmd)2000(null)' != expected '(enum bpf_cmd)2000'
[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '2000(null)' != expected '2000'
[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(enum bpf_cmd)2000(null)' != expected '(enum bpf_cmd)2000'
[...]
#31/14 btf_dump/btf_dump: enum_data:FAIL
Resolve by explicitly using 64-bit output specifiers (e.g. "%llu").
Fixes: 920d16a ("libbpf: BTF dumper support for typed data")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
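The format-specifier mismatch in the enum_data fix above can be illustrated with plain `snprintf()`. This is a sketch, not the libbpf code: `format_enum_value()` is a hypothetical helper, and the trailing `%s` stands in for the "later strings" that were misread once the 64-bit argument was consumed as an `int`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a 64-bit value followed by a string. Using "%d%s" here with a
 * 64-bit argument would be undefined behaviour: on some 32-bit ABIs only
 * half of the value is consumed and the "%s" then reads the wrong slot.
 * "%llu" with an explicit cast keeps the argument list and format in sync.
 */
static int format_enum_value(char *buf, size_t len, uint64_t value,
			     const char *suffix)
{
	return snprintf(buf, len, "%llu%s", (unsigned long long)value, suffix);
}

/* Usage check: 2000 with an empty suffix is printed as exactly "2000". */
static void format_self_check(void)
{
	char buf[32];

	format_enum_value(buf, sizeof(buf), 2000, "");
	assert(strcmp(buf, "2000") == 0);
}
```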
Allow the test dumping BTF of 'struct file_operations' to succeed on 32-bit
targets by accounting for varying pointer size in the expected dump output.
This avoids failures in test_progs like:
[...]
test_btf_dump_struct_data:FAIL:file_operations unexpected file_operations: actual '(struct file_operations){
.owner = (struct module *)0xffffffff,
.fop_flags = (fop_flags_t)4294967295,
.llsee' != expected '(struct file_operations){
.owner = (struct module *)0xffffffffffffffff,
.fop_flags = (fop_flags_t)4294967295,'
[...]
#31/15 btf_dump/btf_dump: struct_data:FAIL
Fixes: 70a9241 ("selftests/bpf: Add dump type data tests to btf dump tests")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
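One way to account for pointer size in an expected dump string, as the struct_data fix above describes, is to derive the all-ones pointer literal from the pointer width instead of hard-coding it. The helper `format_all_ones_ptr()` below is a hypothetical sketch and is not taken from the selftest.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Write "0x" followed by two 'f' digits per pointer byte into buf.
 * Returns 0 on success, -1 if buf is too small for the result.
 */
static int format_all_ones_ptr(char *buf, size_t len, size_t ptr_size)
{
	size_t digits = 2 * ptr_size;   /* two hex digits per byte */

	if (len < 2 + digits + 1)
		return -1;
	buf[0] = '0';
	buf[1] = 'x';
	memset(buf + 2, 'f', digits);
	buf[2 + digits] = '\0';
	return 0;
}

/* Usage check: 32-bit and 64-bit targets get the strings seen in the log. */
static void ptr_format_self_check(void)
{
	char buf[32];

	assert(format_all_ones_ptr(buf, sizeof(buf), 4) == 0);
	assert(strcmp(buf, "0xffffffff") == 0);
	assert(format_all_ones_ptr(buf, sizeof(buf), 8) == 0);
	assert(strcmp(buf, "0xffffffffffffffff") == 0);
}
```

Passing `sizeof(void *)` as `ptr_size` yields the literal that matches the target the test was built for.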
…l_access()
The functions copy_from_user() and copy_to_user() may sleep on a page
fault, so they must not be called while a spinlock is held. Move the calls
to copy_from_user() and copy_to_user() before the spinlock-protected region
in kvm_eiointc_ctrl_access().
Otherwise a warning such as the following is possible:
BUG: sleeping function called from invalid context at include/linux/uaccess.h:192
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 6292, name: qemu-system-loo
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last enabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 41 UID: 0 PID: 6292 Comm: qemu-system-loo Tainted: G W 6.17.0-rc3+ #31 PREEMPT(full)
Tainted: [W]=WARN
Stack : 0000000000000076 0000000000000000 9000000004c28264 9000100092ff4000
        9000100092ff7b80 9000100092ff7b88 0000000000000000 9000100092ff7cc8
        9000100092ff7cc0 9000100092ff7cc0 9000100092ff7a00 0000000000000001
        0000000000000001 9000100092ff7b88 947d2f9216a5e8b9 900010008773d880
        00000000ffff8b9f fffffffffffffffe 0000000000000ba1 fffffffffffffffe
        000000000000003e 900000000825a15b 000010007ad38000 9000100092ff7ec0
        0000000000000000 0000000000000000 9000000006f3ac60 9000000007252000
        0000000000000000 00007ff746ff2230 0000000000000053 9000200088a021b0
        0000555556c9d190 0000000000000000 9000000004c2827c 000055556cfb5f40
        00000000000000b0 0000000000000007 0000000000000007 0000000000071c1d
Call Trace:
[<9000000004c2827c>] show_stack+0x5c/0x180
[<9000000004c20fac>] dump_stack_lvl+0x94/0xe4
[<9000000004c99c7c>] __might_resched+0x26c/0x290
[<9000000004f68968>] __might_fault+0x20/0x88
[<ffff800002311de0>] kvm_eiointc_ctrl_access.isra.0+0x88/0x380 [kvm]
[<ffff8000022f8514>] kvm_device_ioctl+0x194/0x290 [kvm]
[<900000000506b0d8>] sys_ioctl+0x388/0x1010
[<90000000063ed210>] do_syscall+0xb0/0x2d8
[<9000000004c25ef8>] handle_syscall+0xb8/0x158
Cc: stable@vger.kernel.org
Fixes: 1ad7efa ("LoongArch: KVM: Add EIOINTC user mode read and write functions")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
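The pattern behind this LoongArch KVM fix (do the potentially sleeping user copy first, then take the lock only around the device state update) can be sketched in user space. Everything here is an analogy under stated assumptions: `memcpy()` stands in for copy_from_user()/copy_to_user(), a C11 `atomic_flag` stands in for the kernel spinlock, and `struct sketch_dev` with its helpers is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

#define SKETCH_NREGS 4

/* Hypothetical device state guarded by a spinlock-like flag. */
struct sketch_dev {
	atomic_flag lock;
	unsigned int regs[SKETCH_NREGS];
};

static void sketch_lock(struct sketch_dev *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		;  /* spin until the flag is released */
}

static void sketch_unlock(struct sketch_dev *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* Write path: copy from the caller's buffer BEFORE entering the lock. */
static int sketch_reg_write(struct sketch_dev *d, unsigned int idx,
			    const void *user_src)
{
	unsigned int val;

	if (idx >= SKETCH_NREGS)
		return -1;
	memcpy(&val, user_src, sizeof(val));  /* stand-in for copy_from_user() */
	sketch_lock(d);
	d->regs[idx] = val;                   /* only device state under lock */
	sketch_unlock(d);
	return 0;
}

/* Read path: snapshot under the lock, copy out AFTER releasing it. */
static int sketch_reg_read(struct sketch_dev *d, unsigned int idx,
			   void *user_dst)
{
	unsigned int val;

	if (idx >= SKETCH_NREGS)
		return -1;
	sketch_lock(d);
	val = d->regs[idx];
	sketch_unlock(d);
	memcpy(user_dst, &val, sizeof(val));  /* stand-in for copy_to_user() */
	return 0;
}
```

The local variable `val` is the staging buffer: the critical section never touches caller-supplied memory, so nothing inside it can fault or sleep.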
…s_access()
The functions copy_from_user() and copy_to_user() may sleep on a page
fault, so they must not be called while a spinlock is held. Move the calls
to copy_from_user() and copy_to_user() before the spinlock-protected region
in kvm_eiointc_regs_access().
Otherwise a warning such as the following is possible:
BUG: sleeping function called from invalid context at include/linux/uaccess.h:192
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 6292, name: qemu-system-loo
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last enabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 41 UID: 0 PID: 6292 Comm: qemu-system-loo Tainted: G W 6.17.0-rc3+ #31 PREEMPT(full)
Tainted: [W]=WARN
Stack : 0000000000000076 0000000000000000 9000000004c28264 9000100092ff4000
        9000100092ff7b80 9000100092ff7b88 0000000000000000 9000100092ff7cc8
        9000100092ff7cc0 9000100092ff7cc0 9000100092ff7a00 0000000000000001
        0000000000000001 9000100092ff7b88 947d2f9216a5e8b9 900010008773d880
        00000000ffff8b9f fffffffffffffffe 0000000000000ba1 fffffffffffffffe
        000000000000003e 900000000825a15b 000010007ad38000 9000100092ff7ec0
        0000000000000000 0000000000000000 9000000006f3ac60 9000000007252000
        0000000000000000 00007ff746ff2230 0000000000000053 9000200088a021b0
        0000555556c9d190 0000000000000000 9000000004c2827c 000055556cfb5f40
        00000000000000b0 0000000000000007 0000000000000007 0000000000071c1d
Call Trace:
[<9000000004c2827c>] show_stack+0x5c/0x180
[<9000000004c20fac>] dump_stack_lvl+0x94/0xe4
[<9000000004c99c7c>] __might_resched+0x26c/0x290
[<9000000004f68968>] __might_fault+0x20/0x88
[<ffff800002311de0>] kvm_eiointc_regs_access.isra.0+0x88/0x380 [kvm]
[<ffff8000022f8514>] kvm_device_ioctl+0x194/0x290 [kvm]
[<900000000506b0d8>] sys_ioctl+0x388/0x1010
[<90000000063ed210>] do_syscall+0xb0/0x2d8
[<9000000004c25ef8>] handle_syscall+0xb8/0x158
Cc: stable@vger.kernel.org
Fixes: 1ad7efa ("LoongArch: KVM: Add EIOINTC user mode read and write functions")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
…status_access()
The functions copy_from_user() and copy_to_user() may sleep on a page
fault, so they must not be called while a spinlock is held. Move the calls
to copy_from_user() and copy_to_user() out of
kvm_eiointc_sw_status_access().
Otherwise a warning such as the following is possible:
BUG: sleeping function called from invalid context at include/linux/uaccess.h:192
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 6292, name: qemu-system-loo
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last enabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 41 UID: 0 PID: 6292 Comm: qemu-system-loo Tainted: G W 6.17.0-rc3+ #31 PREEMPT(full)
Tainted: [W]=WARN
Stack : 0000000000000076 0000000000000000 9000000004c28264 9000100092ff4000
        9000100092ff7b80 9000100092ff7b88 0000000000000000 9000100092ff7cc8
        9000100092ff7cc0 9000100092ff7cc0 9000100092ff7a00 0000000000000001
        0000000000000001 9000100092ff7b88 947d2f9216a5e8b9 900010008773d880
        00000000ffff8b9f fffffffffffffffe 0000000000000ba1 fffffffffffffffe
        000000000000003e 900000000825a15b 000010007ad38000 9000100092ff7ec0
        0000000000000000 0000000000000000 9000000006f3ac60 9000000007252000
        0000000000000000 00007ff746ff2230 0000000000000053 9000200088a021b0
        0000555556c9d190 0000000000000000 9000000004c2827c 000055556cfb5f40
        00000000000000b0 0000000000000007 0000000000000007 0000000000071c1d
Call Trace:
[<9000000004c2827c>] show_stack+0x5c/0x180
[<9000000004c20fac>] dump_stack_lvl+0x94/0xe4
[<9000000004c99c7c>] __might_resched+0x26c/0x290
[<9000000004f68968>] __might_fault+0x20/0x88
[<ffff800002311de0>] kvm_eiointc_sw_status_access.isra.0+0x88/0x380 [kvm]
[<ffff8000022f8514>] kvm_device_ioctl+0x194/0x290 [kvm]
[<900000000506b0d8>] sys_ioctl+0x388/0x1010
[<90000000063ed210>] do_syscall+0xb0/0x2d8
[<9000000004c25ef8>] handle_syscall+0xb8/0x158
Cc: stable@vger.kernel.org
Fixes: 1ad7efa ("LoongArch: KVM: Add EIOINTC user mode read and write functions")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
…s_access()
The functions copy_from_user() and copy_to_user() may sleep on a page
fault, so they must not be called while a spinlock is held. Move the calls
to copy_from_user() and copy_to_user() out of the spinlock-protected region
in kvm_pch_pic_regs_access().
Otherwise a warning such as the following is possible:
BUG: sleeping function called from invalid context at include/linux/uaccess.h:192
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 6292, name: qemu-system-loo
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last enabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 41 UID: 0 PID: 6292 Comm: qemu-system-loo Tainted: G W 6.17.0-rc3+ #31 PREEMPT(full)
Tainted: [W]=WARN
Stack : 0000000000000076 0000000000000000 9000000004c28264 9000100092ff4000
        9000100092ff7b80 9000100092ff7b88 0000000000000000 9000100092ff7cc8
        9000100092ff7cc0 9000100092ff7cc0 9000100092ff7a00 0000000000000001
        0000000000000001 9000100092ff7b88 947d2f9216a5e8b9 900010008773d880
        00000000ffff8b9f fffffffffffffffe 0000000000000ba1 fffffffffffffffe
        000000000000003e 900000000825a15b 000010007ad38000 9000100092ff7ec0
        0000000000000000 0000000000000000 9000000006f3ac60 9000000007252000
        0000000000000000 00007ff746ff2230 0000000000000053 9000200088a021b0
        0000555556c9d190 0000000000000000 9000000004c2827c 000055556cfb5f40
        00000000000000b0 0000000000000007 0000000000000007 0000000000071c1d
Call Trace:
[<9000000004c2827c>] show_stack+0x5c/0x180
[<9000000004c20fac>] dump_stack_lvl+0x94/0xe4
[<9000000004c99c7c>] __might_resched+0x26c/0x290
[<9000000004f68968>] __might_fault+0x20/0x88
[<ffff800002311de0>] kvm_pch_pic_regs_access.isra.0+0x88/0x380 [kvm]
[<ffff8000022f8514>] kvm_device_ioctl+0x194/0x290 [kvm]
[<900000000506b0d8>] sys_ioctl+0x388/0x1010
[<90000000063ed210>] do_syscall+0xb0/0x2d8
[<9000000004c25ef8>] handle_syscall+0xb8/0x158
Cc: stable@vger.kernel.org
Fixes: d206d95 ("LoongArch: KVM: Add PCHPIC user mode read and write functions")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Commit 2a6c727 ("cpufreq: Initialize cpufreq-based frequency-invariance later")
postponed the frequency invariance initialization to avoid disabling it in
the error case. This isn't locking safe; instead, move the initialization
up before the subsys interface is registered (which will rebuild the
sched_domains) and add the corresponding disable on the error path.
Observed lockdep without this patch:
[ 0.989686] ======================================================
[ 0.989688] WARNING: possible circular locking dependency detected
[ 0.989690] 6.17.0-rc4-cix-build+ #31 Tainted: G S
[ 0.989691] ------------------------------------------------------
[ 0.989692] swapper/0/1 is trying to acquire lock:
[ 0.989693] ffff800082ada7f8 (sched_energy_mutex){+.+.}-{4:4}, at: rebuild_sched_domains_energy+0x30/0x58
[ 0.989705] but task is already holding lock:
[ 0.989706] ffff000088c89bc8 (&policy->rwsem){+.+.}-{4:4}, at: cpufreq_online+0x7f8/0xbe0
[ 0.989713] which lock already depends on the new lock.
Fixes: 2a6c727 ("cpufreq: Initialize cpufreq-based frequency-invariance later")
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull request for series with
subject: s390/bpf: Fix multiple tail calls
version: 1
url: https://patchwork.ozlabs.org/project/netdev/list/?series=200680