@kernel-patches-bot

Pull request for series with
subject: s390/bpf: Fix multiple tail calls
version: 1
url: https://patchwork.ozlabs.org/project/netdev/list/?series=200680

kernel-patches-bot and others added 2 commits September 11, 2020 13:17
exceeding tail call count or missing tail call target), JIT uses
label[0] field, which contains the address of the instruction following
the tail call. When there are multiple tail calls, label[0] value comes
from handling of a previous tail call, which is incorrect.

Fix by getting rid of label array and resolving the label address
locally: for all 3 branches that jump to it, emit 0 offsets at the
beginning, and then backpatch them with the correct value.

Also, do not use the long jump infrastructure: the tail call sequence
is known to be short, so make all 3 jumps short.

Fixes: 6651ee0 ("s390/bpf: implement bpf_tail_call() helper")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 arch/s390/net/bpf_jit_comp.c | 61 ++++++++++++++++--------------------
 1 file changed, 27 insertions(+), 34 deletions(-)
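
The backpatching approach described in this commit message can be sketched in plain C. This is only an illustration under stated assumptions: the emitter struct, the helper names, and the opcode bytes are hypothetical and do not correspond to real s390 encodings or to the code in arch/s390/net/bpf_jit_comp.c. Each of the 3 jumps is emitted with a 0 offset, and once the end of the sequence is known, all three are patched to point at it, so no state is shared between tail calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy emitter; it does not produce real s390 machine code. */
struct emitter {
	uint8_t buf[64];
	size_t len;
};

static void emit_byte(struct emitter *e, uint8_t b)
{
	assert(e->len < sizeof(e->buf));
	e->buf[e->len++] = b;
}

/*
 * Emit a made-up 2-byte short jump: opcode byte plus an 8-bit relative
 * offset that starts out as 0. Returns the position of the offset byte.
 */
static size_t emit_short_jump(struct emitter *e, uint8_t opcode)
{
	emit_byte(e, opcode);
	emit_byte(e, 0); /* placeholder, backpatched later */
	return e->len - 1;
}

/*
 * Point a previously emitted jump at the current end of the buffer.
 * The offset is relative to the byte that follows the offset field.
 */
static int patch_jump_to_here(struct emitter *e, size_t off_pos)
{
	size_t rel = e->len - (off_pos + 1);

	if (rel > INT8_MAX)
		return -1; /* would not fit into a short jump */
	e->buf[off_pos] = (uint8_t)rel;
	return 0;
}

/* Emit one tail call sequence; all 3 bail-out jumps are resolved locally. */
static int emit_tail_call(struct emitter *e)
{
	size_t patch[3];

	patch[0] = emit_short_jump(e, 0xA1); /* index out of bounds */
	emit_byte(e, 0x01);                  /* stand-in for more code */
	patch[1] = emit_short_jump(e, 0xA2); /* tail call count exceeded */
	emit_byte(e, 0x02);
	patch[2] = emit_short_jump(e, 0xA3); /* missing tail call target */
	emit_byte(e, 0x03);                  /* stand-in for the actual jump */

	for (size_t i = 0; i < 3; i++)
		if (patch_jump_to_here(e, patch[i]))
			return -1;
	return 0;
}
```

Calling emit_tail_call() twice on the same emitter gives each sequence jump offsets that land right after that sequence, which is the property the label[0] approach lost when there was more than one tail call.
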
@kernel-patches-bot

At least one diff in series https://patchwork.ozlabs.org/project/netdev/list/?series=200680 irrelevant now. Closing PR.

@kernel-patches-bot kernel-patches-bot deleted the series/200680 branch September 15, 2020 17:49
kernel-patches-bot pushed a commit that referenced this pull request Apr 23, 2021
Fix BPF_CORE_READ_BITFIELD() macro used for reading CO-RE-relocatable
bitfields. Missing breaks in a switch meant that every read fell through to
the 8-byte case. This can confuse libbpf because it does strict checks that
memory load size corresponds
to the original size of the field, which in this case quite often would be
wrong.
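
To make the size check concrete, here is a minimal sketch of how a load width can be derived from a BPF opcode byte and compared with a field size. It is not libbpf's implementation: the SKETCH_* macros and both function names are made up for this illustration. The size encoding itself (bits 3-4 of the opcode: W=0x00, H=0x08, B=0x10, DW=0x18) is the standard BPF one, and it matches the opcodes in the listings below (0x61 for u32, 0x69 for u16, 0x71 for u8, 0x79 for u64).

```c
#include <assert.h>
#include <stdint.h>

/* Size field of a BPF load/store opcode (bits 3-4). Names are local to this sketch. */
#define SKETCH_BPF_SIZE(code) ((code) & 0x18)
#define SKETCH_BPF_W  0x00 /* 4 bytes */
#define SKETCH_BPF_H  0x08 /* 2 bytes */
#define SKETCH_BPF_B  0x10 /* 1 byte  */
#define SKETCH_BPF_DW 0x18 /* 8 bytes */

/* Hypothetical helper: width in bytes of the memory access encoded by opcode. */
static int load_size_bytes(uint8_t opcode)
{
	switch (SKETCH_BPF_SIZE(opcode)) {
	case SKETCH_BPF_B:
		return 1;
	case SKETCH_BPF_H:
		return 2;
	case SKETCH_BPF_W:
		return 4;
	case SKETCH_BPF_DW:
		return 8;
	default:
		return -1; /* not reachable with a 2-bit size field */
	}
}

/* Hypothetical check: does the load width agree with the field's byte size? */
static int load_matches_field(uint8_t opcode, int field_size)
{
	return load_size_bytes(opcode) == field_size;
}
```

For a 4-byte field such as core_reloc_bitfields.u32, only the 0x61 load passes this kind of check; the 0x69, 0x71 and 0x79 loads do not.
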

After fixing that, we run into another problem, which is quite subtle, so worth
documenting here. The issue is in the interaction between Clang optimizations
and CO-RE relocations. Without that asm volatile construct (also known as
barrier_var()), Clang will re-order BYTE_OFFSET and BYTE_SIZE relocations and
will apply BYTE_OFFSET 4 times, once in each switch case arm. This will result
in the same error from libbpf about mismatch of memory load size and original
field size. I.e., if we were reading u32, we'd still have *(u8 *), *(u16 *),
*(u32 *), and *(u64 *) memory loads, three of which will fail. Using
barrier_var() forces Clang to apply BYTE_OFFSET relocation first (and once) to
calculate p, after which the value of p is used without relocation in each of
the switch case arms, doing an appropriately-sized memory load.
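
The shape of the fix can be sketched in host-side C. This is a simplified stand-in, not the BPF_CORE_READ_BITFIELD() macro itself: read_sized(), sketch_barrier_var() and struct sample are hypothetical names, the offset and size are ordinary arguments rather than CO-RE relocations, and memcpy() is used for the loads to keep the example well-defined on a host compiler. It assumes GCC or Clang for the inline asm. The two points it shows are the barrier applied to p right after it is computed, and a break in every case arm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Make the variable opaque to the optimizer (GCC/Clang extension). */
#define sketch_barrier_var(v) __asm__ volatile("" : "+r"(v))

/* Hypothetical struct used only to exercise the helper below. */
struct sample {
	uint8_t a;
	uint16_t b;
	uint32_t c;
	uint64_t d;
};

/*
 * Read byte_size bytes at base + byte_off and widen the result to 64 bits.
 * The pointer is computed once, hidden behind the barrier, and then used
 * as-is by whichever case arm matches.
 */
static uint64_t read_sized(const void *base, size_t byte_off, size_t byte_size)
{
	const unsigned char *p = (const unsigned char *)base + byte_off;
	uint64_t val = 0;

	sketch_barrier_var(p);

	switch (byte_size) {
	case 1: {
		uint8_t v;
		memcpy(&v, p, sizeof(v));
		val = v;
		break; /* without the breaks, every size would end in the 8-byte read */
	}
	case 2: {
		uint16_t v;
		memcpy(&v, p, sizeof(v));
		val = v;
		break;
	}
	case 4: {
		uint32_t v;
		memcpy(&v, p, sizeof(v));
		val = v;
		break;
	}
	case 8: {
		uint64_t v;
		memcpy(&v, p, sizeof(v));
		val = v;
		break;
	}
	}
	return val;
}
```

On the host the barrier has no visible effect on the result; its purpose in the BPF case is only to constrain how Clang places the relocated offset relative to the loads.
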

Here's the list of relevant relocations and pieces of generated BPF code
before and after this patch for test_core_reloc_bitfields_direct selftests.

BEFORE
=====
 #45: core_reloc: insn #160 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32
 #46: core_reloc: insn #167 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #47: core_reloc: insn #174 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #48: core_reloc: insn #178 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #49: core_reloc: insn #182 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32

     157:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll
     159:       7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1
     160:       b7 02 00 00 04 00 00 00 r2 = 4
; BYTE_SIZE relocation here                 ^^^
     161:       66 02 07 00 03 00 00 00 if w2 s> 3 goto +7 <LBB0_63>
     162:       16 02 0d 00 01 00 00 00 if w2 == 1 goto +13 <LBB0_65>
     163:       16 02 01 00 02 00 00 00 if w2 == 2 goto +1 <LBB0_66>
     164:       05 00 12 00 00 00 00 00 goto +18 <LBB0_69>

0000000000000528 <LBB0_66>:
     165:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     167:       69 11 08 00 00 00 00 00 r1 = *(u16 *)(r1 + 8)
; BYTE_OFFSET relo here w/ WRONG size        ^^^^^^^^^^^^^^^^
     168:       05 00 0e 00 00 00 00 00 goto +14 <LBB0_69>

0000000000000548 <LBB0_63>:
     169:       16 02 0a 00 04 00 00 00 if w2 == 4 goto +10 <LBB0_67>
     170:       16 02 01 00 08 00 00 00 if w2 == 8 goto +1 <LBB0_68>
     171:       05 00 0b 00 00 00 00 00 goto +11 <LBB0_69>

0000000000000560 <LBB0_68>:
     172:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     174:       79 11 08 00 00 00 00 00 r1 = *(u64 *)(r1 + 8)
; BYTE_OFFSET relo here w/ WRONG size        ^^^^^^^^^^^^^^^^
     175:       05 00 07 00 00 00 00 00 goto +7 <LBB0_69>

0000000000000580 <LBB0_65>:
     176:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     178:       71 11 08 00 00 00 00 00 r1 = *(u8 *)(r1 + 8)
; BYTE_OFFSET relo here w/ WRONG size        ^^^^^^^^^^^^^^^^
     179:       05 00 03 00 00 00 00 00 goto +3 <LBB0_69>

00000000000005a0 <LBB0_67>:
     180:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     182:       61 11 08 00 00 00 00 00 r1 = *(u32 *)(r1 + 8)
; BYTE_OFFSET relo here w/ RIGHT size        ^^^^^^^^^^^^^^^^

00000000000005b8 <LBB0_69>:
     183:       67 01 00 00 20 00 00 00 r1 <<= 32
     184:       b7 02 00 00 00 00 00 00 r2 = 0
     185:       16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71>
     186:       c7 01 00 00 20 00 00 00 r1 s>>= 32
     187:       05 00 01 00 00 00 00 00 goto +1 <LBB0_72>

00000000000005e0 <LBB0_71>:
     188:       77 01 00 00 20 00 00 00 r1 >>= 32

AFTER
=====

 #30: core_reloc: insn #132 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #31: core_reloc: insn #134 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32

     129:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll
     131:       7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1
     132:       b7 01 00 00 08 00 00 00 r1 = 8
; BYTE_OFFSET relo here                     ^^^
; no size check for non-memory dereferencing instructions
     133:       0f 12 00 00 00 00 00 00 r2 += r1
     134:       b7 03 00 00 04 00 00 00 r3 = 4
; BYTE_SIZE relocation here                 ^^^
     135:       66 03 05 00 03 00 00 00 if w3 s> 3 goto +5 <LBB0_63>
     136:       16 03 09 00 01 00 00 00 if w3 == 1 goto +9 <LBB0_65>
     137:       16 03 01 00 02 00 00 00 if w3 == 2 goto +1 <LBB0_66>
     138:       05 00 0a 00 00 00 00 00 goto +10 <LBB0_69>

0000000000000458 <LBB0_66>:
     139:       69 21 00 00 00 00 00 00 r1 = *(u16 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^
     140:       05 00 08 00 00 00 00 00 goto +8 <LBB0_69>

0000000000000468 <LBB0_63>:
     141:       16 03 06 00 04 00 00 00 if w3 == 4 goto +6 <LBB0_67>
     142:       16 03 01 00 08 00 00 00 if w3 == 8 goto +1 <LBB0_68>
     143:       05 00 05 00 00 00 00 00 goto +5 <LBB0_69>

0000000000000480 <LBB0_68>:
     144:       79 21 00 00 00 00 00 00 r1 = *(u64 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^
     145:       05 00 03 00 00 00 00 00 goto +3 <LBB0_69>

0000000000000490 <LBB0_65>:
     146:       71 21 00 00 00 00 00 00 r1 = *(u8 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^
     147:       05 00 01 00 00 00 00 00 goto +1 <LBB0_69>

00000000000004a0 <LBB0_67>:
     148:       61 21 00 00 00 00 00 00 r1 = *(u32 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^

00000000000004a8 <LBB0_69>:
     149:       67 01 00 00 20 00 00 00 r1 <<= 32
     150:       b7 02 00 00 00 00 00 00 r2 = 0
     151:       16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71>
     152:       c7 01 00 00 20 00 00 00 r1 s>>= 32
     153:       05 00 01 00 00 00 00 00 goto +1 <LBB0_72>

00000000000004d0 <LBB0_71>:
     154:       77 01 00 00 20 00 00 00 r1 >>= 32

Fixes: ee26dad ("libbpf: Add support for relocatable bitfields")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
kernel-patches-bot pushed a commit that referenced this pull request Apr 24, 2021
kernel-patches-bot pushed a commit that referenced this pull request Apr 26, 2021
kernel-patches-bot pushed a commit that referenced this pull request Apr 26, 2021
kernel-patches-bot pushed a commit that referenced this pull request Apr 26, 2021
Fix BPF_CORE_READ_BITFIELD() macro used for reading CO-RE-relocatable
bitfields. Missing breaks in a switch caused 8-byte reads always. This can
confuse libbpf because it does strict checks that memory load size corresponds
to the original size of the field, which in this case quite often would be
wrong.

After fixing that, we run into another problem, which quite subtle, so worth
documenting here. The issue is in Clang optimization and CO-RE relocation
interactions. Without that asm volatile construct (also known as
barrier_var()), Clang will re-order BYTE_OFFSET and BYTE_SIZE relocations and
will apply BYTE_OFFSET 4 times for each switch case arm. This will result in
the same error from libbpf about mismatch of memory load size and original
field size. I.e., if we were reading u32, we'd still have *(u8 *), *(u16 *),
*(u32 *), and *(u64 *) memory loads, three of which will fail. Using
barrier_var() forces Clang to apply BYTE_OFFSET relocation first (and once) to
calculate p, after which value of p is used without relocation in each of
switch case arms, doing appropiately-sized memory load.

Here's the list of relevant relocations and pieces of generated BPF code
before and after this patch for test_core_reloc_bitfields_direct selftests.

BEFORE
=====
 #45: core_reloc: insn #160 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32
 #46: core_reloc: insn #167 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #47: core_reloc: insn #174 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #48: core_reloc: insn #178 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #49: core_reloc: insn #182 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32

     157:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll
     159:       7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1
     160:       b7 02 00 00 04 00 00 00 r2 = 4
; BYTE_SIZE relocation here                 ^^^
     161:       66 02 07 00 03 00 00 00 if w2 s> 3 goto +7 <LBB0_63>
     162:       16 02 0d 00 01 00 00 00 if w2 == 1 goto +13 <LBB0_65>
     163:       16 02 01 00 02 00 00 00 if w2 == 2 goto +1 <LBB0_66>
     164:       05 00 12 00 00 00 00 00 goto +18 <LBB0_69>

0000000000000528 <LBB0_66>:
     165:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     167:       69 11 08 00 00 00 00 00 r1 = *(u16 *)(r1 + 8)
; BYTE_OFFSET relo here w/ WRONG size        ^^^^^^^^^^^^^^^^
     168:       05 00 0e 00 00 00 00 00 goto +14 <LBB0_69>

0000000000000548 <LBB0_63>:
     169:       16 02 0a 00 04 00 00 00 if w2 == 4 goto +10 <LBB0_67>
     170:       16 02 01 00 08 00 00 00 if w2 == 8 goto +1 <LBB0_68>
     171:       05 00 0b 00 00 00 00 00 goto +11 <LBB0_69>

0000000000000560 <LBB0_68>:
     172:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     174:       79 11 08 00 00 00 00 00 r1 = *(u64 *)(r1 + 8)
; BYTE_OFFSET relo here w/ WRONG size        ^^^^^^^^^^^^^^^^
     175:       05 00 07 00 00 00 00 00 goto +7 <LBB0_69>

0000000000000580 <LBB0_65>:
     176:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     178:       71 11 08 00 00 00 00 00 r1 = *(u8 *)(r1 + 8)
; BYTE_OFFSET relo here w/ WRONG size        ^^^^^^^^^^^^^^^^
     179:       05 00 03 00 00 00 00 00 goto +3 <LBB0_69>

00000000000005a0 <LBB0_67>:
     180:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     182:       61 11 08 00 00 00 00 00 r1 = *(u32 *)(r1 + 8)
; BYTE_OFFSET relo here w/ RIGHT size        ^^^^^^^^^^^^^^^^

00000000000005b8 <LBB0_69>:
     183:       67 01 00 00 20 00 00 00 r1 <<= 32
     184:       b7 02 00 00 00 00 00 00 r2 = 0
     185:       16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71>
     186:       c7 01 00 00 20 00 00 00 r1 s>>= 32
     187:       05 00 01 00 00 00 00 00 goto +1 <LBB0_72>

00000000000005e0 <LBB0_71>:
     188:       77 01 00 00 20 00 00 00 r1 >>= 32

AFTER
=====

 #30: core_reloc: insn #132 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #31: core_reloc: insn #134 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32

     129:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll
     131:       7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1
     132:       b7 01 00 00 08 00 00 00 r1 = 8
; BYTE_OFFSET relo here                     ^^^
; no size check for non-memory dereferencing instructions
     133:       0f 12 00 00 00 00 00 00 r2 += r1
     134:       b7 03 00 00 04 00 00 00 r3 = 4
; BYTE_SIZE relocation here                 ^^^
     135:       66 03 05 00 03 00 00 00 if w3 s> 3 goto +5 <LBB0_63>
     136:       16 03 09 00 01 00 00 00 if w3 == 1 goto +9 <LBB0_65>
     137:       16 03 01 00 02 00 00 00 if w3 == 2 goto +1 <LBB0_66>
     138:       05 00 0a 00 00 00 00 00 goto +10 <LBB0_69>

0000000000000458 <LBB0_66>:
     139:       69 21 00 00 00 00 00 00 r1 = *(u16 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^
     140:       05 00 08 00 00 00 00 00 goto +8 <LBB0_69>

0000000000000468 <LBB0_63>:
     141:       16 03 06 00 04 00 00 00 if w3 == 4 goto +6 <LBB0_67>
     142:       16 03 01 00 08 00 00 00 if w3 == 8 goto +1 <LBB0_68>
     143:       05 00 05 00 00 00 00 00 goto +5 <LBB0_69>

0000000000000480 <LBB0_68>:
     144:       79 21 00 00 00 00 00 00 r1 = *(u64 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^
     145:       05 00 03 00 00 00 00 00 goto +3 <LBB0_69>

0000000000000490 <LBB0_65>:
     146:       71 21 00 00 00 00 00 00 r1 = *(u8 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^
     147:       05 00 01 00 00 00 00 00 goto +1 <LBB0_69>

00000000000004a0 <LBB0_67>:
     148:       61 21 00 00 00 00 00 00 r1 = *(u32 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^

00000000000004a8 <LBB0_69>:
     149:       67 01 00 00 20 00 00 00 r1 <<= 32
     150:       b7 02 00 00 00 00 00 00 r2 = 0
     151:       16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71>
     152:       c7 01 00 00 20 00 00 00 r1 s>>= 32
     153:       05 00 01 00 00 00 00 00 goto +1 <LBB0_72>

00000000000004d0 <LBB0_71>:
     154:       77 01 00 00 20 00 00 00 r1 >>= 32
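
The AFTER listing boils down to a switch on the relocated byte size (1, 2, 4 or 8) that picks one plain load, followed by a left shift and then a signed or unsigned right shift. A minimal userspace sketch of that shape, assuming hypothetical names (read_bitfield and its parameters) and ordinary function arguments in place of CO-RE relocations; it is an illustration, not the libbpf macro itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model: byte_off, byte_sz, lshift, rshift and is_signed stand
 * in for values that CO-RE relocations would supply in a real BPF program.
 */
static uint64_t read_bitfield(const void *base, size_t byte_off,
			      unsigned int byte_sz, unsigned int lshift,
			      unsigned int rshift, int is_signed)
{
	const unsigned char *p = (const unsigned char *)base + byte_off;
	uint64_t val = 0;
	uint8_t v8;
	uint16_t v16;
	uint32_t v32;

	/* Every case ends in break, so exactly one load size is used. */
	switch (byte_sz) {
	case 1: memcpy(&v8, p, sizeof(v8)); val = v8; break;
	case 2: memcpy(&v16, p, sizeof(v16)); val = v16; break;
	case 4: memcpy(&v32, p, sizeof(v32)); val = v32; break;
	case 8: memcpy(&val, p, sizeof(val)); break;
	}

	val <<= lshift;
	if (is_signed)
		/* Assumes the compiler implements >> on signed values as an arithmetic shift. */
		return (uint64_t)((int64_t)val >> rshift);
	return val >> rshift;
}

/* Usage: a 4-byte field at byte offset 8, shifted by 32 as in the listing. */
static void demo(void)
{
	unsigned char buf[16] = {0};
	uint32_t field = 0xfffffffeu;

	memcpy(buf + 8, &field, sizeof(field));
	assert(read_bitfield(buf, 8, 4, 32, 32, 0) == 0xfffffffeull);
	assert((int64_t)read_bitfield(buf, 8, 4, 32, 32, 1) == -2);
}
```

Dropping any of the break statements would let control fall through to the 8-byte case, which is the kind of always-widest load the fix is meant to avoid.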

Acked-by: Lorenz Bauer <lmb@cloudflare.com>
Fixes: ee26dad ("libbpf: Add support for relocatable bitfields")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Asphaltt added a commit to Asphaltt/bpf that referenced this pull request Jul 23, 2024
Since commit 1c123c5 ("bpf: Resolve fext program type when
checking map compatibility"), freplace prog can be used as tail-callee.

However, when a freplace prog has been attached and is then added to a
PROG_ARRAY map, the kernel panics: the update checks the prog type of the
freplace prog via 'prog->aux->dst_prog->type', and 'prog->aux->dst_prog' of
an attached freplace prog is NULL.

[309049.036402] BUG: kernel NULL pointer dereference, address: 0000000000000004
[309049.036419] #PF: supervisor read access in kernel mode
[309049.036426] #PF: error_code(0x0000) - not-present page
[309049.036432] PGD 0 P4D 0
[309049.036437] Oops: 0000 [#1] PREEMPT SMP NOPTI
[309049.036444] CPU: 2 PID: 788148 Comm: test_progs Not tainted 6.8.0-31-generic #31-Ubuntu
[309049.036465] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023
[309049.036477] RIP: 0010:bpf_prog_map_compatible+0x2a/0x140
[309049.036488] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41 55 41 54 53 44 8b 6e 04 48 89 f3 41 83 fd 1c 75 0c 48 8b 46 38 48 8b 40 70 <44> 8b 68 04 f6 43 03 01 75 1c 48 8b 43 38 44 0f b6 a0 89 00 00 00
[309049.036505] RSP: 0018:ffffb2e080fd7ce0 EFLAGS: 00010246
[309049.036513] RAX: 0000000000000000 RBX: ffffb2e0807c1000 RCX: 0000000000000000
[309049.036521] RDX: 0000000000000000 RSI: ffffb2e0807c1000 RDI: ffff990290259e00
[309049.036528] RBP: ffffb2e080fd7d08 R08: 0000000000000000 R09: 0000000000000000
[309049.036536] R10: 0000000000000000 R11: 0000000000000000 R12: ffff990290259e00
[309049.036543] R13: 000000000000001c R14: ffff990290259e00 R15: ffff99028e29c400
[309049.036551] FS:  00007b82cbc28140(0000) GS:ffff9903b3f00000(0000) knlGS:0000000000000000
[309049.036559] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[309049.036566] CR2: 0000000000000004 CR3: 0000000101286002 CR4: 00000000003706f0
[309049.036573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[309049.036581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[309049.036588] Call Trace:
[309049.036592]  <TASK>
[309049.036597]  ? show_regs+0x6d/0x80
[309049.036604]  ? __die+0x24/0x80
[309049.036619]  ? page_fault_oops+0x99/0x1b0
[309049.036628]  ? do_user_addr_fault+0x2ee/0x6b0
[309049.036634]  ? exc_page_fault+0x83/0x1b0
[309049.036641]  ? asm_exc_page_fault+0x27/0x30
[309049.036649]  ? bpf_prog_map_compatible+0x2a/0x140
[309049.036656]  prog_fd_array_get_ptr+0x2c/0x70
[309049.036664]  bpf_fd_array_map_update_elem+0x37/0x130
[309049.036671]  bpf_map_update_value+0x1d3/0x260
[309049.036677]  map_update_elem+0x1fa/0x360
[309049.036683]  __sys_bpf+0x54c/0xa10
[309049.036689]  __x64_sys_bpf+0x1a/0x30
[309049.036694]  x64_sys_call+0x1936/0x25c0
[309049.036700]  do_syscall_64+0x7f/0x180
[309049.036706]  ? do_syscall_64+0x8c/0x180
[309049.036712]  ? do_syscall_64+0x8c/0x180
[309049.036717]  ? irqentry_exit+0x43/0x50
[309049.036723]  ? common_interrupt+0x54/0xb0
[309049.036729]  entry_SYSCALL_64_after_hwframe+0x73/0x7b

Why is 'prog->aux->dst_prog' of the freplace prog NULL? This is caused by
commit 3aac1ea ("bpf: Move prog->aux->linked_prog and trampoline into
bpf_link on attach").

Because 'prog->aux->dst_prog' of a freplace prog is set to NULL on attach,
the freplace prog has no stable dst_prog type. But adding the freplace prog
to a PROG_ARRAY map requires checking its prog type. These two requirements
conflict.

This patch resolves the prog type of a freplace prog from
'prog->aux->saved_dst_prog_type' to avoid the panic.
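
The idea can be sketched in plain C. This is a simplified userspace model, not the kernel code: the struct layouts are reduced to the fields named above, resolve_type() and demo() are hypothetical names, and the enum values are only meant to mirror the UAPI numbering (28 is 0x1c).

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative subset of program types; names are stand-ins for the kernel enum. */
enum prog_type {
	PROG_TYPE_UNSPEC = 0,
	PROG_TYPE_SCHED_CLS = 3,
	PROG_TYPE_EXT = 28,
};

struct prog;

struct prog_aux {
	struct prog *dst_prog;			/* NULL once the freplace prog is attached */
	enum prog_type saved_dst_prog_type;	/* still set after attach */
};

struct prog {
	enum prog_type type;
	struct prog_aux *aux;
};

/* Resolve the effective type without touching dst_prog at all. */
static enum prog_type resolve_type(const struct prog *p)
{
	if (p->type == PROG_TYPE_EXT &&
	    p->aux->saved_dst_prog_type != PROG_TYPE_UNSPEC)
		return p->aux->saved_dst_prog_type;
	return p->type;
}

/* Usage: an attached freplace prog whose dst_prog has been cleared. */
static void demo(void)
{
	struct prog_aux aux = {
		.dst_prog = NULL,
		.saved_dst_prog_type = PROG_TYPE_SCHED_CLS,
	};
	struct prog freplace = { .type = PROG_TYPE_EXT, .aux = &aux };

	assert(freplace.aux->dst_prog == NULL);
	assert(resolve_type(&freplace) == PROG_TYPE_SCHED_CLS);
}
```

Because the saved type is a plain value rather than a pointer to another program, reading it cannot fault in the way 'prog->aux->dst_prog->type' does when dst_prog is NULL.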

Fixes: 1c123c5 ("bpf: Resolve fext program type when checking map compatibility")
Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
Asphaltt added a commit to Asphaltt/bpf that referenced this pull request Jul 24, 2024
The commit f7866c3 ("bpf: Fix null pointer dereference in
resolve_prog_type() for BPF_PROG_TYPE_EXT") fixed the following panic,
which was caused by updating attached freplace prog to PROG_ARRAY map.

But, it does not support updating attached freplace prog to PROG_ARRAY
map.

[309049.036402] BUG: kernel NULL pointer dereference, address: 0000000000000004
[309049.036419] #PF: supervisor read access in kernel mode
[309049.036426] #PF: error_code(0x0000) - not-present page
[309049.036432] PGD 0 P4D 0
[309049.036437] Oops: 0000 [#1] PREEMPT SMP NOPTI
[309049.036444] CPU: 2 PID: 788148 Comm: test_progs Not tainted 6.8.0-31-generic #31-Ubuntu
[309049.036465] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023
[309049.036477] RIP: 0010:bpf_prog_map_compatible+0x2a/0x140
[309049.036488] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41 55 41 54 53 44 8b 6e 04 48 89 f3 41 83 fd 1c 75 0c 48 8b 46 38 48 8b 40 70 <44> 8b 68 04 f6 43 03 01 75 1c 48 8b 43 38 44 0f b6 a0 89 00 00 00
[309049.036505] RSP: 0018:ffffb2e080fd7ce0 EFLAGS: 00010246
[309049.036513] RAX: 0000000000000000 RBX: ffffb2e0807c1000 RCX: 0000000000000000
[309049.036521] RDX: 0000000000000000 RSI: ffffb2e0807c1000 RDI: ffff990290259e00
[309049.036528] RBP: ffffb2e080fd7d08 R08: 0000000000000000 R09: 0000000000000000
[309049.036536] R10: 0000000000000000 R11: 0000000000000000 R12: ffff990290259e00
[309049.036543] R13: 000000000000001c R14: ffff990290259e00 R15: ffff99028e29c400
[309049.036551] FS:  00007b82cbc28140(0000) GS:ffff9903b3f00000(0000) knlGS:0000000000000000
[309049.036559] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[309049.036566] CR2: 0000000000000004 CR3: 0000000101286002 CR4: 00000000003706f0
[309049.036573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[309049.036581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[309049.036588] Call Trace:
[309049.036592]  <TASK>
[309049.036597]  ? show_regs+0x6d/0x80
[309049.036604]  ? __die+0x24/0x80
[309049.036619]  ? page_fault_oops+0x99/0x1b0
[309049.036628]  ? do_user_addr_fault+0x2ee/0x6b0
[309049.036634]  ? exc_page_fault+0x83/0x1b0
[309049.036641]  ? asm_exc_page_fault+0x27/0x30
[309049.036649]  ? bpf_prog_map_compatible+0x2a/0x140
[309049.036656]  prog_fd_array_get_ptr+0x2c/0x70
[309049.036664]  bpf_fd_array_map_update_elem+0x37/0x130
[309049.036671]  bpf_map_update_value+0x1d3/0x260
[309049.036677]  map_update_elem+0x1fa/0x360
[309049.036683]  __sys_bpf+0x54c/0xa10
[309049.036689]  __x64_sys_bpf+0x1a/0x30
[309049.036694]  x64_sys_call+0x1936/0x25c0
[309049.036700]  do_syscall_64+0x7f/0x180
[309049.036706]  ? do_syscall_64+0x8c/0x180
[309049.036712]  ? do_syscall_64+0x8c/0x180
[309049.036717]  ? irqentry_exit+0x43/0x50
[309049.036723]  ? common_interrupt+0x54/0xb0
[309049.036729]  entry_SYSCALL_64_after_hwframe+0x73/0x7b

Since commit 1c123c5 ("bpf: Resolve fext program type when
checking map compatibility"), a freplace prog can be used as a tail-callee
of its target prog.
Commit 3aac1ea ("bpf: Move prog->aux->linked_prog and
trampoline into bpf_link on attach") sets prog->aux->dst_prog to NULL
when the freplace prog is attached to its target.

Consider the following example:

tailcall_freplace.c:

// SPDX-License-Identifier: GPL-2.0

\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
\#include "bpf_legacy.h"

struct {
	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
	__uint(max_entries, 1);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");

int count = 0;

__noinline int
subprog(struct __sk_buff *skb)
{
	volatile int ret = 1;

	count++;

	bpf_tail_call_static(skb, &jmp_table, 0);

	return ret;
}

SEC("freplace")
int entry(struct __sk_buff *skb)
{
	return subprog(skb);
}

char __license[] SEC("license") = "GPL";

tc_bpf2bpf.c:

// SPDX-License-Identifier: GPL-2.0

\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
\#include "bpf_legacy.h"

__noinline int
subprog(struct __sk_buff *skb)
{
	volatile int ret = 1;

	return ret;
}

SEC("tc")
int entry(struct __sk_buff *skb)
{
	return subprog(skb);
}

char __license[] SEC("license") = "GPL";

The freplace entry prog's target is the tc subprog.

After loading, the owner type of the freplace prog's jmp_table is
BPF_PROG_TYPE_SCHED_CLS.

After the freplace prog is attached to the tc subprog, its
prog->aux->dst_prog is NULL.

When the freplace prog is then added to jmp_table, bpf_prog_map_compatible()
returns false because resolve_prog_type() returns BPF_PROG_TYPE_EXT
instead of BPF_PROG_TYPE_SCHED_CLS.

With this patch, resolve_prog_type() returns BPF_PROG_TYPE_SCHED_CLS, which
allows an attached freplace prog to be added to the PROG_ARRAY map in this
example.
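
The load, attach, update sequence can be modelled in a few lines of C. This is a hedged userspace sketch: struct map_owner, map_compatible() and demo() are hypothetical stand-ins, and the real bpf_prog_map_compatible() compares more than the program type.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative subset of program types; names are stand-ins for the kernel enum. */
enum prog_type {
	PROG_TYPE_SCHED_CLS = 3,
	PROG_TYPE_EXT = 28,
};

/* Hypothetical model of the owner record kept by a PROG_ARRAY map. */
struct map_owner {
	bool set;
	enum prog_type type;
};

/* The first user of the map fixes the owner type; later users must match it. */
static bool map_compatible(struct map_owner *owner, enum prog_type resolved)
{
	if (!owner->set) {
		owner->set = true;
		owner->type = resolved;
		return true;
	}
	return owner->type == resolved;
}

static void demo(void)
{
	struct map_owner jmp_table = { .set = false };

	/* Load: the freplace prog resolves to its target's type, which becomes the owner type. */
	assert(map_compatible(&jmp_table, PROG_TYPE_SCHED_CLS));

	/* Update after attach, type resolved as EXT: mismatch, update rejected. */
	assert(!map_compatible(&jmp_table, PROG_TYPE_EXT));

	/* Update after attach, type resolved as SCHED_CLS: accepted. */
	assert(map_compatible(&jmp_table, PROG_TYPE_SCHED_CLS));
}
```

The model makes the dependency explicit: whether the update succeeds is decided entirely by what the type-resolution step returns for the attached freplace prog.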

Fixes: f7866c3 ("bpf: Fix null pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT")
Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
Asphaltt added a commit to Asphaltt/bpf that referenced this pull request Jul 24, 2024
The commit f7866c3 ("bpf: Fix null pointer dereference in
resolve_prog_type() for BPF_PROG_TYPE_EXT") fixed the following panic,
which was caused by updating attached freplace prog to PROG_ARRAY map.

But, it does not support updating attached freplace prog to PROG_ARRAY
map.

[309049.036402] BUG: kernel NULL pointer dereference, address: 0000000000000004
[309049.036419] #PF: supervisor read access in kernel mode
[309049.036426] #PF: error_code(0x0000) - not-present page
[309049.036432] PGD 0 P4D 0
[309049.036437] Oops: 0000 [#1] PREEMPT SMP NOPTI
[309049.036444] CPU: 2 PID: 788148 Comm: test_progs Not tainted 6.8.0-31-generic #31-Ubuntu
[309049.036465] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023
[309049.036477] RIP: 0010:bpf_prog_map_compatible+0x2a/0x140
[309049.036488] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41 55 41 54 53 44 8b 6e 04 48 89 f3 41 83 fd 1c 75 0c 48 8b 46 38 48 8b 40 70 <44> 8b 68 04 f6 43 03 01 75 1c 48 8b 43 38 44 0f b6 a0 89 00 00 00
[309049.036505] RSP: 0018:ffffb2e080fd7ce0 EFLAGS: 00010246
[309049.036513] RAX: 0000000000000000 RBX: ffffb2e0807c1000 RCX: 0000000000000000
[309049.036521] RDX: 0000000000000000 RSI: ffffb2e0807c1000 RDI: ffff990290259e00
[309049.036528] RBP: ffffb2e080fd7d08 R08: 0000000000000000 R09: 0000000000000000
[309049.036536] R10: 0000000000000000 R11: 0000000000000000 R12: ffff990290259e00
[309049.036543] R13: 000000000000001c R14: ffff990290259e00 R15: ffff99028e29c400
[309049.036551] FS:  00007b82cbc28140(0000) GS:ffff9903b3f00000(0000) knlGS:0000000000000000
[309049.036559] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[309049.036566] CR2: 0000000000000004 CR3: 0000000101286002 CR4: 00000000003706f0
[309049.036573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[309049.036581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[309049.036588] Call Trace:
[309049.036592]  <TASK>
[309049.036597]  ? show_regs+0x6d/0x80
[309049.036604]  ? __die+0x24/0x80
[309049.036619]  ? page_fault_oops+0x99/0x1b0
[309049.036628]  ? do_user_addr_fault+0x2ee/0x6b0
[309049.036634]  ? exc_page_fault+0x83/0x1b0
[309049.036641]  ? asm_exc_page_fault+0x27/0x30
[309049.036649]  ? bpf_prog_map_compatible+0x2a/0x140
[309049.036656]  prog_fd_array_get_ptr+0x2c/0x70
[309049.036664]  bpf_fd_array_map_update_elem+0x37/0x130
[309049.036671]  bpf_map_update_value+0x1d3/0x260
[309049.036677]  map_update_elem+0x1fa/0x360
[309049.036683]  __sys_bpf+0x54c/0xa10
[309049.036689]  __x64_sys_bpf+0x1a/0x30
[309049.036694]  x64_sys_call+0x1936/0x25c0
[309049.036700]  do_syscall_64+0x7f/0x180
[309049.036706]  ? do_syscall_64+0x8c/0x180
[309049.036712]  ? do_syscall_64+0x8c/0x180
[309049.036717]  ? irqentry_exit+0x43/0x50
[309049.036723]  ? common_interrupt+0x54/0xb0
[309049.036729]  entry_SYSCALL_64_after_hwframe+0x73/0x7b

Since commit 1c123c5 ("bpf: Resolve fext program type when
checking map compatibility"), a freplace prog can be used as a tail-callee
of its target prog.
Commit 3aac1ea ("bpf: Move prog->aux->linked_prog and
trampoline into bpf_link on attach") sets prog->aux->dst_prog to NULL
when the freplace prog is attached to its target.

Consider the following example:

tailcall_freplace.c:

// SPDX-License-Identifier: GPL-2.0

\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
\#include "bpf_legacy.h"

struct {
	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
	__uint(max_entries, 1);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");

int count = 0;

__noinline int
subprog(struct __sk_buff *skb)
{
	volatile int ret = 1;

	count++;

	bpf_tail_call_static(skb, &jmp_table, 0);

	return ret;
}

SEC("freplace")
int entry(struct __sk_buff *skb)
{
	return subprog(skb);
}

char __license[] SEC("license") = "GPL";

tc_bpf2bpf.c:

// SPDX-License-Identifier: GPL-2.0

\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
\#include "bpf_legacy.h"

__noinline int
subprog(struct __sk_buff *skb)
{
	volatile int ret = 1;

	return ret;
}

SEC("tc")
int entry(struct __sk_buff *skb)
{
	return subprog(skb);
}

char __license[] SEC("license") = "GPL";

The freplace entry prog's target is the tc subprog.

After loading, the owner type of the freplace prog's jmp_table is
BPF_PROG_TYPE_SCHED_CLS.

After the freplace prog is attached to the tc subprog, its
prog->aux->dst_prog is NULL.

When the freplace prog is then added to jmp_table, bpf_prog_map_compatible()
returns false because resolve_prog_type() returns BPF_PROG_TYPE_EXT
instead of BPF_PROG_TYPE_SCHED_CLS.

With this patch, resolve_prog_type() returns BPF_PROG_TYPE_SCHED_CLS, which
allows an attached freplace prog to be added to the PROG_ARRAY map in this
example.

Fixes: f7866c3 ("bpf: Fix null pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT")
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
kernel-patches-daemon-bpf bot pushed a commit that referenced this pull request Jul 25, 2024
The commit f7866c3 ("bpf: Fix null pointer dereference in
resolve_prog_type() for BPF_PROG_TYPE_EXT") fixed the following panic,
which was caused by updating attached freplace prog to PROG_ARRAY map.

But, it does not support updating attached freplace prog to PROG_ARRAY
map.

[309049.036402] BUG: kernel NULL pointer dereference, address: 0000000000000004
[309049.036419] #PF: supervisor read access in kernel mode
[309049.036426] #PF: error_code(0x0000) - not-present page
[309049.036432] PGD 0 P4D 0
[309049.036437] Oops: 0000 [#1] PREEMPT SMP NOPTI
[309049.036444] CPU: 2 PID: 788148 Comm: test_progs Not tainted 6.8.0-31-generic #31-Ubuntu
[309049.036465] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023
[309049.036477] RIP: 0010:bpf_prog_map_compatible+0x2a/0x140
[309049.036488] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41 55 41 54 53 44 8b 6e 04 48 89 f3 41 83 fd 1c 75 0c 48 8b 46 38 48 8b 40 70 <44> 8b 68 04 f6 43 03 01 75 1c 48 8b 43 38 44 0f b6 a0 89 00 00 00
[309049.036505] RSP: 0018:ffffb2e080fd7ce0 EFLAGS: 00010246
[309049.036513] RAX: 0000000000000000 RBX: ffffb2e0807c1000 RCX: 0000000000000000
[309049.036521] RDX: 0000000000000000 RSI: ffffb2e0807c1000 RDI: ffff990290259e00
[309049.036528] RBP: ffffb2e080fd7d08 R08: 0000000000000000 R09: 0000000000000000
[309049.036536] R10: 0000000000000000 R11: 0000000000000000 R12: ffff990290259e00
[309049.036543] R13: 000000000000001c R14: ffff990290259e00 R15: ffff99028e29c400
[309049.036551] FS:  00007b82cbc28140(0000) GS:ffff9903b3f00000(0000) knlGS:0000000000000000
[309049.036559] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[309049.036566] CR2: 0000000000000004 CR3: 0000000101286002 CR4: 00000000003706f0
[309049.036573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[309049.036581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[309049.036588] Call Trace:
[309049.036592]  <TASK>
[309049.036597]  ? show_regs+0x6d/0x80
[309049.036604]  ? __die+0x24/0x80
[309049.036619]  ? page_fault_oops+0x99/0x1b0
[309049.036628]  ? do_user_addr_fault+0x2ee/0x6b0
[309049.036634]  ? exc_page_fault+0x83/0x1b0
[309049.036641]  ? asm_exc_page_fault+0x27/0x30
[309049.036649]  ? bpf_prog_map_compatible+0x2a/0x140
[309049.036656]  prog_fd_array_get_ptr+0x2c/0x70
[309049.036664]  bpf_fd_array_map_update_elem+0x37/0x130
[309049.036671]  bpf_map_update_value+0x1d3/0x260
[309049.036677]  map_update_elem+0x1fa/0x360
[309049.036683]  __sys_bpf+0x54c/0xa10
[309049.036689]  __x64_sys_bpf+0x1a/0x30
[309049.036694]  x64_sys_call+0x1936/0x25c0
[309049.036700]  do_syscall_64+0x7f/0x180
[309049.036706]  ? do_syscall_64+0x8c/0x180
[309049.036712]  ? do_syscall_64+0x8c/0x180
[309049.036717]  ? irqentry_exit+0x43/0x50
[309049.036723]  ? common_interrupt+0x54/0xb0
[309049.036729]  entry_SYSCALL_64_after_hwframe+0x73/0x7b

Since commit 1c123c5 ("bpf: Resolve fext program type when
checking map compatibility"), a freplace prog can be used as a tail-callee
of its target prog.
Commit 3aac1ea ("bpf: Move prog->aux->linked_prog and
trampoline into bpf_link on attach") sets prog->aux->dst_prog to NULL
when the freplace prog is attached to its target.

Consider the following example:

tailcall_freplace.c:

// SPDX-License-Identifier: GPL-2.0

\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
\#include "bpf_legacy.h"

struct {
	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
	__uint(max_entries, 1);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");

int count = 0;

__noinline int
subprog(struct __sk_buff *skb)
{
	volatile int ret = 1;

	count++;

	bpf_tail_call_static(skb, &jmp_table, 0);

	return ret;
}

SEC("freplace")
int entry(struct __sk_buff *skb)
{
	return subprog(skb);
}

char __license[] SEC("license") = "GPL";

tc_bpf2bpf.c:

// SPDX-License-Identifier: GPL-2.0

\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
\#include "bpf_legacy.h"

__noinline int
subprog(struct __sk_buff *skb)
{
	volatile int ret = 1;

	return ret;
}

SEC("tc")
int entry(struct __sk_buff *skb)
{
	return subprog(skb);
}

char __license[] SEC("license") = "GPL";

The freplace entry prog's target is the tc subprog.

After loading, the owner type of the freplace prog's jmp_table is
BPF_PROG_TYPE_SCHED_CLS.

After the freplace prog is attached to the tc subprog, its
prog->aux->dst_prog is NULL.

When the freplace prog is then added to jmp_table, bpf_prog_map_compatible()
returns false because resolve_prog_type() returns BPF_PROG_TYPE_EXT
instead of BPF_PROG_TYPE_SCHED_CLS.

With this patch, resolve_prog_type() returns BPF_PROG_TYPE_SCHED_CLS, which
allows an attached freplace prog to be added to the PROG_ARRAY map in this
example.

Fixes: f7866c3 ("bpf: Fix null pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT")
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
kernel-patches-daemon-bpf bot pushed a commit that referenced this pull request Jul 25, 2024
The commit f7866c3 ("bpf: Fix null pointer dereference in
resolve_prog_type() for BPF_PROG_TYPE_EXT") fixed the following panic,
which was caused by updating attached freplace prog to PROG_ARRAY map.

But, it does not support updating attached freplace prog to PROG_ARRAY
map.

[309049.036402] BUG: kernel NULL pointer dereference, address: 0000000000000004
[309049.036419] #PF: supervisor read access in kernel mode
[309049.036426] #PF: error_code(0x0000) - not-present page
[309049.036432] PGD 0 P4D 0
[309049.036437] Oops: 0000 [#1] PREEMPT SMP NOPTI
[309049.036444] CPU: 2 PID: 788148 Comm: test_progs Not tainted 6.8.0-31-generic #31-Ubuntu
[309049.036465] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023
[309049.036477] RIP: 0010:bpf_prog_map_compatible+0x2a/0x140
[309049.036488] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41 55 41 54 53 44 8b 6e 04 48 89 f3 41 83 fd 1c 75 0c 48 8b 46 38 48 8b 40 70 <44> 8b 68 04 f6 43 03 01 75 1c 48 8b 43 38 44 0f b6 a0 89 00 00 00
[309049.036505] RSP: 0018:ffffb2e080fd7ce0 EFLAGS: 00010246
[309049.036513] RAX: 0000000000000000 RBX: ffffb2e0807c1000 RCX: 0000000000000000
[309049.036521] RDX: 0000000000000000 RSI: ffffb2e0807c1000 RDI: ffff990290259e00
[309049.036528] RBP: ffffb2e080fd7d08 R08: 0000000000000000 R09: 0000000000000000
[309049.036536] R10: 0000000000000000 R11: 0000000000000000 R12: ffff990290259e00
[309049.036543] R13: 000000000000001c R14: ffff990290259e00 R15: ffff99028e29c400
[309049.036551] FS:  00007b82cbc28140(0000) GS:ffff9903b3f00000(0000) knlGS:0000000000000000
[309049.036559] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[309049.036566] CR2: 0000000000000004 CR3: 0000000101286002 CR4: 00000000003706f0
[309049.036573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[309049.036581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[309049.036588] Call Trace:
[309049.036592]  <TASK>
[309049.036597]  ? show_regs+0x6d/0x80
[309049.036604]  ? __die+0x24/0x80
[309049.036619]  ? page_fault_oops+0x99/0x1b0
[309049.036628]  ? do_user_addr_fault+0x2ee/0x6b0
[309049.036634]  ? exc_page_fault+0x83/0x1b0
[309049.036641]  ? asm_exc_page_fault+0x27/0x30
[309049.036649]  ? bpf_prog_map_compatible+0x2a/0x140
[309049.036656]  prog_fd_array_get_ptr+0x2c/0x70
[309049.036664]  bpf_fd_array_map_update_elem+0x37/0x130
[309049.036671]  bpf_map_update_value+0x1d3/0x260
[309049.036677]  map_update_elem+0x1fa/0x360
[309049.036683]  __sys_bpf+0x54c/0xa10
[309049.036689]  __x64_sys_bpf+0x1a/0x30
[309049.036694]  x64_sys_call+0x1936/0x25c0
[309049.036700]  do_syscall_64+0x7f/0x180
[309049.036706]  ? do_syscall_64+0x8c/0x180
[309049.036712]  ? do_syscall_64+0x8c/0x180
[309049.036717]  ? irqentry_exit+0x43/0x50
[309049.036723]  ? common_interrupt+0x54/0xb0
[309049.036729]  entry_SYSCALL_64_after_hwframe+0x73/0x7b

Since commit 1c123c5 ("bpf: Resolve fext program type when
checking map compatibility"), a freplace prog can be used as a tail-callee
of its target prog.
Commit 3aac1ea ("bpf: Move prog->aux->linked_prog and
trampoline into bpf_link on attach") sets prog->aux->dst_prog to NULL
when the freplace prog is attached to its target.

Consider the following example:

tailcall_freplace.c:

// SPDX-License-Identifier: GPL-2.0

\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
\#include "bpf_legacy.h"

struct {
	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
	__uint(max_entries, 1);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");

int count = 0;

__noinline int
subprog(struct __sk_buff *skb)
{
	volatile int ret = 1;

	count++;

	bpf_tail_call_static(skb, &jmp_table, 0);

	return ret;
}

SEC("freplace")
int entry(struct __sk_buff *skb)
{
	return subprog(skb);
}

char __license[] SEC("license") = "GPL";

tc_bpf2bpf.c:

// SPDX-License-Identifier: GPL-2.0

\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
\#include "bpf_legacy.h"

__noinline int
subprog(struct __sk_buff *skb)
{
	volatile int ret = 1;

	return ret;
}

SEC("tc")
int entry(struct __sk_buff *skb)
{
	return subprog(skb);
}

char __license[] SEC("license") = "GPL";

The freplace entry prog's target is the tc subprog.

After loading, the owner type of the freplace prog's jmp_table is
BPF_PROG_TYPE_SCHED_CLS.

After the freplace prog is attached to the tc subprog, its
prog->aux->dst_prog is NULL.

When the freplace prog is then added to jmp_table, bpf_prog_map_compatible()
returns false because resolve_prog_type() returns BPF_PROG_TYPE_EXT
instead of BPF_PROG_TYPE_SCHED_CLS.

With this patch, resolve_prog_type() returns BPF_PROG_TYPE_SCHED_CLS, which
allows an attached freplace prog to be added to the PROG_ARRAY map in this
example.

Fixes: f7866c3 ("bpf: Fix null pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT")
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
kernel-patches-daemon-bpf bot pushed a commit that referenced this pull request Jul 25, 2024
The commit f7866c3 ("bpf: Fix null pointer dereference in
resolve_prog_type() for BPF_PROG_TYPE_EXT") fixed the following panic,
which was caused by updating attached freplace prog to PROG_ARRAY map.

But, it does not support updating attached freplace prog to PROG_ARRAY
map.

[309049.036402] BUG: kernel NULL pointer dereference, address: 0000000000000004
[309049.036419] #PF: supervisor read access in kernel mode
[309049.036426] #PF: error_code(0x0000) - not-present page
[309049.036432] PGD 0 P4D 0
[309049.036437] Oops: 0000 [#1] PREEMPT SMP NOPTI
[309049.036444] CPU: 2 PID: 788148 Comm: test_progs Not tainted 6.8.0-31-generic #31-Ubuntu
[309049.036465] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023
[309049.036477] RIP: 0010:bpf_prog_map_compatible+0x2a/0x140
[309049.036488] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41 55 41 54 53 44 8b 6e 04 48 89 f3 41 83 fd 1c 75 0c 48 8b 46 38 48 8b 40 70 <44> 8b 68 04 f6 43 03 01 75 1c 48 8b 43 38 44 0f b6 a0 89 00 00 00
[309049.036505] RSP: 0018:ffffb2e080fd7ce0 EFLAGS: 00010246
[309049.036513] RAX: 0000000000000000 RBX: ffffb2e0807c1000 RCX: 0000000000000000
[309049.036521] RDX: 0000000000000000 RSI: ffffb2e0807c1000 RDI: ffff990290259e00
[309049.036528] RBP: ffffb2e080fd7d08 R08: 0000000000000000 R09: 0000000000000000
[309049.036536] R10: 0000000000000000 R11: 0000000000000000 R12: ffff990290259e00
[309049.036543] R13: 000000000000001c R14: ffff990290259e00 R15: ffff99028e29c400
[309049.036551] FS:  00007b82cbc28140(0000) GS:ffff9903b3f00000(0000) knlGS:0000000000000000
[309049.036559] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[309049.036566] CR2: 0000000000000004 CR3: 0000000101286002 CR4: 00000000003706f0
[309049.036573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[309049.036581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[309049.036588] Call Trace:
[309049.036592]  <TASK>
[309049.036597]  ? show_regs+0x6d/0x80
[309049.036604]  ? __die+0x24/0x80
[309049.036619]  ? page_fault_oops+0x99/0x1b0
[309049.036628]  ? do_user_addr_fault+0x2ee/0x6b0
[309049.036634]  ? exc_page_fault+0x83/0x1b0
[309049.036641]  ? asm_exc_page_fault+0x27/0x30
[309049.036649]  ? bpf_prog_map_compatible+0x2a/0x140
[309049.036656]  prog_fd_array_get_ptr+0x2c/0x70
[309049.036664]  bpf_fd_array_map_update_elem+0x37/0x130
[309049.036671]  bpf_map_update_value+0x1d3/0x260
[309049.036677]  map_update_elem+0x1fa/0x360
[309049.036683]  __sys_bpf+0x54c/0xa10
[309049.036689]  __x64_sys_bpf+0x1a/0x30
[309049.036694]  x64_sys_call+0x1936/0x25c0
[309049.036700]  do_syscall_64+0x7f/0x180
[309049.036706]  ? do_syscall_64+0x8c/0x180
[309049.036712]  ? do_syscall_64+0x8c/0x180
[309049.036717]  ? irqentry_exit+0x43/0x50
[309049.036723]  ? common_interrupt+0x54/0xb0
[309049.036729]  entry_SYSCALL_64_after_hwframe+0x73/0x7b

Since commit 1c123c5 ("bpf: Resolve fext program type when
checking map compatibility"), a freplace prog can be used as a tail-callee
of its target prog.
Commit 3aac1ea ("bpf: Move prog->aux->linked_prog and
trampoline into bpf_link on attach") sets prog->aux->dst_prog to NULL
when the freplace prog is attached to its target.

Consider the following example:

tailcall_freplace.c:

// SPDX-License-Identifier: GPL-2.0

\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
\#include "bpf_legacy.h"

struct {
	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
	__uint(max_entries, 1);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");

int count = 0;

__noinline int
subprog(struct __sk_buff *skb)
{
	volatile int ret = 1;

	count++;

	bpf_tail_call_static(skb, &jmp_table, 0);

	return ret;
}

SEC("freplace")
int entry(struct __sk_buff *skb)
{
	return subprog(skb);
}

char __license[] SEC("license") = "GPL";

tc_bpf2bpf.c:

// SPDX-License-Identifier: GPL-2.0

\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
\#include "bpf_legacy.h"

__noinline int
subprog(struct __sk_buff *skb)
{
	volatile int ret = 1;

	return ret;
}

SEC("tc")
int entry(struct __sk_buff *skb)
{
	return subprog(skb);
}

char __license[] SEC("license") = "GPL";

The freplace entry prog's target is the tc subprog.

After loading, the owner type of the freplace prog's jmp_table is
BPF_PROG_TYPE_SCHED_CLS.

After the freplace prog is attached to the tc subprog, its
prog->aux->dst_prog is NULL.

When the freplace prog is then added to jmp_table, bpf_prog_map_compatible()
returns false because resolve_prog_type() returns BPF_PROG_TYPE_EXT
instead of BPF_PROG_TYPE_SCHED_CLS.

With this patch, resolve_prog_type() returns BPF_PROG_TYPE_SCHED_CLS, which
allows an attached freplace prog to be added to the PROG_ARRAY map in this
example.

Fixes: f7866c3 ("bpf: Fix null pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT")
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
kernel-patches-daemon-bpf bot pushed a commit that referenced this pull request Jul 29, 2024
The commit f7866c3 ("bpf: Fix null pointer dereference in
resolve_prog_type() for BPF_PROG_TYPE_EXT") fixed the following panic,
which was caused by updating attached freplace prog to PROG_ARRAY map.

But, it does not support updating attached freplace prog to PROG_ARRAY
map.

[309049.036402] BUG: kernel NULL pointer dereference, address: 0000000000000004
[309049.036419] #PF: supervisor read access in kernel mode
[309049.036426] #PF: error_code(0x0000) - not-present page
[309049.036432] PGD 0 P4D 0
[309049.036437] Oops: 0000 [#1] PREEMPT SMP NOPTI
[309049.036444] CPU: 2 PID: 788148 Comm: test_progs Not tainted 6.8.0-31-generic #31-Ubuntu
[309049.036465] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023
[309049.036477] RIP: 0010:bpf_prog_map_compatible+0x2a/0x140
[309049.036488] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41 55 41 54 53 44 8b 6e 04 48 89 f3 41 83 fd 1c 75 0c 48 8b 46 38 48 8b 40 70 <44> 8b 68 04 f6 43 03 01 75 1c 48 8b 43 38 44 0f b6 a0 89 00 00 00
[309049.036505] RSP: 0018:ffffb2e080fd7ce0 EFLAGS: 00010246
[309049.036513] RAX: 0000000000000000 RBX: ffffb2e0807c1000 RCX: 0000000000000000
[309049.036521] RDX: 0000000000000000 RSI: ffffb2e0807c1000 RDI: ffff990290259e00
[309049.036528] RBP: ffffb2e080fd7d08 R08: 0000000000000000 R09: 0000000000000000
[309049.036536] R10: 0000000000000000 R11: 0000000000000000 R12: ffff990290259e00
[309049.036543] R13: 000000000000001c R14: ffff990290259e00 R15: ffff99028e29c400
[309049.036551] FS:  00007b82cbc28140(0000) GS:ffff9903b3f00000(0000) knlGS:0000000000000000
[309049.036559] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[309049.036566] CR2: 0000000000000004 CR3: 0000000101286002 CR4: 00000000003706f0
[309049.036573] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[309049.036581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[309049.036588] Call Trace:
[309049.036592]  <TASK>
[309049.036597]  ? show_regs+0x6d/0x80
[309049.036604]  ? __die+0x24/0x80
[309049.036619]  ? page_fault_oops+0x99/0x1b0
[309049.036628]  ? do_user_addr_fault+0x2ee/0x6b0
[309049.036634]  ? exc_page_fault+0x83/0x1b0
[309049.036641]  ? asm_exc_page_fault+0x27/0x30
[309049.036649]  ? bpf_prog_map_compatible+0x2a/0x140
[309049.036656]  prog_fd_array_get_ptr+0x2c/0x70
[309049.036664]  bpf_fd_array_map_update_elem+0x37/0x130
[309049.036671]  bpf_map_update_value+0x1d3/0x260
[309049.036677]  map_update_elem+0x1fa/0x360
[309049.036683]  __sys_bpf+0x54c/0xa10
[309049.036689]  __x64_sys_bpf+0x1a/0x30
[309049.036694]  x64_sys_call+0x1936/0x25c0
[309049.036700]  do_syscall_64+0x7f/0x180
[309049.036706]  ? do_syscall_64+0x8c/0x180
[309049.036712]  ? do_syscall_64+0x8c/0x180
[309049.036717]  ? irqentry_exit+0x43/0x50
[309049.036723]  ? common_interrupt+0x54/0xb0
[309049.036729]  entry_SYSCALL_64_after_hwframe+0x73/0x7b

Since commit 1c123c5 ("bpf: Resolve fext program type when
checking map compatibility"), freplace prog can be used as tail-callee
of its target prog.
Commit 3aac1ea ("bpf: Move prog->aux->linked_prog and
trampoline into bpf_link on attach") sets prog->aux->dst_prog to NULL
when the freplace prog is attached to its target.

Consider the following example:

tailcall_freplace.c:

// SPDX-License-Identifier: GPL-2.0

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "bpf_legacy.h"

struct {
	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
	__uint(max_entries, 1);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");

int count = 0;

__noinline int
subprog(struct __sk_buff *skb)
{
	volatile int ret = 1;

	count++;

	bpf_tail_call_static(skb, &jmp_table, 0);

	return ret;
}

SEC("freplace")
int entry(struct __sk_buff *skb)
{
	return subprog(skb);
}

char __license[] SEC("license") = "GPL";

tc_bpf2bpf.c:

// SPDX-License-Identifier: GPL-2.0

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "bpf_legacy.h"

__noinline int
subprog(struct __sk_buff *skb)
{
	volatile int ret = 1;

	return ret;
}

SEC("tc")
int entry(struct __sk_buff *skb)
{
	return subprog(skb);
}

char __license[] SEC("license") = "GPL";

And freplace entry prog's target is the tc subprog.

After loading, the freplace jmp_table's owner type is
BPF_PROG_TYPE_SCHED_CLS.

Next, after attaching the freplace prog to the tc subprog, its
prog->aux->dst_prog is NULL.

Next, when updating the freplace prog into jmp_table,
bpf_prog_map_compatible() returns false because resolve_prog_type()
returns BPF_PROG_TYPE_EXT instead of BPF_PROG_TYPE_SCHED_CLS.

With this patch, resolve_prog_type() returns BPF_PROG_TYPE_SCHED_CLS, which
supports updating the attached freplace prog into the PROG_ARRAY map in this
example.

Fixes: f7866c3 ("bpf: Fix null pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT")
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
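
The type resolution described in the message above can be modelled in user space. This is a simplified sketch, not the kernel patch itself: the struct layouts and the `_sketch` function names are hypothetical, and the `saved_dst_prog_type` field is assumed to hold the target's type recorded at load time, so that it stays available after `dst_prog` is cleared on attach.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only the fields used here. */
enum prog_type {
	PROG_TYPE_UNSPEC = 0,
	PROG_TYPE_SCHED_CLS,
	PROG_TYPE_EXT,
};

struct prog;

struct prog_aux {
	struct prog *dst_prog;              /* cleared once the freplace prog is attached */
	enum prog_type saved_dst_prog_type; /* assumed: target type recorded at load time */
};

struct prog {
	enum prog_type type;
	struct prog_aux *aux;
};

/* Resolve an extension prog to its target's type without touching dst_prog. */
static enum prog_type resolve_prog_type_sketch(const struct prog *prog)
{
	if (prog->type == PROG_TYPE_EXT &&
	    prog->aux->saved_dst_prog_type != PROG_TYPE_UNSPEC)
		return prog->aux->saved_dst_prog_type;
	return prog->type;
}

/* Map compatibility check, reduced to the owner-type comparison. */
static int prog_map_compatible_sketch(enum prog_type owner_type,
				      const struct prog *prog)
{
	return resolve_prog_type_sketch(prog) == owner_type;
}
```

With `dst_prog` set to NULL and the saved type set to the tc type, the attached freplace prog resolves to the same type as the jmp_table owner, so the update is accepted and no NULL pointer is dereferenced.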
kernel-patches-daemon-bpf bot pushed a commit that referenced this pull request Aug 1, 2025
…er dereference

A malicious HID device with quirk APPLE_MAGIC_BACKLIGHT can trigger a NULL
pointer dereference whilst the power feature-report is toggled and sent to
the device in apple_magic_backlight_report_set(). The power feature-report
is expected to have two data fields, but if the descriptor declares one
field then accessing field[1] and dereferencing it in
apple_magic_backlight_report_set() becomes invalid
since field[1] will be NULL.

An example of a minimal descriptor which can cause the crash is something
like the following where the report with ID 3 (power report) only
references a single 1-byte field. When hid core parses the descriptor it
will encounter the final feature tag, allocate a hid_report (all members
of field[] will be zeroed out), create field structure and populate it,
increasing the maxfield to 1. The subsequent field[1] access and
dereference causes the crash.

  Usage Page (Vendor Defined 0xFF00)
  Usage (0x0F)
  Collection (Application)
    Report ID (1)
    Usage (0x01)
    Logical Minimum (0)
    Logical Maximum (255)
    Report Size (8)
    Report Count (1)
    Feature (Data,Var,Abs)

    Usage (0x02)
    Logical Maximum (32767)
    Report Size (16)
    Report Count (1)
    Feature (Data,Var,Abs)

    Report ID (3)
    Usage (0x03)
    Logical Minimum (0)
    Logical Maximum (1)
    Report Size (8)
    Report Count (1)
    Feature (Data,Var,Abs)
  End Collection

Here we see the KASAN splat when the kernel dereferences the
NULL pointer and crashes:

  [   15.164723] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] SMP KASAN NOPTI
  [   15.165691] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
  [   15.165691] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Not tainted 6.15.0 #31 PREEMPT(voluntary)
  [   15.165691] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
  [   15.165691] RIP: 0010:apple_magic_backlight_report_set+0xbf/0x210
  [   15.165691] Call Trace:
  [   15.165691]  <TASK>
  [   15.165691]  apple_probe+0x571/0xa20
  [   15.165691]  hid_device_probe+0x2e2/0x6f0
  [   15.165691]  really_probe+0x1ca/0x5c0
  [   15.165691]  __driver_probe_device+0x24f/0x310
  [   15.165691]  driver_probe_device+0x4a/0xd0
  [   15.165691]  __device_attach_driver+0x169/0x220
  [   15.165691]  bus_for_each_drv+0x118/0x1b0
  [   15.165691]  __device_attach+0x1d5/0x380
  [   15.165691]  device_initial_probe+0x12/0x20
  [   15.165691]  bus_probe_device+0x13d/0x180
  [   15.165691]  device_add+0xd87/0x1510
  [...]

To fix this issue we should validate the number of fields that the
backlight and power reports have and if they do not have the required
number of fields then bail.

Fixes: 394ba61 ("HID: apple: Add support for magic keyboard backlight on T2 Macs")
Cc: stable@vger.kernel.org
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Reviewed-by: Orlando Chamberlain <orlandoch.dev@gmail.com>
Tested-by: Aditya Garg <gargaditya08@live.com>
Link: https://patch.msgid.link/20250713233008.15131-1-qasdev00@gmail.com
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
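
The field-count validation described above might look roughly like the following user-space sketch. The struct definitions are simplified stand-ins for the HID core types, the function names are hypothetical, and both the `-ENODEV` return value and the requirement of two fields for the brightness report are assumptions made for illustration; the message only states that the power report is expected to have two data fields.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the HID core structures. */
struct field_sketch {
	unsigned int report_count;
};

struct report_sketch {
	unsigned int maxfield;         /* number of populated entries in field[] */
	struct field_sketch *field[4]; /* unpopulated entries stay NULL */
};

/* Reject a report that declares fewer fields than the driver will access. */
static int check_report_fields(const struct report_sketch *report,
			       unsigned int needed)
{
	if (!report || report->maxfield < needed)
		return -ENODEV;
	return 0;
}

/* Bail out before any field[1] access if either report is too short. */
static int backlight_probe_sketch(const struct report_sketch *brightness,
				  const struct report_sketch *power)
{
	int ret = check_report_fields(brightness, 2);

	if (ret)
		return ret;
	return check_report_fields(power, 2);
}
```

A descriptor like the one above, whose power report declares a single field, would fail the check at probe time instead of crashing later on the field[1] dereference.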
kernel-patches-daemon-bpf bot pushed a commit that referenced this pull request Aug 2, 2025
Without the change `perf` hangs on character devices. On my system
it's enough to run a system-wide sampler for a few seconds to get the
hangup:

    $ perf record -a -g --call-graph=dwarf
    $ perf report
    # hung

`strace` shows that the hangup happens on reading from a character device,
`/dev/dri/renderD128`:

    $ strace -y -f -p 2780484
    strace: Process 2780484 attached
    pread64(101</dev/dri/renderD128>, strace: Process 2780484 detached

Its call trace descends into `elfutils`:

    $ gdb -p 2780484
    (gdb) bt
    #0  0x00007f5e508f04b7 in __libc_pread64 (fd=101, buf=0x7fff9df7edb0, count=0, offset=0)
        at ../sysdeps/unix/sysv/linux/pread64.c:25
    #1  0x00007f5e52b79515 in read_file () from /<<NIX>>/elfutils-0.192/lib/libelf.so.1
    #2  0x00007f5e52b25666 in libdw_open_elf () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #3  0x00007f5e52b25907 in __libdw_open_file () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #4  0x00007f5e52b120a9 in dwfl_report_elf@@ELFUTILS_0.156 ()
       from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #5  0x000000000068bf20 in __report_module (al=al@entry=0x7fff9df80010, ip=ip@entry=139803237033216, ui=ui@entry=0x5369b5e0)
        at util/dso.h:537
    #6  0x000000000068c3d1 in report_module (ip=139803237033216, ui=0x5369b5e0) at util/unwind-libdw.c:114
    #7  frame_callback (state=0x535aef10, arg=0x5369b5e0) at util/unwind-libdw.c:242
    #8  0x00007f5e52b261d3 in dwfl_thread_getframes () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #9  0x00007f5e52b25bdb in get_one_thread_cb () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #10 0x00007f5e52b25faa in dwfl_getthreads () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #11 0x00007f5e52b26514 in dwfl_getthread_frames () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
    #12 0x000000000068c6ce in unwind__get_entries (cb=cb@entry=0x5d4620 <unwind_entry>, arg=arg@entry=0x10cd5fa0,
        thread=thread@entry=0x1076a290, data=data@entry=0x7fff9df80540, max_stack=max_stack@entry=127,
        best_effort=best_effort@entry=false) at util/thread.h:152
    #13 0x00000000005dae95 in thread__resolve_callchain_unwind (evsel=0x106006d0, thread=0x1076a290, cursor=0x10cd5fa0,
        sample=0x7fff9df80540, max_stack=127, symbols=true) at util/machine.c:2939
    #14 thread__resolve_callchain_unwind (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, sample=0x7fff9df80540,
        max_stack=127, symbols=true) at util/machine.c:2920
    #15 __thread__resolve_callchain (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, evsel@entry=0x7fff9df80440,
        sample=0x7fff9df80540, parent=parent@entry=0x7fff9df804a0, root_al=root_al@entry=0x7fff9df80440, max_stack=127, symbols=true)
        at util/machine.c:2970
    #16 0x00000000005d0cb2 in thread__resolve_callchain (thread=<optimized out>, cursor=<optimized out>, evsel=0x7fff9df80440,
        sample=<optimized out>, parent=0x7fff9df804a0, root_al=0x7fff9df80440, max_stack=127) at util/machine.h:198
    #17 sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fff9df804a0,
        evsel=evsel@entry=0x106006d0, al=al@entry=0x7fff9df80440, max_stack=max_stack@entry=127) at util/callchain.c:1127
    #18 0x0000000000617e08 in hist_entry_iter__add (iter=iter@entry=0x7fff9df80480, al=al@entry=0x7fff9df80440, max_stack_depth=127,
        arg=arg@entry=0x7fff9df81ae0) at util/hist.c:1255
    #19 0x000000000045d2d0 in process_sample_event (tool=0x7fff9df81ae0, event=<optimized out>, sample=0x7fff9df80540,
        evsel=0x106006d0, machine=<optimized out>) at builtin-report.c:334
    #20 0x00000000005e3bb1 in perf_session__deliver_event (session=0x105ff2c0, event=0x7f5c7d735ca0, tool=0x7fff9df81ae0,
        file_offset=2914716832, file_path=0x105ffbf0 "perf.data") at util/session.c:1367
    #21 0x00000000005e8d93 in do_flush (oe=0x105ffa50, show_progress=false) at util/ordered-events.c:245
    #22 __ordered_events__flush (oe=0x105ffa50, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:324
    #23 0x00000000005e1f64 in perf_session__process_user_event (session=0x105ff2c0, event=0x7f5c7d752b18, file_offset=2914835224,
        file_path=0x105ffbf0 "perf.data") at util/session.c:1419
    #24 0x00000000005e47c7 in reader__read_event (rd=rd@entry=0x7fff9df81260, session=session@entry=0x105ff2c0,
        prog=prog@entry=0x7fff9df81220) at util/session.c:2132
    #25 0x00000000005e4b37 in reader__process_events (rd=0x7fff9df81260, session=0x105ff2c0, prog=0x7fff9df81220)
        at util/session.c:2181
    #26 __perf_session__process_events (session=0x105ff2c0) at util/session.c:2226
    #27 perf_session__process_events (session=session@entry=0x105ff2c0) at util/session.c:2390
    #28 0x0000000000460add in __cmd_report (rep=0x7fff9df81ae0) at builtin-report.c:1076
    #29 cmd_report (argc=<optimized out>, argv=<optimized out>) at builtin-report.c:1827
    #30 0x00000000004c5a40 in run_builtin (p=p@entry=0xd8f7f8 <commands+312>, argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0)
        at perf.c:351
    #31 0x00000000004c5d63 in handle_internal_command (argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0) at perf.c:404
    #32 0x0000000000442de3 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:448
    #33 main (argc=<optimized out>, argv=0x7fff9df844b0) at perf.c:556

The hangup happens because nothing in `perf` or `elfutils` checks if a
mapped file is easily readable.

The change conservatively skips all non-regular files.

Signed-off-by: Sergei Trofimovich <slyich@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250505174419.2814857-1-slyich@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
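
Skipping non-regular files, as described above, can be sketched with a plain `stat()` check. The helper name `path_is_regular_file` is hypothetical and this is not the code from the patch; it only illustrates the conservative rule of treating anything other than a regular file as unreadable.

```c
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Return true only for regular files. Character devices, FIFOs,
 * sockets, directories, and paths that cannot be stat()ed all
 * yield false, so the caller never tries to read them.
 */
static bool path_is_regular_file(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return false;
	return S_ISREG(st.st_mode);
}
```

A caller would test each mapped path with this helper before handing it to the ELF reader, and skip the mapping when it returns false.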
guidosarducci added a commit to guidosarducci/bpf-ci that referenced this pull request Sep 19, 2025
Test tries to build:

struct s {
     void __attribute__((btf_type_tag("void_tag"))) *p1;
     void __attribute__((void_attr)) *p2;
};

but hard-codes struct size to 8 while using default host-sized pointers
and overlaying both pointers at struct start.

Above is broken but only fails on 32-bit hosts:

[...]
test_ctx__dump_and_compare:FAIL:dump_and_compare unexpected dump_and_compare: actual 'struct s {
        void __attribute__((btf_type_tag("void_tag"))) *p1;
        void __attribute__((void_attr)) *p2;
        long: 32;
};

' != expected 'struct s {
        void __attribute__((btf_type_tag("void_tag"))) *p1;
        void __attribute__((void_attr)) *p2;
};

'
[...]
kernel-patches#31/9    btf_dump/btf_dump: type_tags:FAIL

Resolve by sizing struct to fit 2 default-sized pointers which are laid
out sequentially.

Fixes: 6c2d2a0 ("selftests/bpf: Add a btf_dump test for type_tags")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
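
The sizing rule described above (two default-sized pointers laid out one after the other) can be written out as a small calculation. The types and the function name below are hypothetical and do not come from the selftest; they only show how the struct size and member bit offsets follow from the pointer size.

```c
#include <stddef.h>

/* Hypothetical description of one struct member and of the whole struct. */
struct member_layout {
	const char *name;
	unsigned int bit_offset; /* offset from the struct start, in bits */
};

struct struct_layout {
	unsigned int size; /* struct size in bytes */
	struct member_layout members[2];
};

/* Lay out two pointers sequentially for a given pointer size in bytes. */
static struct struct_layout two_pointer_layout(unsigned int ptr_size)
{
	struct struct_layout layout = {
		.size = 2 * ptr_size,
		.members = {
			{ "p1", 0 },
			{ "p2", ptr_size * 8 },
		},
	};

	return layout;
}
```

With 4-byte pointers this gives an 8-byte struct with p2 at bit offset 32, and with 8-byte pointers a 16-byte struct with p2 at bit offset 64, so no padding member is needed in either case.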
guidosarducci added a commit to guidosarducci/bpf-ci that referenced this pull request Sep 19, 2025
In btf_dump_enum_data() an enum value is declared as 64-bit but passed for
dumping output using an 'int' format specifier. After a few macro layers
with var args, vsnprintf() then short-reads the 64-bit value from its
va_list and misreads the strings that follow, resulting in garbage output
and failed tests in test_progs:

[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(enum bpf_cmd)2000(null)' != expected '(enum bpf_cmd)2000'
[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '2000(null)' != expected '2000'
[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(enum bpf_cmd)2000(null)' != expected '(enum bpf_cmd)2000'
[...]
kernel-patches#31/14   btf_dump/btf_dump: enum_data:FAIL

Resolve by explicitly using 64-bit output specifiers (e.g. "%llu").

Fixes: 920d16a ("libbpf: BTF dumper support for typed data")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
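
The format-specifier fix described above can be illustrated in isolation. The function below is a hypothetical helper, not the libbpf code: it passes a 64-bit value through a matching "%lld" or "%llu" specifier so that the string argument following it is read from the correct position in the va_list.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Format a 64-bit enum value followed by a suffix string. The value
 * is passed as (unsigned) long long to match the 64-bit specifiers.
 */
static int format_enum_value(char *buf, size_t len, long long value,
			     bool is_signed, const char *suffix)
{
	if (is_signed)
		return snprintf(buf, len, "%lld%s", value, suffix);
	return snprintf(buf, len, "%llu%s", (unsigned long long)value, suffix);
}
```

Using "%d" here with a 64-bit argument would be undefined behaviour; on 32-bit targets it typically consumes only half of the value and shifts every later argument, which matches the "(null)" garbage in the failures above.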
guidosarducci added a commit to guidosarducci/bpf-ci that referenced this pull request Sep 19, 2025
Allow the test dumping BTF of 'struct file_operations' to succeed on 32-bit
targets by accounting for varying pointer size in the expected dump output.

This avoids failures in test_progs like:

[...]
test_btf_dump_struct_data:FAIL:file_operations unexpected file_operations: actual '(struct file_operations){
        .owner = (struct module *)0xffffffff,
        .fop_flags = (fop_flags_t)4294967295,
        .llsee' != expected '(struct file_operations){
        .owner = (struct module *)0xffffffffffffffff,
        .fop_flags = (fop_flags_t)4294967295,'
[...]
kernel-patches#31/15   btf_dump/btf_dump: struct_data:FAIL

Fixes: 70a9241 ("selftests/bpf: Add dump type data tests to btf dump tests")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
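
One way to account for pointer size in an expected string is to generate it from the host's pointer width instead of hard-coding 16 hex digits. The helper below is a hypothetical sketch, not the selftest change itself, and it assumes `uintptr_t` has the same width as a data pointer.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Write the expected dump of an all-ones 'struct module *' for the
 * host pointer width: 8 hex digits on 32-bit, 16 on 64-bit.
 */
static int format_all_ones_pointer(char *buf, size_t len)
{
	return snprintf(buf, len, "(struct module *)0x%" PRIxPTR,
			(uintptr_t)UINTPTR_MAX);
}
```

The same buffer can then be spliced into the expected output, so the comparison is against `0xffffffff` on 32-bit targets and `0xffffffffffffffff` on 64-bit ones.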
guidosarducci added a commit to guidosarducci/bpf-ci that referenced this pull request Sep 22, 2025
Test tries to build:

struct s {
     void __attribute__((btf_type_tag("void_tag"))) *p1;
     void __attribute__((void_attr)) *p2;
};

but hard-codes struct size to 8 while using default host-sized pointers
and overlaying both pointers at struct start.

Above is broken but only fails on 32-bit hosts:

[...]
test_ctx__dump_and_compare:FAIL:dump_and_compare unexpected dump_and_compare: actual 'struct s {
        void __attribute__((btf_type_tag("void_tag"))) *p1;
        void __attribute__((void_attr)) *p2;
        long: 32;
};

' != expected 'struct s {
        void __attribute__((btf_type_tag("void_tag"))) *p1;
        void __attribute__((void_attr)) *p2;
};

'
[...]
kernel-patches#31/9    btf_dump/btf_dump: type_tags:FAIL

Resolve by sizing struct to fit 2 default-sized pointers which are laid
out sequentially.

Fixes: 6c2d2a0 ("selftests/bpf: Add a btf_dump test for type_tags")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
guidosarducci added a commit to guidosarducci/bpf-ci that referenced this pull request Sep 22, 2025
In btf_dump_enum_data() an enum value is declared as 64-bit but passed for
dumping output using an 'int' format specifier. After a few macro layers
with var args, vsnprintf() then short-reads the 64-bit value from its
va_list and misreads the strings that follow, resulting in garbage output
and failed tests in test_progs:

[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(enum bpf_cmd)2000(null)' != expected '(enum bpf_cmd)2000'
[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '2000(null)' != expected '2000'
[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(enum bpf_cmd)2000(null)' != expected '(enum bpf_cmd)2000'
[...]
kernel-patches#31/14   btf_dump/btf_dump: enum_data:FAIL

Resolve by explicitly using 64-bit output specifiers (e.g. "%llu").

Fixes: 920d16a ("libbpf: BTF dumper support for typed data")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
guidosarducci added a commit to guidosarducci/bpf-ci that referenced this pull request Sep 22, 2025
Allow the test dumping BTF of 'struct file_operations' to succeed on 32-bit
targets by accounting for varying pointer size in the expected dump output.

This avoids failures in test_progs like:

[...]
test_btf_dump_struct_data:FAIL:file_operations unexpected file_operations: actual '(struct file_operations){
        .owner = (struct module *)0xffffffff,
        .fop_flags = (fop_flags_t)4294967295,
        .llsee' != expected '(struct file_operations){
        .owner = (struct module *)0xffffffffffffffff,
        .fop_flags = (fop_flags_t)4294967295,'
[...]
kernel-patches#31/15   btf_dump/btf_dump: struct_data:FAIL

Fixes: 70a9241 ("selftests/bpf: Add dump type data tests to btf dump tests")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
guidosarducci added a commit to guidosarducci/bpf-ci that referenced this pull request Sep 23, 2025
Test tries to build:

struct s {
     void __attribute__((btf_type_tag("void_tag"))) *p1;
     void __attribute__((void_attr)) *p2;
};

but hard-codes struct size to 8 while using default host-sized pointers
and overlaying both pointers at struct start.

Above is broken but only fails on 32-bit hosts:

[...]
test_ctx__dump_and_compare:FAIL:dump_and_compare unexpected dump_and_compare: actual 'struct s {
        void __attribute__((btf_type_tag("void_tag"))) *p1;
        void __attribute__((void_attr)) *p2;
        long: 32;
};

' != expected 'struct s {
        void __attribute__((btf_type_tag("void_tag"))) *p1;
        void __attribute__((void_attr)) *p2;
};

'
[...]
kernel-patches#31/9    btf_dump/btf_dump: type_tags:FAIL

Resolve by sizing struct to fit 2 default-sized pointers which are laid
out sequentially.

Fixes: 6c2d2a0 ("selftests/bpf: Add a btf_dump test for type_tags")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
guidosarducci added a commit to guidosarducci/bpf-ci that referenced this pull request Sep 23, 2025
In btf_dump_enum_data() an enum value is declared as 64-bit but passed for
dumping output using an 'int' format specifier. After a few macro layers
with var args, vsnprintf() then short-reads the 64-bit value from its
va_list and misreads the strings that follow, resulting in garbage output
and failed tests in test_progs:

[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(enum bpf_cmd)2000(null)' != expected '(enum bpf_cmd)2000'
[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '2000(null)' != expected '2000'
[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(enum bpf_cmd)2000(null)' != expected '(enum bpf_cmd)2000'
[...]
kernel-patches#31/14   btf_dump/btf_dump: enum_data:FAIL

Resolve by explicitly using 64-bit output specifiers (e.g. "%llu").

Fixes: 920d16a ("libbpf: BTF dumper support for typed data")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
guidosarducci added a commit to guidosarducci/bpf-ci that referenced this pull request Sep 23, 2025
Allow the test dumping BTF of 'struct file_operations' to succeed on 32-bit
targets by accounting for varying pointer size in the expected dump output.

This avoids failures in test_progs like:

[...]
test_btf_dump_struct_data:FAIL:file_operations unexpected file_operations: actual '(struct file_operations){
        .owner = (struct module *)0xffffffff,
        .fop_flags = (fop_flags_t)4294967295,
        .llsee' != expected '(struct file_operations){
        .owner = (struct module *)0xffffffffffffffff,
        .fop_flags = (fop_flags_t)4294967295,'
[...]
kernel-patches#31/15   btf_dump/btf_dump: struct_data:FAIL

Fixes: 70a9241 ("selftests/bpf: Add dump type data tests to btf dump tests")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
guidosarducci added a commit to guidosarducci/bpf-ci that referenced this pull request Sep 24, 2025
Test tries to build:

struct s {
     void __attribute__((btf_type_tag("void_tag"))) *p1;
     void __attribute__((void_attr)) *p2;
};

but hard-codes struct size to 8 while using default host-sized pointers
and overlaying both pointers at struct start.

Above is broken but only fails on 32-bit hosts:

[...]
test_ctx__dump_and_compare:FAIL:dump_and_compare unexpected dump_and_compare: actual 'struct s {
        void __attribute__((btf_type_tag("void_tag"))) *p1;
        void __attribute__((void_attr)) *p2;
        long: 32;
};

' != expected 'struct s {
        void __attribute__((btf_type_tag("void_tag"))) *p1;
        void __attribute__((void_attr)) *p2;
};

'
[...]
kernel-patches#31/9    btf_dump/btf_dump: type_tags:FAIL

Resolve by sizing struct to fit 2 default-sized pointers which are laid
out sequentially.

Fixes: 6c2d2a0 ("selftests/bpf: Add a btf_dump test for type_tags")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
guidosarducci added a commit to guidosarducci/bpf-ci that referenced this pull request Sep 24, 2025
In btf_dump_enum_data() an enum value is declared as 64-bit but passed for
dumping output using an 'int' format specifier. After a few macro layers
with var args, vsnprintf() then short-reads the 64-bit value from its
va_list and misreads the strings that follow, resulting in garbage output
and failed tests in test_progs:

[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(enum bpf_cmd)2000(null)' != expected '(enum bpf_cmd)2000'
[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '2000(null)' != expected '2000'
[...]
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure expected/actual match: actual '(enum bpf_cmd)2000(null)' != expected '(enum bpf_cmd)2000'
[...]
kernel-patches#31/14   btf_dump/btf_dump: enum_data:FAIL

Resolve by explicitly using 64-bit output specifiers (e.g. "%llu").

Fixes: 920d16a ("libbpf: BTF dumper support for typed data")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
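
The specifier mismatch described above can be avoided with a pattern like the following sketch, which is not the libbpf code itself. The helper `dump_matches` and its parameters are hypothetical; the point is only that the 64-bit argument is cast to the exact type that "%llu" expects, so the string argument after it is read from the right place.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a 64-bit value followed by a string suffix,
 * then compare the result with the expected dump text.
 */
static int dump_matches(uint64_t value, const char *suffix, const char *expected)
{
	char buf[32];

	/*
	 * The cast makes the variadic argument match "%llu" on every ABI.
	 * With an 'int' specifier, a 32-bit target would consume only part of
	 * the value and then misread the following "%s" argument.
	 */
	snprintf(buf, sizeof(buf), "%llu%s", (unsigned long long)value, suffix);
	return strcmp(buf, expected) == 0;
}
```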
guidosarducci added a commit to guidosarducci/bpf-ci that referenced this pull request Sep 24, 2025
Allow the test dumping BTF of 'struct file_operations' to succeed on 32-bit
targets by accounting for varying pointer size in the expected dump output.

This avoids failures in test_progs like:

[...]
test_btf_dump_struct_data:FAIL:file_operations unexpected file_operations: actual '(struct file_operations){
        .owner = (struct module *)0xffffffff,
        .fop_flags = (fop_flags_t)4294967295,
        .llsee' != expected '(struct file_operations){
        .owner = (struct module *)0xffffffffffffffff,
        .fop_flags = (fop_flags_t)4294967295,'
[...]
#31/15   btf_dump/btf_dump: struct_data:FAIL

Fixes: 70a9241 ("selftests/bpf: Add dump type data tests to btf dump tests")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
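
One way to make an expected string follow the pointer width is sketched below. This is an assumption about the general approach, not the selftest's actual code; `format_all_ones_ptr` is a hypothetical name, and the sketch assumes `uintptr_t` has the same width as a data pointer.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: print an all-ones pointer value in hex.
 * Produces "0xffffffff" where pointers are 32 bits wide and
 * "0xffffffffffffffff" where they are 64 bits wide.
 * Returns the number of characters the full string needs.
 */
static int format_all_ones_ptr(char *buf, size_t len)
{
	return snprintf(buf, len, "0x%" PRIxPTR, (uintptr_t)UINTPTR_MAX);
}
```

Calling it with a NULL buffer and a length of 0 returns only the required length, which is two characters for "0x" plus two hex digits per pointer byte.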
kernel-patches-daemon-bpf bot pushed a commit that referenced this pull request Sep 25, 2025
…l_access()

copy_from_user() and copy_to_user() may sleep on a page fault, so they
must not be called while a spinlock is held. Move the calls to
copy_from_user() and copy_to_user() out of the spinlock context in
kvm_eiointc_ctrl_access().

Otherwise a warning such as the following is possible:

BUG: sleeping function called from invalid context at include/linux/uaccess.h:192
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 6292, name: qemu-system-loo
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last  enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last  enabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 41 UID: 0 PID: 6292 Comm: qemu-system-loo Tainted: G W 6.17.0-rc3+ #31 PREEMPT(full)
Tainted: [W]=WARN
Stack : 0000000000000076 0000000000000000 9000000004c28264 9000100092ff4000
        9000100092ff7b80 9000100092ff7b88 0000000000000000 9000100092ff7cc8
        9000100092ff7cc0 9000100092ff7cc0 9000100092ff7a00 0000000000000001
        0000000000000001 9000100092ff7b88 947d2f9216a5e8b9 900010008773d880
        00000000ffff8b9f fffffffffffffffe 0000000000000ba1 fffffffffffffffe
        000000000000003e 900000000825a15b 000010007ad38000 9000100092ff7ec0
        0000000000000000 0000000000000000 9000000006f3ac60 9000000007252000
        0000000000000000 00007ff746ff2230 0000000000000053 9000200088a021b0
        0000555556c9d190 0000000000000000 9000000004c2827c 000055556cfb5f40
        00000000000000b0 0000000000000007 0000000000000007 0000000000071c1d
Call Trace:
[<9000000004c2827c>] show_stack+0x5c/0x180
[<9000000004c20fac>] dump_stack_lvl+0x94/0xe4
[<9000000004c99c7c>] __might_resched+0x26c/0x290
[<9000000004f68968>] __might_fault+0x20/0x88
[<ffff800002311de0>] kvm_eiointc_ctrl_access.isra.0+0x88/0x380 [kvm]
[<ffff8000022f8514>] kvm_device_ioctl+0x194/0x290 [kvm]
[<900000000506b0d8>] sys_ioctl+0x388/0x1010
[<90000000063ed210>] do_syscall+0xb0/0x2d8
[<9000000004c25ef8>] handle_syscall+0xb8/0x158

Cc: stable@vger.kernel.org
Fixes: 1ad7efa ("LoongArch: KVM: Add EIOINTC user mode read and write functions")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
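
The pattern in this fix can be sketched in userspace as below. This is only an analogy under stated assumptions: `demo_dev`, `demo_copy_in` and the other `demo_*` names are hypothetical, a C11 `atomic_flag` stands in for the kernel spinlock, and `memcpy()` stands in for copy_from_user() and copy_to_user(). What it shows is the ordering: the copy that could fail or sleep happens with the lock released, and the lock covers only the register access.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical device state guarded by a spinlock-like flag. */
struct demo_dev {
	atomic_flag lock;
	uint32_t reg;
};

static void demo_lock(struct demo_dev *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		; /* spin until the flag is released */
}

static void demo_unlock(struct demo_dev *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* Stand-in for copy_from_user(): can fail, so it runs with the lock released. */
static int demo_copy_in(uint32_t *dst, const void *src)
{
	if (!src)
		return -1;
	memcpy(dst, src, sizeof(*dst));
	return 0;
}

/* Write path: copy first, then hold the lock only for the register update. */
static int demo_write_reg(struct demo_dev *d, const void *user_src)
{
	uint32_t val;

	if (demo_copy_in(&val, user_src))
		return -1;

	demo_lock(d);
	d->reg = val;
	demo_unlock(d);
	return 0;
}

/* Read path: snapshot under the lock, copy out after releasing it. */
static int demo_read_reg(struct demo_dev *d, void *user_dst)
{
	uint32_t val;

	demo_lock(d);
	val = d->reg;
	demo_unlock(d);

	if (!user_dst)
		return -1;
	memcpy(user_dst, &val, sizeof(val));
	return 0;
}

/* Small driver: write a value, read it back, return what was read. */
static uint32_t demo_roundtrip(uint32_t v)
{
	struct demo_dev d = { .lock = ATOMIC_FLAG_INIT, .reg = 0 };
	uint32_t out = 0;

	if (demo_write_reg(&d, &v) || demo_read_reg(&d, &out))
		return 0;
	return out;
}
```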
kernel-patches-daemon-bpf bot pushed a commit that referenced this pull request Sep 25, 2025
…s_access()

copy_from_user() and copy_to_user() may sleep on a page fault, so they
must not be called while a spinlock is held. Move the calls to
copy_from_user() and copy_to_user() out of the spinlock context in
kvm_eiointc_regs_access().

Otherwise a warning such as the following is possible:

BUG: sleeping function called from invalid context at include/linux/uaccess.h:192
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 6292, name: qemu-system-loo
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last  enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last  enabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 41 UID: 0 PID: 6292 Comm: qemu-system-loo Tainted: G W 6.17.0-rc3+ #31 PREEMPT(full)
Tainted: [W]=WARN
Stack : 0000000000000076 0000000000000000 9000000004c28264 9000100092ff4000
        9000100092ff7b80 9000100092ff7b88 0000000000000000 9000100092ff7cc8
        9000100092ff7cc0 9000100092ff7cc0 9000100092ff7a00 0000000000000001
        0000000000000001 9000100092ff7b88 947d2f9216a5e8b9 900010008773d880
        00000000ffff8b9f fffffffffffffffe 0000000000000ba1 fffffffffffffffe
        000000000000003e 900000000825a15b 000010007ad38000 9000100092ff7ec0
        0000000000000000 0000000000000000 9000000006f3ac60 9000000007252000
        0000000000000000 00007ff746ff2230 0000000000000053 9000200088a021b0
        0000555556c9d190 0000000000000000 9000000004c2827c 000055556cfb5f40
        00000000000000b0 0000000000000007 0000000000000007 0000000000071c1d
Call Trace:
[<9000000004c2827c>] show_stack+0x5c/0x180
[<9000000004c20fac>] dump_stack_lvl+0x94/0xe4
[<9000000004c99c7c>] __might_resched+0x26c/0x290
[<9000000004f68968>] __might_fault+0x20/0x88
[<ffff800002311de0>] kvm_eiointc_regs_access.isra.0+0x88/0x380 [kvm]
[<ffff8000022f8514>] kvm_device_ioctl+0x194/0x290 [kvm]
[<900000000506b0d8>] sys_ioctl+0x388/0x1010
[<90000000063ed210>] do_syscall+0xb0/0x2d8
[<9000000004c25ef8>] handle_syscall+0xb8/0x158

Cc: stable@vger.kernel.org
Fixes: 1ad7efa ("LoongArch: KVM: Add EIOINTC user mode read and write functions")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
kernel-patches-daemon-bpf bot pushed a commit that referenced this pull request Sep 25, 2025
…status_access()

copy_from_user() and copy_to_user() may sleep on a page fault, so they
must not be called while a spinlock is held. Move the calls to
copy_from_user() and copy_to_user() out of function
kvm_eiointc_sw_status_access().

Otherwise a warning such as the following is possible:

BUG: sleeping function called from invalid context at include/linux/uaccess.h:192
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 6292, name: qemu-system-loo
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last  enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last  enabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 41 UID: 0 PID: 6292 Comm: qemu-system-loo Tainted: G W 6.17.0-rc3+ #31 PREEMPT(full)
Tainted: [W]=WARN
Stack : 0000000000000076 0000000000000000 9000000004c28264 9000100092ff4000
        9000100092ff7b80 9000100092ff7b88 0000000000000000 9000100092ff7cc8
        9000100092ff7cc0 9000100092ff7cc0 9000100092ff7a00 0000000000000001
        0000000000000001 9000100092ff7b88 947d2f9216a5e8b9 900010008773d880
        00000000ffff8b9f fffffffffffffffe 0000000000000ba1 fffffffffffffffe
        000000000000003e 900000000825a15b 000010007ad38000 9000100092ff7ec0
        0000000000000000 0000000000000000 9000000006f3ac60 9000000007252000
        0000000000000000 00007ff746ff2230 0000000000000053 9000200088a021b0
        0000555556c9d190 0000000000000000 9000000004c2827c 000055556cfb5f40
        00000000000000b0 0000000000000007 0000000000000007 0000000000071c1d
Call Trace:
[<9000000004c2827c>] show_stack+0x5c/0x180
[<9000000004c20fac>] dump_stack_lvl+0x94/0xe4
[<9000000004c99c7c>] __might_resched+0x26c/0x290
[<9000000004f68968>] __might_fault+0x20/0x88
[<ffff800002311de0>] kvm_eiointc_sw_status_access.isra.0+0x88/0x380 [kvm]
[<ffff8000022f8514>] kvm_device_ioctl+0x194/0x290 [kvm]
[<900000000506b0d8>] sys_ioctl+0x388/0x1010
[<90000000063ed210>] do_syscall+0xb0/0x2d8
[<9000000004c25ef8>] handle_syscall+0xb8/0x158

Cc: stable@vger.kernel.org
Fixes: 1ad7efa ("LoongArch: KVM: Add EIOINTC user mode read and write functions")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
kernel-patches-daemon-bpf bot pushed a commit that referenced this pull request Sep 25, 2025
…s_access()

copy_from_user() and copy_to_user() may sleep on a page fault, so they
must not be called while a spinlock is held. Move the calls to
copy_from_user() and copy_to_user() out of the spinlock context in
kvm_pch_pic_regs_access().

Otherwise a warning such as the following is possible:

BUG: sleeping function called from invalid context at include/linux/uaccess.h:192
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 6292, name: qemu-system-loo
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last  enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last  enabled at (0): [<9000000004c4a554>] copy_process+0x90c/0x1d40
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 41 UID: 0 PID: 6292 Comm: qemu-system-loo Tainted: G W 6.17.0-rc3+ #31 PREEMPT(full)
Tainted: [W]=WARN
Stack : 0000000000000076 0000000000000000 9000000004c28264 9000100092ff4000
        9000100092ff7b80 9000100092ff7b88 0000000000000000 9000100092ff7cc8
        9000100092ff7cc0 9000100092ff7cc0 9000100092ff7a00 0000000000000001
        0000000000000001 9000100092ff7b88 947d2f9216a5e8b9 900010008773d880
        00000000ffff8b9f fffffffffffffffe 0000000000000ba1 fffffffffffffffe
        000000000000003e 900000000825a15b 000010007ad38000 9000100092ff7ec0
        0000000000000000 0000000000000000 9000000006f3ac60 9000000007252000
        0000000000000000 00007ff746ff2230 0000000000000053 9000200088a021b0
        0000555556c9d190 0000000000000000 9000000004c2827c 000055556cfb5f40
        00000000000000b0 0000000000000007 0000000000000007 0000000000071c1d
Call Trace:
[<9000000004c2827c>] show_stack+0x5c/0x180
[<9000000004c20fac>] dump_stack_lvl+0x94/0xe4
[<9000000004c99c7c>] __might_resched+0x26c/0x290
[<9000000004f68968>] __might_fault+0x20/0x88
[<ffff800002311de0>] kvm_pch_pic_regs_access.isra.0+0x88/0x380 [kvm]
[<ffff8000022f8514>] kvm_device_ioctl+0x194/0x290 [kvm]
[<900000000506b0d8>] sys_ioctl+0x388/0x1010
[<90000000063ed210>] do_syscall+0xb0/0x2d8
[<9000000004c25ef8>] handle_syscall+0xb8/0x158

Cc: stable@vger.kernel.org
Fixes: d206d95 ("LoongArch: KVM: Add PCHPIC user mode read and write functions")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
kernel-patches-daemon-bpf bot pushed a commit that referenced this pull request Sep 25, 2025
Commit 2a6c727 ("cpufreq: Initialize cpufreq-based
frequency-invariance later") postponed the frequency invariance
initialization to avoid disabling it in the error case.
That ordering is not lock-safe. Instead, move the initialization up
before the subsys interface is registered (which will rebuild the
sched_domains) and add the corresponding disable on the error path.

Lockdep report observed without this patch:
[    0.989686] ======================================================
[    0.989688] WARNING: possible circular locking dependency detected
[    0.989690] 6.17.0-rc4-cix-build+ #31 Tainted: G S
[    0.989691] ------------------------------------------------------
[    0.989692] swapper/0/1 is trying to acquire lock:
[    0.989693] ffff800082ada7f8 (sched_energy_mutex){+.+.}-{4:4}, at: rebuild_sched_domains_energy+0x30/0x58
[    0.989705]
               but task is already holding lock:
[    0.989706] ffff000088c89bc8 (&policy->rwsem){+.+.}-{4:4}, at: cpufreq_online+0x7f8/0xbe0
[    0.989713]
               which lock already depends on the new lock.

Fixes: 2a6c727 ("cpufreq: Initialize cpufreq-based frequency-invariance later")
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
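
The ordering described in this commit message can be sketched as follows. This is a simplified illustration, not the cpufreq code; every `demo_*` name is hypothetical, and a plain boolean stands in for the frequency-invariance switch. The feature is enabled before the registration step and undone on the error path.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the frequency-invariance switch. */
static bool demo_freq_invariance_enabled;

/* Hypothetical registration step that can be made to fail for demonstration. */
static int demo_register_interface(bool should_fail)
{
	return should_fail ? -1 : 0;
}

static int demo_driver_register(bool fail_registration)
{
	int ret;

	/* Enable first, so nothing has to flip the switch during registration. */
	demo_freq_invariance_enabled = true;

	ret = demo_register_interface(fail_registration);
	if (ret) {
		/* Error path: undo the early enable. */
		demo_freq_invariance_enabled = false;
		return ret;
	}
	return 0;
}
```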