Commit
zero out sample struct between each SPE record processing. Rationale …
…for instructions over branches inside.

desktop:~/sort-inject.txt:

    0000000000000998 <sort_array>:
     998:  d112c3ff  sub    sp, sp, #0x4b0
     99c:  90000000  adrp   x0, 0 <_init-0x6d0>
     9a0:  912ae000  add    x0, x0, #0xab8
     9a4:  52802581  mov    w1, #0x12c  // #300
     9a8:  a9bd7bfd  stp    x29, x30, [sp, #-48]!
     9ac:  910003fd  mov    x29, sp
     9b0:  f90013f5  str    x21, [sp, #32]
     9b4:  9100c3b5  add    x21, x29, #0x30
     9b8:  a90153f3  stp    x19, x20, [sp, #16]
     9bc:  911383b4  add    x20, x29, #0x4e0
     9c0:  aa1503f3  mov    x19, x21
     9c4:  97ffff6b  bl     770 <printf@plt>
     9c8:  97ffff5e  bl     740 <rand@plt>
     9cc:  b8004660  str    w0, [x19], #4
     9d0:  eb13029f  cmp    x20, x19
     9d4:  54ffffa1  b.ne   9c8 <sort_array+0x30>  // b.any
     9d8:  9112b2a3  add    x3, x21, #0x4ac
     9dc:  aa1503e0  mov    x0, x21
     9e0:  52800004  mov    w4, #0x0  // #0
     9e4:  29400402  ldp    w2, w1, [x0]                       5.33%         <-----\
     9e8:  6b02003f  cmp    w1, w2                             5.08%               |
     9ec:  5400006a  b.ge   9f8 <sort_array+0x60>  // b.tcont  5.16%         >-\   |
     9f0:  52800024  mov    w4, #0x1  // #1                    1.56% (swap)    |   |
     9f4:  29000801  stp    w1, w2, [x0]                       1.37% (swap)    |   |
     9f8:  91001000  add    x0, x0, #0x4                       5.35%         <-/   |
     9fc:  eb00007f  cmp    x3, x0                             5.11%               |
     a00:  54ffff21  b.ne   9e4 <sort_array+0x4c>  // b.any    5.21%         >-----/
     a04:  35fffec4  cbnz   w4, 9dc <sort_array+0x44>
     a08:  a94153f3  ldp    x19, x20, [sp, #16]
     a0c:  f94013f5  ldr    x21, [sp, #32]
     a10:  a8c37bfd  ldp    x29, x30, [sp], #48
     a14:  9112c3ff  add    sp, sp, #0x4b0
     a18:  d65f03c0  ret
     a1c:  00000000  .inst  0x00000000 ; undefined

Are we reporting instructions as branches? What does PT do?

                           vvvvvvvv - above aren't all branches!!
    Samples: 12K of event 'branches', Event count (approx.): 12560
    Children      Self  Command  Shared Object  Symbol
       5.35%     5.35%  :-1      [unknown]      [.] 0x0000aaaaaaaaa9f8
       5.33%     5.33%  :-1      [unknown]      [.] 0x0000aaaaaaaaa9e4
       5.21%     5.21%  :-1      [unknown]      [.] 0x0000aaaaaaaaaa00
       5.16%     5.16%  :-1      [unknown]      [.] 0x0000aaaaaaaaa9ec
       5.11%     5.11%  :-1      [unknown]      [.] 0x0000aaaaaaaaa9fc
       5.08%     5.08%  :-1      [unknown]      [.] 0x0000aaaaaaaaa9e8
       1.56%     1.56%  :-1      [unknown]      [.] 0x0000aaaaaaaaa9f0
       1.37%     1.37%  :-1      [unknown]      [.] 0x0000aaaaaaaaa9f4
       0.40%     0.40%  :-1      [unknown]      [k] 0x00ff0000081b2b0c
       0.33%     0.33%  :-1      [unknown]      [k] 0x00ff0000081b2b08
       0.33%     0.33%  :-1      [unknown]      [k] 0x00ff0000081b2b20
       0.30%     0.30%  :-1      [unknown]      [k] 0x00ff0000081b2b14
       0.21%     0.21%  :-1      [unknown]      [.] 0x0000ffffbf68568c

(rest are non-aaaa's.)

Intel-PT on sort:

    Available samples
    0 intel_pt//
    0 dummy:u
    0 dummy:u
    1K instructions
    0 transactions
    0 ptwrite
    1 cbr

    Samples: 1K of event 'instructions', Event count (approx.): 85205326
      Children      Self  Command  Shared Object      Symbol
    +   99.56%     0.00%  sort     libc-2.27.so       [.] __libc_start_main
    +   99.44%     0.00%  sort     sort               [.] _start
    +   99.36%     0.00%  sort     sort               [.] main
    +   99.36%    99.07%  sort     sort               [.] sort_array
    +    0.58%     0.00%  sort     [kernel.kallsyms]  [k] __indirect_thunk_start
         0.32%     0.00%  sort     ld-2.27.so         [.] _dl_start_user
         0.29%     0.00%  sort     [kernel.kallsyms]  [k] page_fault
         0.29%     0.00%  sort     [kernel.kallsyms]  [k] do_page_fault

Clicking on sort_array goes to annotate, showing 'skid'? on the jge 90's next instruction (in the swap routine?):

           │     bubble_sort():
           │       swap_flag = 0;
           │       xor    %edx,%edx
           │       for (i = 1; i < n; i++) {
           │       mov    $0x1,%ecx
           │       nop
     10.28 │ 90:   cmp    %ecx,%ebx
      3.43 │     ↓ jle    148
     26.48 │ 98:   movslq %ecx,%rax
      5.46 │       lea    0x0(%rbp,%rax,4),%rax
           │       if (a[i] < a[i - 1]) {
     12.69 │ a0:   mov    (%rax),%esi
      6.48 │       mov    -0x4(%rax),%edi
      6.39 │       add    $0x1,%ecx
      5.56 │       cmp    %edi,%esi
      7.31 │     ↑ jge    90
           │       a[i] = a[i - 1];
      7.31 │       mov    %edi,(%rax)
           │       a[i - 1] = temp;
      1.39 │       mov    %esi,-0x4(%rax)
      1.48 │       add    $0x4,%rax
           │       for (i = 1; i < n; i++) {
      2.04 │       cmp    %ecx,%ebx
           │       swap_flag = 1;
      2.13 │       mov    $0x1,%edx
           │       for (i = 1; i < n; i++) {
      1.57 │     ↑ jg     a0
           │       xor    %edx,%edx
           │       cmp    $0x1,%ebx
           │       mov    $0x1,%ecx
           │     ↑ jg     98
           │     stop():

Anyway, branches aren't being reported: even with record -b -e intel_pt we get 'no samples in perf.data file'. And perf report --branch-stack doesn't run if it sees record wasn't run with -b.

So, the code is wrong. Those samples are instructions, not branches, even though some of the sampled instructions are branches.

With respect to symbols: vmlinux is being loaded and read, but no addresses are being reported as kernel addresses, so no symbols get used.

In the aem-built-perf-report-vvvvv.out case, sort didn't match build-id-wise because I didn't carry the archive... I should start using --symfs, which sounds like it's more embedded-friendly (Android, e.g.).
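
The observation that the SPE samples tagged as 'branches' are mostly not branch instructions can be cross-checked mechanically. The following is a minimal sketch, not part of this commit: it copies the eight hot user-space addresses from the 'branches' report and the mnemonics from the sort_array disassembly, then classifies each sample. It assumes the low 12 bits of each runtime address equal the objdump file offset (the same mapping used for the hand annotation). The names `classify`, `is_branch`, and `BRANCH_MNEMONICS` are hypothetical, and the mnemonic set is a partial list of A64 branch instructions, not an exhaustive one.

```python
# Mnemonics copied from the sort_array disassembly, keyed by objdump offset.
DISASM = {
    0x9E4: "ldp",
    0x9E8: "cmp",
    0x9EC: "b.ge",
    0x9F0: "mov",
    0x9F4: "stp",
    0x9F8: "add",
    0x9FC: "cmp",
    0xA00: "b.ne",
}

# Runtime sample addresses and percentages copied from the 'branches' report.
SAMPLES = {
    0x0000AAAAAAAAA9F8: 5.35,
    0x0000AAAAAAAAA9E4: 5.33,
    0x0000AAAAAAAAAA00: 5.21,
    0x0000AAAAAAAAA9EC: 5.16,
    0x0000AAAAAAAAA9FC: 5.11,
    0x0000AAAAAAAAA9E8: 5.08,
    0x0000AAAAAAAAA9F0: 1.56,
    0x0000AAAAAAAAA9F4: 1.37,
}

# ASSUMPTION: a partial set of A64 branch mnemonics, enough for this listing.
BRANCH_MNEMONICS = {"b", "bl", "br", "blr", "ret", "cbz", "cbnz", "tbz", "tbnz"}


def is_branch(mnemonic):
    """Return True for unconditional, conditional (b.<cond>) and compare branches."""
    return mnemonic in BRANCH_MNEMONICS or mnemonic.startswith("b.")


def classify(samples, disasm, page_mask=0xFFF):
    """Split samples into (branch, other) lists of (offset, mnemonic, percent).

    ASSUMPTION: runtime address & page_mask == objdump offset, which holds
    here only because sort_array fits inside a single 4 KiB page.
    """
    branch, other = [], []
    for addr, pct in samples.items():
        offset = addr & page_mask
        mnemonic = disasm[offset]
        target = branch if is_branch(mnemonic) else other
        target.append((offset, mnemonic, pct))
    return branch, other


if __name__ == "__main__":
    branch, other = classify(SAMPLES, DISASM)
    for label, rows in (("branch", branch), ("non-branch", other)):
        for offset, mnemonic, pct in sorted(rows):
            print(f"{label:>10}  {offset:#x}  {mnemonic:<5} {pct:.2f}%")
```

Under these assumptions only the samples at offsets 9ec (b.ge) and a00 (b.ne) are branch instructions; the other six are loads, stores, compares, and arithmetic.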