addr2line taking an exorbitant amount of time #74

Licenser · 2020-02-20T12:08:22Z

Hi,
I've recently been using flamegraph on a Linux system, and once the recording is done, generating the graph takes an extremely long time. It seems to invoke addr2line over and over again, which makes the whole process quite slow.

The output is:

[ perf record: Woken up 682 times to write data ]
[ perf record: Captured and wrote 170,633 MB perf.data (21198 samples) ]

21k samples doesn't sound like much, but if it's invoking a program for every sample, that seems to become very expensive.

bjorn3 · 2020-02-20T12:40:01Z

This is perf trying to compute which functions are inlined at every stack frame for every sample. If you don't want this, you need to pass --no-inline to perf. The invocation can be found at

flamegraph/src/lib.rs

Lines 80 to 86 in 0b8d12d

    
    pub fn output() -> Vec<u8> {
        Command::new("perf")
            .arg("script")
            .output()
            .expect("unable to call perf script")
            .stdout
    }
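
As a rough illustration of the suggestion above, the invocation could be made configurable so that `--no-inline` is passed on request. This is only a sketch: the `no_inline` parameter and the `perf_script_command` helper are hypothetical names introduced here, not necessarily how flamegraph implements it.

```rust
use std::process::Command;

/// Build the `perf script` invocation.
/// `no_inline` is a hypothetical parameter introduced for this sketch.
pub fn perf_script_command(no_inline: bool) -> Command {
    let mut cmd = Command::new("perf");
    cmd.arg("script");
    if no_inline {
        // Ask perf not to resolve inlined functions for each stack frame.
        cmd.arg("--no-inline");
    }
    cmd
}

/// Run the command and return its raw stdout, mirroring the snippet above.
pub fn output(no_inline: bool) -> Vec<u8> {
    perf_script_command(no_inline)
        .output()
        .expect("unable to call perf script")
        .stdout
}
```

The trade-off is that inlined frames no longer show up as separate entries in the resulting flamegraph.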

tonyg · 2021-09-09T12:01:45Z

Hi, I wrote a patch for perf which uses a long-running addr2line process instead of one subprocess per address-to-look-up. It dramatically improves performance of perf, making flamegraphs on large samples with inlining a possibility again. See: https://eighty-twenty.org/2021/09/09/perf-addr2line-speed-improvement
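
The general idea of keeping one child process alive and feeding it addresses over a pipe can be sketched as follows. This is not the perf patch itself (which is written in C); `LineServer` and `addr2line_server` are hypothetical names, and the sketch assumes GNU addr2line reading addresses from stdin with `-e <exe> -f`, which prints two lines per address (function name, then file:line). It deliberately leaves out `-i`, because inline expansion makes replies variable in length and would need some end-of-reply detection that is not shown here.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::process::{Child, ChildStdin, ChildStdout, Command, Stdio};

/// A long-lived child process that answers each input line
/// with a fixed number of output lines.
pub struct LineServer {
    child: Child,
    stdin: ChildStdin,
    stdout: BufReader<ChildStdout>,
    lines_per_reply: usize,
}

impl LineServer {
    /// Spawn `cmd` once, with both ends connected through pipes.
    pub fn spawn(mut cmd: Command, lines_per_reply: usize) -> io::Result<Self> {
        let mut child = cmd.stdin(Stdio::piped()).stdout(Stdio::piped()).spawn()?;
        let stdin = child.stdin.take().expect("stdin was piped");
        let stdout = BufReader::new(child.stdout.take().expect("stdout was piped"));
        Ok(Self { child, stdin, stdout, lines_per_reply })
    }

    /// Send one request line and read back exactly one reply.
    pub fn query(&mut self, request: &str) -> io::Result<Vec<String>> {
        writeln!(self.stdin, "{}", request)?;
        self.stdin.flush()?;
        let mut reply = Vec::with_capacity(self.lines_per_reply);
        for _ in 0..self.lines_per_reply {
            let mut line = String::new();
            if self.stdout.read_line(&mut line)? == 0 {
                return Err(io::Error::new(
                    io::ErrorKind::UnexpectedEof,
                    "child closed its output",
                ));
            }
            reply.push(line.trim_end().to_owned());
        }
        Ok(reply)
    }
}

impl Drop for LineServer {
    fn drop(&mut self) {
        // Stop the child and reap it so no zombie process is left behind.
        let _ = self.child.kill();
        let _ = self.child.wait();
    }
}

/// Assumption: GNU addr2line is on PATH; without `-i` each address
/// produces two lines of output.
pub fn addr2line_server(exe: &str) -> io::Result<LineServer> {
    let mut cmd = Command::new("addr2line");
    cmd.args(["-e", exe, "-f"]);
    LineServer::spawn(cmd, 2)
}
```

With this shape, looking up thousands of addresses costs one process spawn plus one pipe round trip per address, rather than one spawn per address.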

djc · 2021-09-09T12:20:56Z

@tonyg very cool! Did you submit your patch to perf upstream? It seems like the kind of thing they might consider merging.

bjorn3 · 2021-09-09T12:24:36Z

The blog post links to https://lore.kernel.org/linux-perf-users/20210909112202.1947499-1-tonyg@leastfixedpoint.com/ which was submitted about an hour ago.

tonyg · 2021-09-09T12:37:51Z

Thanks @djc! As @bjorn3 said, I've sent it upstream, but it's too soon to say if it'll be acceptable or not. Another possible audience is the Debian maintainers, if the kernel folks won't take it. I'm not sure if it's just Debian that has the slowdown; certainly, other distributions are linking perf against libbfd which doesn't suffer from the problem.

Licenser · 2021-09-09T14:26:02Z

I can confirm that Ubuntu (admittedly based on Debian) also suffers from the perf problem with addr2line, so it is not just (vanilla) Debian.

Geal · 2021-11-03T16:30:58Z

I am trying the solution of @tonyg and it's indeed a lot faster, but it generates flamegraphs that are missing a lot of information, like I can get a graph that only has kernel level traces, or has the application's call stack but does not show function names (debug info is properly generated). I'm on ubuntu, linux 5.13

Any advice on how I could debug this?

tonyg · 2021-11-04T09:19:49Z

> I am trying the solution of @tonyg and it's indeed a lot faster, but it generates flamegraphs that are missing a lot of information, like I can get a graph that only has kernel level traces, or has the application's call stack but does not show function names (debug info is properly generated). I'm on ubuntu, linux 5.13

That's interesting! Does the same binary, run with the unpatched perf, yield better flamegraphs? (My advice, though I'm sure you've already done this, would be to double-check your Cargo.toml settings for profile.bench and profile.release, ensuring that debug = true and strip = false...)

Geal · 2021-11-04T14:10:44Z

I was not setting strip = false, but debug = true was set for the release profile, and I verified with strings that the generated binary had symbols.
And yes, the unpatched perf yielded better flamegraphs.

Geal · 2021-11-08T17:00:40Z

ok, so that was a stupid mistake on my part: libdw and others were not installed, so perf was built without the ability to read the symbols. That was written plainly right at the beginning of make's output 😑

osa1 · 2022-01-13T12:34:33Z

Could anyone update us about the perf patch mentioned above please? Is it merged to upstream? In the linked thread I don't see an email that announces that it's merged so I think it's not?

Currently on my application a recording of 30 seconds takes about an hour to render.

djc · 2022-01-13T14:21:43Z

It does appear in the current Linux tree which is easy to check, here is the commit on GitHub.

Looks like it's available as of 5.16.
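
For scripts that need to decide whether a workaround is required, a version check along these lines may help. It is a sketch under an assumption: that the perf binary in use was built from the same source version as the string being checked (for example the output of `uname -r`), which is typical for distribution packages but not guaranteed. Both function names are hypothetical.

```rust
/// Parse the leading "major.minor" from a bare version string
/// such as "5.15.0-91-generic".
pub fn parse_major_minor(version: &str) -> Option<(u32, u32)> {
    // Split on anything that is not an ASCII digit: "5.15.0-91" -> "5", "15", "0", "91".
    let mut parts = version.trim().split(|c: char| !c.is_ascii_digit());
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    Some((major, minor))
}

/// True if the version is at least 5.16, the release mentioned above.
/// Strings that cannot be parsed are treated as "not fixed".
pub fn includes_addr2line_fix(version: &str) -> bool {
    matches!(parse_major_minor(version), Some(v) if v >= (5, 16))
}
```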

kalradivyanshu · 2023-04-30T21:10:30Z

I spent a lot of time today trying to get this to work. I couldn't upgrade my kernel (for other reasons), so I finally cloned perf, added the patch, and built it from source. It worked: the flamegraph is now created significantly quicker.

Adding instructions for anyone who is also stuck (and definitely also for me when I have to inevitably do this on another server/machine):
(instructions are for ubuntu 22.04 linux kernel 5.15 on arm64)

uninstall existing perf if any:

sudo apt-get remove linux-tools-generic

install perf dev dependencies:

sudo apt-get update
sudo apt-get install flex bison glibc-source libelf-dev libdw-dev libunwind-dev libnewt-dev libgtk2.0-dev binutils-dev libnuma-dev libbabeltrace-ctf-dev libperl-dev python2-dev libiberty-dev zlib1g-dev libzstd-dev libbabeltrace-dev

Now download the Linux kernel source (replace 5.15 with your kernel version, which can be found with uname -r):

wget -c https://mirrors.edge.kernel.org/pub/linux/kernel/v5.x/linux-5.15.tar.gz
tar -xzf linux-5.15.tar.gz
cd linux-5.15/tools/perf

Now replace srcline.c in the util folder (perf/util) with the srcline.c from linux kernel 5.16 that has the patch mentioned above: link

now build perf and install:

make clean; make ARCH=arm64
sudo make install

finally copy perf binary to /usr/bin:

cp /usr/bin/

and now flamegraph should run significantly faster! 🥳

ilya-zlobintsev · 2023-12-29T15:21:59Z

~~I am still experiencing this issue despite being on an up-to-date Arch Linux system (binutils 2.41.0, kernel 6.6). Flamegraphs take 10+ minutes to generate due to slow addr2line calls.~~

Fixed by using https://github.com/gimli-rs/addr2line.

lixin-wei · 2024-01-25T06:09:16Z

https://github.com/gimli-rs/addr2line is awesome! My processing time dropped from 3 min to 10 s after switching to it.

git clone https://github.com/gimli-rs/addr2line
cd addr2line
cargo build --release --examples
sudo cp /usr/bin/addr2line /usr/bin/addr2line-bak
sudo cp target/release/examples/addr2line /usr/bin/addr2line

wez · 2024-05-01T22:10:01Z

Driving by from a more general perf + rust problem and found this thread super helpful!

gimli's addr2line build is a bit different today:

cargo build --release --bin addr2line --features=bin

Rather than replacing the system install, I just update my PATH when invoking e.g. perf, so that it finds this binary first.

PATH=/home/wez/Downloads/addr2line/target/release:$PATH perf report -g --stdio -G
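
The same PATH override can be applied when perf is launched from Rust code rather than from a shell. The sketch below is an illustration only: `prepend_to_path` and `perf_with_addr2line_dir` are hypothetical helpers, and the directory passed in is whichever one holds the replacement addr2line build.

```rust
use std::env;
use std::ffi::OsString;
use std::path::Path;
use std::process::Command;

/// Return a PATH value with `dir` placed in front of the existing entries.
pub fn prepend_to_path(dir: &Path, current: Option<OsString>) -> OsString {
    let mut entries = vec![dir.to_path_buf()];
    if let Some(cur) = current {
        entries.extend(env::split_paths(&cur));
    }
    env::join_paths(entries).expect("PATH entry contains a separator character")
}

/// Build a `perf` command whose PATH lists `dir` first, so that an
/// addr2line binary in `dir` shadows the system one for this process only.
pub fn perf_with_addr2line_dir(dir: &Path) -> Command {
    let mut cmd = Command::new("perf");
    cmd.env("PATH", prepend_to_path(dir, env::var_os("PATH")));
    cmd
}
```

Because only the child's environment is changed, the system addr2line stays untouched for everything else.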

inteon mentioned this issue Feb 24, 2021

Add no-inline option #127

Merged

bors bot closed this as completed in ee462f2 Mar 23, 2021

dforsten mentioned this issue Sep 10, 2022

Performance Profiling DMDcoin/openethereum-3.x#97

Closed

glpuga mentioned this issue Feb 6, 2023

Add profiling instructions Ekumen-OS/beluga#96

Merged

glpuga mentioned this issue May 25, 2023

Replace perf with faster patched built-from-source version Ekumen-OS/beluga#198

Closed

djc mentioned this issue Oct 31, 2023

Very long analysis due to slow addr2line calls #294

Closed

MrCroxx mentioned this issue Aug 22, 2024

ci: install addr2line-rs binary along with flamegraph risingwavelabs/risingwave#18171

Merged

Kobzol mentioned this issue Nov 24, 2024

"x.py check" that does nothing feels like it has gotten slower rust-lang/rust#133162

Closed
