|
| 1 | +# Overview - Linux Trace and Log Capture |
| 2 | + |
| 3 | +This provides a quick start on how to capture logs on Linux. |
| 4 | + |
| 5 | +Logs: |
| 6 | + |
| 7 | +- [LTTng](https://lttng.org) system trace (requires customized image for boot scenario) |
| 8 | +- Perf CPU Sampling |
| 9 | +- [cloud-init.log](https://cloud-init.io/) |
| 10 | + - Automatically logged by cloud-init to /var/log/cloud-init.log |
| 11 | +- [dmesg.iso.log](https://en.wikipedia.org/wiki/Dmesg) |
| 12 | + - Standard auto dmesg log doesn't output in absolute time needed for correlation with other logs |
| 13 | + - dmesg --time-format iso > dmesg.iso.log |
| 14 | +- [waagent.log](https://github.com/Azure/WALinuxAgent) |
| 15 | + - Automatically logged by WaLinuxAgent to /var/log/waagent.log |
| 16 | + - LogLevel can be turned more verbose in custom image |
| 17 | + - /etc/waagent.conf |
| 18 | + - Logs.Verbose=y |
| 19 | + |
| 20 | +# LTTng |
| 21 | +[LTTng](https://lttng.org) (Kernel CPU scheduling, Processes, Threads, Block IO/Disk, Syscalls, File events, etc) |
| 22 | + |
| 23 | +[LTTng Docs](https://lttng.org/docs/v2.10/) [LTTng](https://lttng.org/) is an open source tracing framework for Linux. Installation instructions for your Linux distro can be found in the docs. |
| 24 | + |
| 25 | +Supports: |
| 26 | +- Threads and Processes |
| 27 | +- Context Switches / CPU Usage |
| 28 | +- Syscalls |
| 29 | +- File related events |
| 30 | +- Block IO / Disk Activity |
| 31 | +- Diagnostic Messages |
| 32 | + |
| 33 | +Once you have everything set up you just need to decide what kind of information you are looking for and begin tracing. |
| 34 | + |
| 35 | +In this example we are looking at process scheduler events. We might use this to determine process lifetime and identify dependencies. You can learn more about what kind of "events" you can enable [here](https://lttng.org/man/1/lttng-enable-event/v2.8/). |
| 36 | +```bash |
| 37 | + root@xenial:~/tracing# lttng list --kernel # Gives a list of all Kernel events you can trace |
| 38 | + root@xenial:~/tracing# lttng list --kernel --syscall # Gives a list of all traceable Linux system calls |
| 39 | + |
| 40 | + root@xenial:~/tracing# lttng create my-kernel-session --output=/tmp/my-kernel-trace |
| 41 | + root@xenial:~/tracing# lttng enable-event --kernel sched_process* |
| 42 | + root@xenial:~/tracing# lttng start |
| 43 | + root@xenial:~/tracing# lttng stop |
| 44 | + root@xenial:~/tracing# lttng destroy |
| 45 | +``` |
| 46 | + |
| 47 | +## Recommended LTTng Tracing |
| 48 | + |
| 49 | +### Install the tracing software: |
| 50 | +Example on Ubuntu: |
| 51 | +```bash |
| 52 | +$ sudo apt-get install lttng-tools lttng-modules-dkms liblttng-ust-dev |
| 53 | +``` |
| 54 | +For more examples see [LTTng Download docs](https://lttng.org/download/) |
| 55 | + |
| 56 | +### Create a session: |
| 57 | +```bash |
| 58 | +$ sudo lttng create my-kernel-session --output=lttng-kernel-trace |
| 59 | +``` |
| 60 | + |
| 61 | +### Add the desired events to be recorded: |
| 62 | +```bash |
| 63 | +$ sudo lttng enable-event --kernel block_rq_complete,block_rq_insert,block_rq_issue,printk_console,sched_wak*,sched_switch,sched_process_fork,sched_process_exit,sched_process_exec,lttng_statedump* |
| 64 | +$ sudo lttng enable-event --kernel --syscall –-all |
| 65 | +``` |
| 66 | + |
| 67 | +### Add context fields to the channel: |
| 68 | +```bash |
| 69 | +$ sudo lttng add-context --kernel --channel=channel0 --type=tid |
| 70 | +$ sudo lttng add-context --kernel --channel=channel0 --type=pid |
| 71 | +$ sudo lttng add-context --kernel --channel=channel0 --type=procname |
| 72 | +``` |
| 73 | + |
| 74 | +### Start the recording: |
| 75 | +```bash |
| 76 | +$ sudo lttng start |
| 77 | +``` |
| 78 | + |
| 79 | +### Save the session: |
| 80 | +```bash |
| 81 | +$ sudo lttng regenerate statedump <- Better correlation / info in Microsoft-Performance-Tools-Linux |
| 82 | +$ sudo lttng stop |
| 83 | +$ sudo lttng destroy |
| 84 | +``` |
| 85 | + |
| 86 | +# Perf |
| 87 | +Perf is used to collect CPU Sampling (cpu-clock) events as LTTng doesn't support capturing these yet. Note: Stacks may require symbol setup |
| 88 | + |
| 89 | +[perf](https://perf.wiki.kernel.org/) CPU Sampling(cpu-clock) |
| 90 | + |
| 91 | +If you want to trace .NET Core then you need [perfcollect](http://aka.ms/perfcollect) which capture CPU sampling and more |
| 92 | + |
| 93 | +## Perf Install |
| 94 | +```bash |
| 95 | +$ sudo apt-get install linux-tools-common |
| 96 | +``` |
| 97 | + |
| 98 | +## User-Mode (UM) Symbols Install |
| 99 | +KM symbols are automatically resolved. If you wish to resolve UM cpu sample functions and stacks, you may need to install debug packages for the binary you are profiling |
| 100 | + |
| 101 | +For example, [Debug Symbol Packages on Ubuntu](https://wiki.ubuntu.com/Debug%20Symbol%20Packages) |
| 102 | + |
| 103 | +## Record a trace |
| 104 | +```bash |
| 105 | +$ sudo /usr/bin/perf record -g -a -F 999 -e cpu-clock,sched:sched_stat_sleep,sched:sched_switch,sched:sched_process_exit -o perf_cpu.data |
| 106 | +``` |
| 107 | + |
| 108 | +## Stop the Trace |
| 109 | +```bash |
| 110 | +$ Ctrl-C |
| 111 | +``` |
| 112 | + |
| 113 | +## Convert trace to text format |
| 114 | +This is to useful along-side the CTF trace to resolve UM IP/Symbols. Similar to what [perfcollect](https://raw.githubusercontent.com/microsoft/perfview/master/src/perfcollect/perfcollect) uses |
| 115 | + |
| 116 | +```bash |
| 117 | +$ sudo perf inject -v -s -i perf_cpu.data -o perf.data.merged |
| 118 | + |
| 119 | +# There is a breaking change where the capitalization of the -f parameter changed. |
| 120 | +$ sudo perf script -i perf.data.merged -F comm,pid,tid,cpu,time,period,event,ip,sym,dso,trace > perf.data.txt |
| 121 | + |
| 122 | +if [ $? -ne 0 ] |
| 123 | +then |
| 124 | + $ sudo perf script -i perf.data.merged -f comm,pid,tid,cpu,time,period,event,ip,sym,dso,trace > perf.data.txt |
| 125 | +fi |
| 126 | + |
| 127 | +# If the dump file is zero length, try to collect without the period field, which was added recently. |
| 128 | +if [ ! -s perf.data.txt ] |
| 129 | +then |
| 130 | + $ sudo perf script -i perf.data.merged -f comm,pid,tid,cpu,time,event,ip,sym,dso,trace > perf.data.txt |
| 131 | +fi |
| 132 | +``` |
| 133 | + |
| 134 | +## Capture trace timestamp start |
| 135 | +Perf.data.txt only contains relative timestamps. If you want correct absolute timestamps in UI then you will need to know the trace start time. |
| 136 | + |
| 137 | +```bash |
| 138 | +$ sudo perf report --header-only -i perf_cpu.data | grep "captured on" |
| 139 | +``` |
| 140 | + |
| 141 | +Place the "captured on" timestamp for example "Thu Oct 17 15:37:36 2019" in a timestamp.txt file next to the trace folder. The timestamp will be interpreted as UTC |
| 142 | + |
| 143 | +### Convert to CTF (Optional) (requires CTF enabled perf) |
| 144 | +We have optional support for perf CTF conversion. It it currently NOT RECOMMENDED to use this though as you get less features (like KM/UM stacks) than perf.data.txt support which resolves callstacks on the box. |
| 145 | +This only supports KM symbols (for now) supplied by kallsyms. Microsoft-Linux-Perf-Tools support for the perf CTF trace is experimental given lack of UM symbols |
| 146 | + |
| 147 | +```bash |
| 148 | +$ perf data convert -i perf_cpu.data --all --to-ctf ./perf_cpu.data-ctf |
| 149 | +``` |
| 150 | + |
| 151 | +You will need the perf file in converted CTF format which you can do with a custom compiled perf (unless some distro compiled the support in). [Custom build instructions here](https://stackoverflow.com/questions/43576997/building-perf-with-babeltrace-for-perf-to-ctf-conversion) |
| 152 | + |
| 153 | +## Save Kernel Symbols (Optional) (for use with CTF enabled perf) |
| 154 | +This is only needed for a perf CTF trace |
| 155 | + |
| 156 | +```bash |
| 157 | +$ sudo cat /proc/kallsyms > kallsyms |
| 158 | +``` |
| 159 | + |
| 160 | +# Transferring the files to Windows UI (optional) |
| 161 | +You then need to transfer the perf files to a Windows box where WPA runs. The most important file is perf.data.txt |
| 162 | + |
| 163 | +```bash |
| 164 | +$ sudo chmod 777 -R perf* |
| 165 | +``` |
| 166 | + |
| 167 | +- Copy files from Linux to Windows box with WinSCP/SCP OR |
| 168 | +```bash |
| 169 | +$ tar -czvf perf_cpu-ctf.tar.gz perf* |
| 170 | +``` |
| 171 | +- (Optional if you want absolute timestamps) Place timestamp.txt next to perf.data.txt |
| 172 | +- Open perf.data.txt with WPA |
| 173 | +- For the perf CTF file (optional) On Windows, Zip the folder up and rename to .ctf extension. E.g. perf_cpu-ctf.ctf (which is really a .zip file) |
| 174 | + - CTF (Optional) Kallsyms needs to be on your Desktop |
| 175 | + |
| 176 | + |
| 177 | +# Presentations |
| 178 | + |
| 179 | +If you want to see a demo or get more in-depth info on using these tools check out a talk given at the [Linux Tracing Summit](https://www.tracingsummit.org/ts/2019/): |
| 180 | +>Linux & Windows Perf Analysis using WPA, ([slides](https://www.tracingsummit.org/ts/2019/files/Tracingsummit2019-wpa-berg-gibeau.pdf)) ([video](https://youtu.be/HUbVaIi-aaw)) |
0 commit comments