Update -with-profile and add some profiling documentation #7601

Merged 3 commits on Mar 30, 2021
2 changes: 1 addition & 1 deletion configure.ac
@@ -1993,7 +1993,7 @@ if test "x${ac_cv_member_struct_tcp_info_tcpi_data_segs_out}" = "xyes"; then
fi

if test "x${with_profiler}" = "xyes"; then
- AC_CHECK_HEADERS([google/profiler.h \
+ AC_CHECK_HEADERS([gperftools/profiler.h \
], [], [])
fi

33 changes: 33 additions & 0 deletions doc/developer-guide/debugging/profiling.en.rst
@@ -22,3 +22,36 @@
Profiling
*********

There are two main options for performance profiling: perf and gperftools.

perf
====

The perf top command is useful for quickly identifying functions that take a larger than expected
portion of the execution time::

    sudo perf top -p `pidof traffic_server`

For more detail, use the record subcommand to gather profiling data on the traffic_server process. The
-g option gathers call stack information. Compiling with -ggdb and -fno-omit-frame-pointer
makes it more likely that perf record will gather complete call stacks::

    sudo perf record -g -p `pidof traffic_server`
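
As a rough sketch of how the compiler flags mentioned above might be supplied, the snippet below appends them to CFLAGS and CXXFLAGS before configure is run. Passing flags through the environment is standard autoconf behaviour. The DEBUG_FLAGS variable name is only illustrative, and the configure line is left commented out as a placeholder for your usual invocation.

```shell
# Append the debug and frame-pointer flags to whatever is already set.
DEBUG_FLAGS="-ggdb -fno-omit-frame-pointer"   # illustrative variable name
export CFLAGS="${CFLAGS:-} ${DEBUG_FLAGS}"
export CXXFLAGS="${CXXFLAGS:-} ${DEBUG_FLAGS}"

# Then run configure from the source tree with your usual options:
# ./configure
```

Appending rather than overwriting keeps any flags that were already exported in the environment.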

After gathering profiling data with perf record, use perf report to display the call stacks with their corresponding
contribution to total execution time::

    sudo perf report
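
The record and report steps can be combined into a small helper. This is a minimal sketch, not part of Traffic Server: the function name profile_ts, the 30 second default, and the 40 line cut-off are all assumptions made for illustration. It relies on perf record accepting a trailing command (here sleep) to bound the recording time, and on perf report --stdio for plain-text output.

```shell
# Hypothetical helper: profile traffic_server for a fixed number of seconds.
profile_ts() {
    seconds="${1:-30}"   # 30 is an arbitrary default
    pid=$(pidof traffic_server) || {
        echo "traffic_server is not running" >&2
        return 1
    }
    # "-- sleep N" makes perf record stop after N seconds.
    sudo perf record -g -p "$pid" -- sleep "$seconds" || return 1
    # --stdio prints the report as plain text instead of opening the TUI.
    sudo perf report --stdio | head -n 40
}
```

Calling profile_ts 10 would then record for roughly ten seconds and print the top of the report. The helper returns early with an error if no traffic_server process is found.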

gperftools
==========

gperftools also provides libraries to statistically sample the call stacks of a process. The --with-profiler=yes option for configure
links with the gperftools profiling library and adds profile stop and profile dump function calls at the beginning and end of the traffic_server
main function. The profiling data is dumped to /tmp/ts.prof.

Once the profiling data file is present, use the pprof tool to generate a PDF call graph of the data and see which
call stacks contribute most to the execution time::

    pprof --pdf /opt/trafficserver/9.0/bin/traffic_server /tmp/ts.prof > prof.pdf
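
When a PDF viewer is not handy, pprof can also print a flat text summary. The helper below is a sketch under assumptions: the function name ts_pprof_text and the 20 line cut-off are made up for illustration, and the default profile path is the /tmp/ts.prof location mentioned above. On some distributions the gperftools pprof script is installed as google-pprof, so the command name may need adjusting.

```shell
# Hypothetical helper: print the top of a text profile for a binary.
ts_pprof_text() {
    binary="$1"
    profile="${2:-/tmp/ts.prof}"
    # Bail out early if the profile is missing or empty.
    [ -s "$profile" ] || {
        echo "no profiling data at $profile" >&2
        return 1
    }
    # --text lists functions by sample count.
    pprof --text "$binary" "$profile" | head -n 20
}
```

For example, ts_pprof_text /opt/trafficserver/9.0/bin/traffic_server reads the default profile and lists the functions with the most samples.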