Build Profiling, main branch (2024.09.22.) #713

krasznaa · 2024-09-22T10:07:30Z

As we've been discussing between a few of us, the build of the project is getting a bit out of hand by now. 😦 Building generally takes a long time, and also a surprisingly large amount of memory.

To help with this, we first need to understand exactly which steps of the build take the longest and use the most memory. To do this, I hijacked the CTEST_USE_LAUNCHERS feature of ctest, which is the same technique we use in AtlasCMake for saving "package specific" build logs in ATLAS offline builds.

What happens is that if one specifies -DCTEST_USE_LAUNCHERS=TRUE in the CMake configuration command, the newly introduced traccc-ctest.sh script gets set up to intercept each and every build command, including all linking and any other technical commands. The script then runs the commands through GNU time to get detailed statistics about every build command. It saves the results into a file called traccc_build_performance.log in the build directory, which has entries like:

        Command being timed: "/home/krasznaa/software/kitware/cmake-3.30.2/x86_64-ubuntu2204-gcc11-opt/bin/ctest --launch --target-name Vc --build-dir /data/ssd-1tb/projects/traccc/build/_deps/vc-build --output /data/ssd-1tb/projects/traccc/build/_deps/vc-build/trigonometric_SSSE3.cpp -- /home/krasznaa/software/kitware/cmake-3.30.2/x86_64-ubuntu2204-gcc11-opt/bin/cmake -E copy src/trigonometric.cpp /data/ssd-1tb/projects/traccc/build/_deps/vc-build/trigonometric_SSSE3.cpp"
        User time (seconds): 0.00
        System time (seconds): 0.01
        Percent of CPU this job got: 90%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.01
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 9728
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 963
        Voluntary context switches: 6
        Involuntary context switches: 4
        Swaps: 0
        File system inputs: 0
        File system outputs: 48
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

The log goes all the way up to some of the really "heavy" commands, like:

        Command being timed: "/home/krasznaa/software/kitware/cmake-3.30.2/x86_64-ubuntu2204-gcc11-opt/bin/ctest --launch --target-name traccc_test_cuda --build-dir /data/ssd-1tb/projects/traccc/build/tests/cuda --output CMakeFiles/traccc_test_cuda.dir/test_ckf_combinatorics_telescope.cpp.o --source /data/ssd-1tb/projects/traccc/traccc/tests/cuda/test_ckf_combinatorics_telescope.cpp --language CXX -- /usr/bin/c++ -DACTS_CONCEPTS_SUPPORTED -DALGEBRA_PLUGINS_INCLUDE_ARRAY -DBOOST_ALL_NO_LIB -DCOVFIE_QUIET -DDETRAY_ALGEBRA_ARRAY -DDETRAY_ALGEBRA_EIGEN -DDETRAY_ALGEBRA_VC -DDETRAY_CUSTOM_SCALARTYPE=float -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CUDA -DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_CPP -DTRACCC_CUSTOM_SCALARTYPE=float -DVECMEM_DEBUG_MSG_LVL=0 -DVECMEM_HAVE_PMR_MEMORY_RESOURCE -DVECMEM_SOURCE_DIR_LENGTH=37 -DVECMEM_SUPPORT_POSIX_ATOMIC_REF -I/data/ssd-1tb/projects/traccc/build/_deps/cccl-src/thrust/thrust/cmake/../.. -I/data/ssd-1tb/projects/traccc/build/_deps/cccl-src/libcudacxx/lib/cmake/libcudacxx/../../../include -I/data/ssd-1tb/projects/traccc/build/_deps/cccl-src/cub/cub/cmake/../.. 
-I/data/ssd-1tb/projects/traccc/build/_deps/vecmem-build/cuda/CMakeFiles -I/data/ssd-1tb/projects/traccc/build/_deps/vecmem-src/cuda/include -I/data/ssd-1tb/projects/traccc/build/_deps/vecmem-build/core/CMakeFiles -I/data/ssd-1tb/projects/traccc/build/_deps/vecmem-src/core/include -I/data/ssd-1tb/projects/traccc/build/_deps/dfelibs-src -I/data/ssd-1tb/projects/traccc/traccc/core/include -I/data/ssd-1tb/projects/traccc/traccc/plugins/algebra/array/include -I/data/ssd-1tb/projects/traccc/build/_deps/algebraplugins-src/frontend/array_cmath/include -I/data/ssd-1tb/projects/traccc/build/_deps/algebraplugins-src/common/include -I/data/ssd-1tb/projects/traccc/build/_deps/algebraplugins-src/storage/array/include -I/data/ssd-1tb/projects/traccc/build/_deps/algebraplugins-src/math/cmath/include -I/data/ssd-1tb/projects/traccc/build/_deps/algebraplugins-src/math/common/include -I/data/ssd-1tb/projects/traccc/traccc/plugins/algebra/vecmem/include -I/data/ssd-1tb/projects/traccc/build/_deps/algebraplugins-src/frontend/vecmem_cmath/include -I/data/ssd-1tb/projects/traccc/build/_deps/algebraplugins-src/storage/vecmem/include -I/data/ssd-1tb/projects/traccc/traccc/plugins/algebra/eigen/include -I/data/ssd-1tb/projects/traccc/build/_deps/algebraplugins-src/frontend/eigen_eigen/include -I/data/ssd-1tb/projects/traccc/build/_deps/algebraplugins-src/storage/eigen/include -I/data/ssd-1tb/projects/traccc/build/_deps/algebraplugins-src/math/eigen/include -I/data/ssd-1tb/projects/traccc/traccc/plugins/algebra/vc/include -I/data/ssd-1tb/projects/traccc/build/_deps/algebraplugins-src/frontend/vc_vc/include -I/data/ssd-1tb/projects/traccc/build/_deps/algebraplugins-src/storage/vc/include -I/data/ssd-1tb/projects/traccc/build/_deps/algebraplugins-src/math/vc/include -I/data/ssd-1tb/projects/traccc/build/_deps/algebraplugins-src/frontend/vc_cmath/include -I/data/ssd-1tb/projects/traccc/traccc/device/common/include -I/data/ssd-1tb/projects/traccc/traccc/device/cuda/include 
-I/data/ssd-1tb/projects/traccc/traccc/performance/include -I/data/ssd-1tb/projects/traccc/traccc/io/include -I/data/ssd-1tb/projects/traccc/traccc/simulation/include -I/data/ssd-1tb/projects/traccc/traccc/tests/common -isystem /home/krasznaa/software/nvidia/cuda-12.6.1/x86_64/targets/x86_64-linux/include -isystem /data/ssd-1tb/projects/traccc/build/_deps/googletest-src/googletest/include -isystem /data/ssd-1tb/projects/traccc/build/_deps/googletest-src/googletest -isystem /data/ssd-1tb/projects/traccc/build/_deps/detray-src/core/include -isystem /data/ssd-1tb/projects/traccc/build/_deps/detray-build/core/CMakeFiles -isystem /data/ssd-1tb/projects/traccc/build/_deps/detray-src/io/include -isystem /data/ssd-1tb/projects/traccc/build/_deps/nlohmann_json-src/include -isystem /data/ssd-1tb/projects/traccc/build/_deps/covfie-src/lib/core -isystem /data/ssd-1tb/projects/traccc/build/_deps/detray-src/tests/include -isystem /data/ssd-1tb/projects/traccc/build/_deps/detray-src/detectors/include -isystem /data/ssd-1tb/projects/traccc/build/_deps/detray-src/plugins/svgtools/include -isystem /data/ssd-1tb/projects/traccc/build/_deps/actsvg-src/core/include -isystem /data/ssd-1tb/projects/traccc/build/_deps/actsvg-src/meta/include -isystem /data/ssd-1tb/projects/traccc/build/_deps/eigen3-src -isystem /data/ssd-1tb/projects/traccc/build/_deps/detray-src/plugins/algebra/array/include -isystem /data/ssd-1tb/projects/traccc/build/_deps/detray-src/plugins/algebra/eigen/include -isystem /data/ssd-1tb/projects/traccc/build/_deps/vc-src -isystem /data/ssd-1tb/projects/traccc/build/_deps/detray-src/plugins/algebra/vc/include -isystem /data/ssd-1tb/projects/traccc/build/_deps/acts-src/Core/include -isystem /data/ssd-1tb/projects/traccc/build/_deps/acts-build/Core -Wall -Wextra -Wshadow -Wunused-local-typedefs -pedantic -Wold-style-cast -Wfloat-conversion -Werror -O3 -g -std=c++20 -MD -MT tests/cuda/CMakeFiles/traccc_test_cuda.dir/test_ckf_combinatorics_telescope.cpp.o -MF 
CMakeFiles/traccc_test_cuda.dir/test_ckf_combinatorics_telescope.cpp.o.d -o CMakeFiles/traccc_test_cuda.dir/test_ckf_combinatorics_telescope.cpp.o -c /data/ssd-1tb/projects/traccc/traccc/tests/cuda/test_ckf_combinatorics_telescope.cpp"
        User time (seconds): 51.10
        System time (seconds): 3.26
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:54.37
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 2780640
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 973157
        Voluntary context switches: 11
        Involuntary context switches: 97
        Swaps: 0
        File system inputs: 0
        File system outputs: 478944
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
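
For scale, the figures in this record can be converted into more familiar units. This is a small sketch, assuming that GNU time's "kbytes" are units of 1024 bytes (as on Linux); the variable names are only for illustration:

```python
# Values copied from the record above.
max_rss_kb = 2780640   # Maximum resident set size (kbytes)
user_s = 51.10         # User time (seconds)
system_s = 3.26        # System time (seconds)

# Peak memory in GiB, assuming 1 kbyte = 1024 bytes.
max_rss_gib = max_rss_kb / 1024**2
# Total CPU time spent on this one command.
cpu_s = user_s + system_s

print(f"peak memory: {max_rss_gib:.2f} GiB")   # peak memory: 2.65 GiB
print(f"CPU time:    {cpu_s:.2f} s")           # CPU time:    54.36 s
```

So this single translation unit needs roughly 2.65 GiB of memory and close to a minute on one core. Several such compilations running in parallel would multiply the memory footprint accordingly.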

This PR just writes such a log file; we will still need to write a bit of scripting to extract useful information from it. I imagine that it would be relatively easy to produce, let's say, a CSV file from this log file with the info we're interested in, and then we could use even something as simple as a spreadsheet application to understand where we need to concentrate our efforts. 🤔
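
A minimal sketch of what such a script could look like, in Python. It is not part of this PR. It assumes that every record starts with a "Command being timed:" line that holds the whole command on a single line, and it only keeps a handful of fields. The function and column names (`parse_log`, `write_csv`, `user_s`, ...) are made up for illustration.

```python
import csv
import re

# GNU time labels of interest, mapped to (CSV column, converter).
def elapsed_to_seconds(text):
    """Convert 'h:mm:ss' or 'm:ss.ss' into seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60.0 + float(part)
    return seconds

FIELDS = {
    "User time (seconds)": ("user_s", float),
    "System time (seconds)": ("system_s", float),
    "Elapsed (wall clock) time (h:mm:ss or m:ss)": ("elapsed_s", elapsed_to_seconds),
    "Maximum resident set size (kbytes)": ("max_rss_kb", int),
    "Exit status": ("exit_status", int),
}
COLUMNS = ["target", "output"] + [column for column, _ in FIELDS.values()]

def _option(command, name):
    """Return the value following a 'ctest --launch' option, or ''."""
    match = re.search(re.escape(name) + r"\s+(\S+)", command)
    return match.group(1) if match else ""

def parse_log(lines):
    """Return one dict per 'Command being timed' record."""
    records = []
    record = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("Command being timed:"):
            record = {
                "target": _option(line, "--target-name"),
                "output": _option(line, "--output"),
            }
            records.append(record)
            continue
        if record is None:
            continue
        # Split on the last ': ', since some labels contain colons.
        label, sep, value = line.rpartition(": ")
        if sep and label in FIELDS:
            column, convert = FIELDS[label]
            record[column] = convert(value)
    return records

def write_csv(records, stream):
    """Write the records as CSV, leaving missing fields empty."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(records)
```

With the log opened as `log`, `write_csv(parse_log(log), sys.stdout)` would print a table that can be sorted by `elapsed_s` or `max_rss_kb` in a spreadsheet.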

paulgessinger · 2024-09-23T07:22:53Z

@krasznaa just to point it out again: I literally wrote a tool that does this exact thing for you given a compilation database. It produces structured printouts and gives you a nice CSV.

I would encourage you to at least give it a try before reimplementing it.

krasznaa · 2024-09-23T07:26:12Z

I did have a look, Paul. But that one still doesn't do any linking steps, does it? And unfortunately our build can be expensive during linking as well.

As you can see, this PR is literally just a couple of lines long. So collecting the information is really not the complicated part here. But if we can re-use your code's logic for organizing the information, that would be very useful indeed.
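
To illustrate how the non-compilation steps could be told apart in the log, here is a rough Python sketch. It relies only on what the two example records show: the compilation command was launched with `--source` and `--language`, while the file copy was not. That link steps also come without `--source` is an assumption, and the function names are hypothetical.

```python
import shlex

def launcher_options(command):
    """Collect the 'ctest --launch' options that precede the '--' separator."""
    args = shlex.split(command)
    if "--" in args:
        args = args[: args.index("--")]
    options = {}
    # Pair every '--name' with the value that follows it.
    for name, value in zip(args, args[1:]):
        if name.startswith("--") and not value.startswith("--"):
            options[name] = value
    return options

def classify(command):
    """Return 'compile' for commands with a source file, 'other' otherwise."""
    return "compile" if "--source" in launcher_options(command) else "other"
```

Summing the elapsed time and peak memory separately for the two groups would show how much of the build cost falls outside of pure compilation.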

When using CTEST_USE_LAUNCHERS=TRUE, the build now sends each command through a specific script, which profiles those commands using the "time" executable and saves all collected output into a file called "traccc_build_performance.log" in the build directory.

sonarqubecloud · 2024-10-22T08:50:18Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

krasznaa added the "build" label (This relates to the build system) Sep 22, 2024

krasznaa requested review from paulgessinger and stephenswat September 22, 2024 10:07

krasznaa force-pushed the BuildProfiling-main-20240922 branch from e6f5acb to de1473c September 22, 2024 10:09

krasznaa force-pushed the BuildProfiling-main-20240922 branch from de1473c to 1d6f7b8 October 22, 2024 07:45

stephenswat reviewed Oct 22, 2024

cmake/traccc-ctest.sh.in (outdated, resolved)

Made sure that the shell's built-in time command can't be used.

579317c

Co-authored-by: Stephen Nicholas Swatman <stephenswat@gmail.com>

stephenswat approved these changes Oct 22, 2024

stephenswat enabled auto-merge (squash) October 22, 2024 08:05

stephenswat disabled auto-merge October 22, 2024 08:11

stephenswat requested changes Oct 22, 2024

cmake/traccc-ctest.sh.in (outdated, resolved)

krasznaa mentioned this pull request Oct 22, 2024

De-Template Track Finding, main branch (2024.10.01.) #722

Merged

stephenswat approved these changes Oct 22, 2024

stephenswat enabled auto-merge (squash) October 22, 2024 08:46

Added macOS support for the build profiling.

1ac8524

Also added a warning in case somebody tries to use the build profiling on something other than Linux or macOS.
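
The platform handling described in these commits can be sketched as follows. This is a Python illustration, not the contents of traccc-ctest.sh.in. It assumes that the external `time` executable is found on the PATH (a lookup that can never return a shell built-in), that GNU time is called with `-v` on Linux and BSD time with `-l` on macOS, and that `-a -o` is used to append to the log. The function name and the warning text are hypothetical.

```python
import platform
import shutil
import warnings

# Verbose flag of the external time executable on each supported platform.
VERBOSE_FLAG = {"Linux": "-v", "Darwin": "-l"}

def timed_command(command, log_file, system=None, time_exe=None):
    """Prefix 'command' (a list of arguments) with a profiling 'time' call."""
    system = system or platform.system()
    # shutil.which() only finds executables on the PATH, not shell built-ins.
    time_exe = time_exe or shutil.which("time")
    flag = VERBOSE_FLAG.get(system)
    if time_exe is None or flag is None:
        warnings.warn("Build profiling is not supported here; "
                      "running the command unprofiled.")
        return list(command)
    # -a appends to the log file named by -o instead of overwriting it.
    return [time_exe, flag, "-a", "-o", log_file, *command]
```

The returned list could then be handed to `subprocess.run()`, so that the build still succeeds, just without profiling, on an unsupported platform.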

krasznaa force-pushed the BuildProfiling-main-20240922 branch from 1c9f666 to 1ac8524 October 22, 2024 08:49

stephenswat merged commit 27fe150 into acts-project:main Oct 22, 2024
23 checks passed

krasznaa deleted the BuildProfiling-main-20240922 branch October 22, 2024 09:32
