C++ Library Include Times (CPP-LIT 🔥)

This repo answers the question: how much time is added to my compile time by a single inclusion of <header>? Featuring all C++ Standard Library headers, their C++20 module versions, windows.h and a couple of other third party libraries. All times are for Visual Studio 2019 16.11.0 Preview 2.0. The red entries are new C++20 headers.

Interpretation & notes

The numbers are measured with care, but are easy to misinterpret. Note:

This analysis was done by including the headers into .cpp files. That accurately measures inclusion time, but is only a lower limit for actual compile times. The measurement code doesn't do anything with those headers. The cost of heavy template usage for example might outweigh the cost of merely including a header.
Headers can themselves include other headers. Since each header is at most included once (#pragma once) per TU, the include time of any header depends on whether any of its sub-includes have been included before. The worst case is when no other headers were included before and the best case is when all sub-headers were already included. This analysis compares zero includes vs the include of any header - so it's the worst case. This is a somewhat realistic scenario for sparse use of the standard library. Not so much when there's already a lenghty include list.
When compiling a project with multiple translation units and with /MP enabled, MSVC compiles TUs in parallel. In such cases, the include times above can be misleading. Example: the time for one <regex> is around half a second for one thread. During that half second, the other n-1 cores can compile other translation units, reducing the actual added include time to about t/n, with t being the include time as listed and n being the number of threads compiling.
Especially with some of the third party libraries, the situation is more complex than can be summarized with a single value. Some libraries offer different versions with better compile performance, like spdlog, which explicitly recommends against the use of the single-header version which was used here. Others like GLM are modularized: I used the heavy glm.hpp - using a smaller subset will be faster. This was done for simplicity and not to portray any of those libraries in a bad light.
There is no use of PCH, header units or any form of caching. The tests were done on a fast SSD and a Ryzen 3800X. All tests were done with a warmup phase and on an otherwise idle system, so real-world numbers will probably come down higher than this.
The standard library in module form was so fast to include that it can barely be measured or seen, at least in comparison with the rest. That's not a mistake.

Libraries

windows.h [LAM] refers to the common

#define WIN32_LEAN_AND_MEAN
#include <windows.h>

Tracy v0.7.8, obviously with the TRACY_ENABLE define.
spdlog v1.8.5 using header-only version with only spdlog.h included. Note that the readme recommends to use the static lib version instead for faster compile times.
{fmt} v7.1.3 including only fmt/core.h.
JSON for Modern C++ v3.9.1. Note that this is split into the main header (nl_json - json.hpp) and the forward include header (nl_json_fwd - json_fwd.hpp). The latter is what you would include more often.
GLM v0.9.9.8. GLM is modular, this repo measures the include of glm.hpp - which might be more than what would typically include.
vulkan.h and vulkan.hpp (not to be confused!) from Vulkan SDK v1.2.162.1.
All boost libraries are v1.76.0. Note that Boost.JSON is being measured in its header-only mode.
stb headers are from 2020-09-14.
EnTT v3.5.0.

Methodology

All reported times are based on release builds. The complete compile command without includes is:

cl.exe /O2 /GL /Oi /MD /D NDEBUG /std:c++latest /experimental:module /EHsc /nologo <sources> /link /MACHINE:X64 /LTCG:incremental

To measure the cost of an include, the difference was taken between a build with that include and one without. All measurements were taken after a warm-up phase and with lots of data points. The plotted errors are the standard deviations of those numbers. The results are computed as a difference. Error propagation tells us the resulting standard deviation is:

σ = sqrt(σ_A² + σ_B²)

The sources being compiled consist of 10 identical translation units with the resulting time being divided by 10 to get the individual cost.

Name		Name	Last commit message	Last commit date
Latest commit History 54 Commits
build_project		build_project
.gitignore		.gitignore
install_vcpkg_dependencies.bat		install_vcpkg_dependencies.bat
plot.py		plot.py
readme.md		readme.md
run_numbers.ps1		run_numbers.ps1

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

C++ Library Include Times (CPP-LIT 🔥)

Interpretation & notes

Libraries

Methodology

About

Languages

s9w/cpp-lit

Folders and files

Latest commit

History

Repository files navigation

C++ Library Include Times (CPP-LIT 🔥)

Interpretation & notes

Libraries

Methodology

About

Topics

Resources

Stars

Watchers

Forks

Languages