-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fmt::string_view ctor under Clang (in pre c++17) much slower than of gcc #914
Comments
The problem is that |
I tried the below and it fixes the issue: FMT_CONSTEXPR size_t length(const char *s)
{
#if FMT_GCC_VERSION
return std::strlen(s);
#elif defined(__clang__) && __has_builtin(__builtin_strlen)
return __builtin_strlen(s);
#else
return length<char>(s); //fallback
#endif
} I can issue a PR if it's ok. |
Confirmed. Same performance for clang and g++ now. |
@gabime 180 -> 1 ns seems like excessively big difference though. Out of curiosity, could you give the details about your setup: clang version, what optimization level you used and test code? |
Sure. Here is the test: #include "benchmark/benchmark.h" //google benchmark
#include "fmt/core.h"
void bench_sv(benchmark::State& state)
{
const char *msg = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum pharetra metus cursus "
"lacus placerat congue. Nulla egestas, mauris a tincidunt tempus, enim lectus volutpat mi, eu consequat sem "
"libero nec massa. In dapibus ipsum a diam rhoncus gravida. Etiam non dapibus eros. Donec fringilla dui sed "
"augue pretium, nec scelerisque est maximus. Nullam convallis, sem nec blandit maximus, nisi turpis ornare "
"nisl, sit amet volutpat neque massa eu odio. Maecenas malesuada quam ex, posuere congue nibh turpis duis.";
for (auto _ : state)
{
fmt::string_view sv(msg);
benchmark::DoNotOptimize(sv);
}
}
BENCHMARK(bench_sv);
BENCHMARK_MAIN(); Compiled with:
|
Version: fmt 5.2.1, Clang 6.0
Summary
Constructing
fmt::string_view
withconst char*
of size 515 bytes takes about ~180 ns under Clang, while under gcc it takes less than 1ns.Cause
The cause is that https://github.com/fmtlib/fmt/blob/master/include/fmt/core.h@L217
is only enabled for gcc (why?) so clang ends up using the much slower implementation (https://github.com/fmtlib/fmt/blob/master/include/fmt/core.h@L210)
Just to be sure I tried to enable for clang as well, and performance went from 180 to 1ns:
The text was updated successfully, but these errors were encountered: