
Building with clang's libc++ fails #569

Closed

adisuissa opened this issue Nov 9, 2020 · 4 comments · Fixed by #573

Comments

@adisuissa
Contributor

PR #549 modified Nighthawk's use of MonotonicTime to SystemTime, which breaks the build with clang's/llvm's libc++ (--config=libc++) and produces the following error:

ERROR: /home/.../git/nighthawk/source/client/BUILD:11:17: C++ compilation of rule '//source/client:nighthawk_client_lib' failed (Exit 1) clang failed: error executing command /home/.../clang10/bin/clang -U_FORTIFY_SOURCE -fstack-protector -Wall -Wthread-safety -Wself-assign -fcolor-diagnostics -fno-omit-frame-pointer '-stdlib=libc++' -MD -MF ... (remaining 765 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
In file included from source/client/process_impl.cc:1:
In file included from bazel-out/k8-fastbuild/bin/source/client/_virtual_includes/nighthawk_client_lib/client/process_impl.h:3:
In file included from /home/.../clang10/bin/../include/c++/v1/map:479:
In file included from /home/.../clang10/bin/../include/c++/v1/__tree:15:
/home/.../clang10/bin/../include/c++/v1/memory:3028:32: error: no matching constructor for initialization of 'Nighthawk::Client::ClientWorkerImpl'
    return unique_ptr<_Tp>(new _Tp(_VSTD::forward<_Args>(__args)...));
                               ^   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
source/client/process_impl.cc:188:29: note: in instantiation of function template specialization 'std::__1::make_unique<Nighthawk::Client::ClientWorkerImpl, Envoy::Api::Api &, Envoy::ThreadLocal::InstanceImpl &, std::__1::unique_ptr<Envoy::Upstream::ClusterManager, std::__1::default_delete<Envoy::Upstream::ClusterManager> > &, const Nighthawk::Client::BenchmarkClientFactoryImpl &, const Nighthawk::Client::TerminationPredicateFactoryImpl &, const Nighthawk::Client::SequencerFactoryImpl &, const Nighthawk::Client::RequestSourceFactoryImpl &, Envoy::Stats::ThreadLocalStoreImpl &, int &, std::__1::chrono::time_point<std::__1::chrono::system_clock, std::__1::chrono::duration<long long, std::__1::ratio<1, 1000000000> > >, std::__1::shared_ptr<Envoy::Tracing::HttpTracer> &, Nighthawk::Client::ClientWorkerImpl::HardCodedWarmupStyle>' requested here
    workers_.push_back(std::make_unique<ClientWorkerImpl>(
                            ^
bazel-out/k8-fastbuild/bin/source/client/_virtual_includes/nighthawk_client_lib/client/client_worker_impl.h:29:3: note: candidate constructor not viable: no known conversion from 'time_point<[...], duration<[...], ratio<[...], 1000000000>>>' to 'const time_point<[...], duration<[...], ratio<[...], 1000000>>>' for 10th argument
  ClientWorkerImpl(Envoy::Api::Api& api, Envoy::ThreadLocal::Instance& tls,
  ^
bazel-out/k8-fastbuild/bin/source/client/_virtual_includes/nighthawk_client_lib/client/client_worker_impl.h:25:7: note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 12 were provided
class ClientWorkerImpl : public WorkerImpl, virtual public ClientWorker {
      ^
1 error generated.
Target //source/client:nighthawk_client_lib failed to build

After executing:

bazel build //source/client:nighthawk_client_lib --config=libc++

The cause seems to be the different resolution that llvm's libc++ and gcc's libstdc++ use for std::chrono::system_clock: libstdc++ defines system_clock::duration as nanoseconds, while libc++ uses microseconds.
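For reference, a standalone repro of the mismatch (this is not Nighthawk code, just an illustration of the libc++/libstdc++ difference implied by the diagnostic above):

```cpp
#include <chrono>

using NanoTimePoint =
    std::chrono::time_point<std::chrono::system_clock, std::chrono::nanoseconds>;

int main() {
  NanoTimePoint ns_time{};
  // libstdc++ defines system_clock::duration as nanoseconds, so this implicit
  // conversion compiles; libc++ defines it as microseconds, so converting from
  // a nanosecond time_point would be lossy and is rejected, which matches the
  // "no known conversion" note in the error output.
  std::chrono::system_clock::time_point t = ns_time;
  (void)t;
  return 0;
}
```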

Possible ways to solve:

  1. Revert PR #549 and change back to MonotonicTime.
  2. Use MonotonicTime in this part of the code.
  3. Change the required resolution in this code to milliseconds instead of nanoseconds.
  4. Change Envoy's definition of SystemTime to explicitly use nanoseconds, i.e. `using SystemTime = std::chrono::time_point<std::chrono::system_clock, std::chrono::nanoseconds>;`, and update the necessary places in Envoy's code to cast correctly wherever system_clock is used, e.g. `std::chrono::time_point_cast<std::chrono::system_clock::duration>(time)` (see the sketch below).
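As a rough sketch of the explicit cast mentioned in option 4 (the helper below is illustrative only, not existing Nighthawk or Envoy code), `std::chrono::time_point_cast` makes the narrowing explicit so the conversion builds against either standard library:

```cpp
#include <chrono>

// Illustrative helper: take a system_clock time_point of any resolution and
// explicitly cast it to the library's native system_clock::duration.
template <typename Duration>
std::chrono::system_clock::time_point
toNativeSystemTime(std::chrono::time_point<std::chrono::system_clock, Duration> tp) {
  // time_point_cast performs the (possibly truncating) conversion explicitly,
  // so this compiles with both libstdc++ (nanoseconds) and libc++ (microseconds).
  return std::chrono::time_point_cast<std::chrono::system_clock::duration>(tp);
}
```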

@oschaaf Am I missing anything? What's your opinion about this?

@oschaaf
Member

oschaaf commented Nov 9, 2020

Thanks for the great analysis. The reason for making this change was to prepare for something like master...oschaaf:schedule-date. That would allow one to set a specific schedule date/time for starting load test execution, which in turn would be useful when synchronizing the starts of multiple clients, on the premise that the machines have their system times synchronized via external tooling. This functionality intersects with the workers of a single client synchronizing their start times, which happens in the code you pointed out.
Losing precision in these paths seems suboptimal to me. I need to think about this for a bit to determine what I think the best resolution would be.

@oschaaf
Member

oschaaf commented Nov 11, 2020

After discussing this on Slack with @dubious90 and mulling it over, I think the problem is best described as us overloading two things:

  • Set a schedule for execution. Millisecond resolution is fine for this.
  • Set an inter-worker delay. This needs the highest resolution we can get, so monotonic time is probably more appropriate here now that we have learned system time is problematic. The inter-worker start delay needs to be high-resolution because it is used to offset worker start times so that request-release timings are spread evenly across workers.

In terms of the options listed above, I'd say this is best represented by 3), plus a change to decouple scheduling execution from offsetting worker starts. I propose to address this as part of the forthcoming change that adds the actual scheduling feature.
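A minimal sketch of what that decoupling could look like (all names below are illustrative, not actual Nighthawk code; the type aliases mirror std::chrono's clocks):

```cpp
#include <chrono>

using SystemTime = std::chrono::time_point<std::chrono::system_clock>;
using MonotonicTime = std::chrono::time_point<std::chrono::steady_clock>;

// The two concerns, kept separate: a coarse wall-clock schedule and a
// high-resolution per-worker offset expressed against the monotonic clock.
struct WorkerStartPlan {
  SystemTime scheduled_start;                   // millisecond resolution suffices
  std::chrono::nanoseconds inter_worker_offset; // needs the highest resolution we can get
};

// Translate the wall-clock schedule into a monotonic target, then apply the
// per-worker offset on the monotonic timeline, where precision is preserved.
inline MonotonicTime workerStartTime(SystemTime now_sys, MonotonicTime now_mono,
                                     const WorkerStartPlan& plan, int worker_index) {
  const auto until_start = std::chrono::duration_cast<std::chrono::nanoseconds>(
      plan.scheduled_start - now_sys);
  return now_mono + until_start + plan.inter_worker_offset * worker_index;
}
```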
@adisuissa does that sound good?

@adisuissa
Contributor Author

Yes, I can see how this could fulfill the requirement for synchronized start across machines.
SGTM.

@kushthedude
Contributor

Does this mean we are going to see scheduling of load tests in Nighthawk soon, @oschaaf?

oschaaf added a commit to oschaaf/nighthawk that referenced this issue Nov 16, 2020
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
dubious90 pushed a commit that referenced this issue Dec 7, 2020
Adds a new schedule option to the proto API, which allows specifying a date/time
at which execution should start.

Should also fix #569 by reducing reliance on `Envoy::SystemTime` to a minimum:
based on the value of the schedule option, a monotonic time is computed
and propagated. This makes us use `Envoy::MonotonicTime` again in places
where sub-millisecond resolution is a must-have. (The fix will need confirmation, as I
couldn't get `--config=libc++` to work for other reasons.)

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
mum4k pushed a commit that referenced this issue Dec 11, 2020
List of changes:

- Update Envoy to 8188e232a9e0b15111d30f4724cbc7bf77d3964a (Dec 8th)
- Sync .bazelrc
- Modify extensions_build_config.bzl, add required DISABLED_BY_DEFAULT_EXTENSIONS
- Rework our h1 connection pool derivation to keep our own prefetching feature working.
   Note: We enable the allow_prefetch runtime feature for this.
- Changes to reflect changed Envoy constructors and method signatures: 
   OnPoolReady(), ProdClusterManagerFactory(), allocateConnPool() 
- Modified log filtering in the integration test: `SIGTERM` -> `ENVOY_SIGTERM`.
- Dropped include of `ares.h` as small cleanup, we no longer need that.
- In `options_impl.cc` there's a small change to how we define `elapsed_since_epoch` to unbreak
   building with `libc++` (a regression of #569).
   Filed #594 to avoid future regressions.

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>