Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add global log handler #877

Merged
merged 18 commits into from
Jul 30, 2021
Merged

Conversation

lalitb
Copy link
Member

@lalitb lalitb commented Jun 25, 2021

Fixes #408

As per specs:
SDK implementations MUST allow end-users to change the library's default error handling behavior for relevant errors.

Changes

Examples:

The logging macro supports logging using C++ stream format, and key-value pair:

OTEL_INTERNAL_LOG_ERROR(" Connection failed. Error string " << error_str << " Error Num: " << errorno);
OTEL_INTERNAL_LOG_ERROR(" Connection failed." , {{"error message: " : error_str},{"error number": errorno}});
OTEL_INTERNAL_LOG_DEBUG(" Connection Established Successfully. Headers:", {{"url", url},{"content-length", len}, {"content-type", type}});
  • This PR adds a global log handler and updates SDK components to use it. The default global log handler is configured with a console logger ( std::cout).
  • The logs handler is meant to capture errors and warnings arising from SDK, not supposed to be used for the application errors.
  • It's header-only to ensure that it is usable for header-only exporters ( eg ETW).
  • The LogHandler interface is minimalistic, with a virtual Handler function which is supposed to be overridden by custom handlers.
class LogHandler
{
public:
  virtual void Handle(level, filename, lineno, log_message, attributes) noexcept = 0;
};

The default log handler is set to write the messages to std::cout.

  • The log levels are supported - Error, Warn, Info and Debug. The default log level is Warn ( to dump both Error and Warn) and it can be changed at compile time.

For significant contributions please make sure you have completed the following items:

  • CHANGELOG.md updated for non-trivial changes
  • Unit tests have been added
  • Changes in public API reviewed

@lalitb lalitb requested a review from a team June 25, 2021 10:58
@lalitb lalitb changed the title Global error handler add global error handler Jun 25, 2021
@codecov
Copy link

codecov bot commented Jun 25, 2021

Codecov Report

Merging #877 (e82bb8f) into main (914df66) will decrease coverage by 0.09%.
The diff coverage is 74.08%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #877      +/-   ##
==========================================
- Coverage   95.44%   95.36%   -0.08%     
==========================================
  Files         158      159       +1     
  Lines        6749     6776      +27     
==========================================
+ Hits         6441     6461      +20     
- Misses        308      315       +7     
Impacted Files Coverage Δ
...nclude/opentelemetry/ext/http/client/http_client.h 93.34% <ø> (ø)
...lude/opentelemetry/sdk/common/global_log_handler.h 72.00% <72.00%> (ø)
sdk/src/trace/tracer_provider.cc 73.69% <100.00%> (+1.47%) ⬆️

@maxgolov
Copy link
Contributor

Minor ask: can we have something like -

#if defined(HAVE_CONSOLE_LOG) && !defined(OTEL_LOG_DEBUG)
// Log to console if there's no standard log facility defined
#  include <cstdio>
#  ifndef OTEL_LOG_DEBUG
#    define OTEL_LOG_DEBUG(fmt_, ...) printf(" " fmt_ "\n", ##__VA_ARGS__)
#    define OTEL_LOG_TRACE(fmt_, ...) printf(" " fmt_ "\n", ##__VA_ARGS__)
#    define OTEL_LOG_INFO(fmt_, ...) printf(" " fmt_ "\n", ##__VA_ARGS__)
#    define OTEL_LOG_WARN(fmt_, ...) printf(" " fmt_ "\n", ##__VA_ARGS__)
#    define OTEL_LOG_ERROR(fmt_, ...) printf(" " fmt_ "\n", ##__VA_ARGS__)
#  endif
#endif

#ifndef OTEL_LOG_DEBUG
// Don't log anything if there's no standard log facility defined
#  define OTEL_LOG_DEBUG(fmt_, ...)
#  define OTEL_LOG_TRACE(fmt_, ...)
#  define OTEL_LOG_INFO(fmt_, ...)
#  define OTEL_LOG_WARN(fmt_, ...)
#  define OTEL_LOG_ERROR(fmt_, ...)
#endif

?

This, technically, fully satisfies the requirement of customizing the error handler: customers may define their own macro by passing CXXFLAGS=-DOTEL_LOG_DEBUG + friends. It's also C-signature. You do not force anyone to use C++ for an error handler. You also don't bind it to concrete implementation.

I think we should not rush it. And invest a bit more into the actual OTEL_LOG_ERROR handling (registration of callback, multiple callbacks, thread-safety of it, describe what it means when your SDK thread calls into that, etc.) in a follow-up PR. And show how to plug that in on top of above compile-time macro-based model..

It is quite expected of a logging library to have several logging levels. Not just error. Not everything is an error. We could've handled that in uniform fashion. I do not have a strong preference re. macro names, but the set above should generally be fairly complete even for a sophisticated project.

# include "opentelemetry/exporters/elasticsearch/es_log_exporter.h"
# include "opentelemetry/exporters/elasticsearch/es_log_recordable.h"
# include "opentelemetry/sdk/common/global_error_handler.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may append the standard set of logging macros somewhere once in api/include/opentelemetry/config.h - to avoid explicitly including the global error handler in each exporter.

@@ -74,23 +77,19 @@ class ResponseHandler : public http_client::EventHandler
switch (state)
{
case http_client::SessionState::ConnectFailed:
if (console_debug_)
std::cout << "Connection to elasticsearch failed\n";
OTEL_ERROR("[ES Trace Exporter] Connection to elasticsearch failed\n")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would've used something like OTEL_LOG_ERROR + reserved the entire set of corresponding macros for warnings, notices, etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Will change that.

std::cout << responseBody << std::endl;
}

std::stringstream ss;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One issue with a macro that accepts only std::string or const char *, and not vararg parameters:

  • you lose the knowledge of concrete strongly typed values by blending them all into one single string (so that your callback inherently loses all relevant details, e.g. you can't pass a NUMBER or ADDRESS to the callback).

Value of having strongly-typed parameters in logging: you set a breakpoint, and/or even conditional breakpoint to catch when param equals to certain value. For example, OTEL_LOG_ERROR("http status code=%d", 403); - see, that way your callback actually knows that it's 403 and does not need to parse. I would've also added an overload to the OTEL_LOG_ERROR function to pass context info, such as done here:
https://github.com/microsoft/cpp_client_telemetry/blob/7dd687c71297e1c72f6126896662f9c1e70d690f/lib/include/public/DebugEvents.hpp#L137

You can pass almost an entire call stack knowledge using a class like this:

    /// <summary>The DebugEvent class represents a debug event object.</summary>
    class DebugEvent
    {

    public:
        /// <summary>The debug event sequence number.</summary>
        uint64_t seq;
        /// <summary>The debug event timestamp.</summary>
        uint64_t ts;
        /// <summary>The debug event type.</summary>
        DebugEventType type;
        /// <summary>[optional] Parameter 1 (depends on debug event type).</summary>
        size_t param1;
        /// <summary>[optional] Parameter 2 (depends on debug event type).</summary>
        size_t param2;
        /// <summary>[optional] The debug event data (depends on debug event type).</summary>
        void* data;
        /// <summary>[optional] The size of the debug event data (depends on debug event type).</summary>
        size_t size;

        /// <summary>DebugEvent The default DebugEvent constructor.</summary>
        DebugEvent() : seq(0), ts(0), type(EVT_UNKNOWN), param1(0), param2(0), data(NULL), size(0) {};

        DebugEvent(DebugEventType type) : seq(0), ts(0), type(type), param1(0), param2(0), data(NULL), size(0) {};

        DebugEvent(DebugEventType type, size_t param1, size_t param2 = 0, void* data = nullptr, size_t size = 0) :
            seq(0), ts(0), type(type), param1(param1), param2(param2), data(data), size(size) {};
    };

You can borrow this model entirely. It's Apache License v2.0. Works great elsewhere. 😄

@@ -3,10 +3,11 @@

#include "opentelemetry/exporters/otlp/otlp_grpc_exporter.h"
#include "opentelemetry/exporters/otlp/otlp_recordable.h"
#include "opentelemetry/sdk/common/global_error_handler.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, you are forcing the code to know of the global error handler. It does not need to know. It'd only need to include config.h , which should then give the rest of details. I think config.h is already included at this point to define the OPENTELEMETRY_ macros, just move it in there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Will change that.

class DefaultErrorHandler : public ErrorHandler
{
public:
void Handle(nostd::string_view error) noexcept override
Copy link
Contributor

@maxgolov maxgolov Jun 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this would be sufficient to cover the error cases. This would be of extremely limited value, given that you are manually instrumenting every spot with a logging macro - it could have been something better (e.g. some form of call context object) than a pointer to string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please elaborate on the error cases it won't cover. Thanks :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example: If you are posting HTTP request and collector responds with status=400, you need to pass the post body to error callback, so you can dump it in there and understand what's wrong with it. There could be other examples. You can find more practical examples where you need more than just a string - here.

@lalitb
Copy link
Member Author

lalitb commented Jun 25, 2021

It is quite expected of a logging library to have several logging levels. Not just error. Not everything is an error. We could've handled that in uniform fashion.

Logging levels are not added purposefully. Probably I should have clarified the scope of work in the description. sorry about that. This is not the logging library, but a custom error logging interface, and including a default error handler. The purpose is to allow app developer to hook their own error handlers. Again, this is not to debug the application, but to catch the errors from opentelemetry components. We already have an opentelemetry logging specs implementation and the purpose here is not to create another of that.

@maxgolov
Copy link
Contributor

maxgolov commented Jun 25, 2021

This is not the logging library, but a custom error logging interface, and including a default error handler.

I think this error handler structure does not quite cover the practical needs to debug a complex real-world application. We could improve it. I'd suggest to stop at macros and provide an empty macro - for the customers to decide how to handle the error.

SDK implementations MUST allow end users to change the library's default error handling behavior for relevant errors.

It's sufficient to provide an empty macro stub for that, without imposing concrete implementation of the error handler. You can combine strongly-typed / object logging with a callback. You do not need a dedicated error handler if your object logging handles these aspects. Such as it's done in glog, for example. Or as it's done in the library I linked in my comments. Your logging itself triggers the fatal or error handler. You do not need to call it explicitly. You need to abstract away the calls to error handler.

I'd rather keep it completely opaque at this point. And invest more thought into how to make it better. As a minimum:

  • accepting variable arguments, as in strongly-typed logging - is better than accepting a string.
  • accepting event and/or call context as an argument when invoking the error handler - is better to diagnose & debug the errors by passing event that contains call context. Similar how it's done with std::exception , where you can provide additional information on the object. But in a synchronous way: your OTEL_LOG_ERROR(error_event) carries the info that gets bubbled up to the listener.

The relevant fields that may be necessary, based on prior debug experience:

  • error seq number (atomic).
  • error timestamp.
  • error parameters: be that a pointer to string; or additional arguments.
  • call stack - this is not something that you implement at all. This is something that the customer may add via macro. Such as a macro that invokes the backtrace and having debug symbols, prints where the problem happened. Not something we should cover in our implementation though.

Another concrete example - the logging library that invokes android_printAssert , which itself may be customized. Accepts variable arguments. Also delivers on a promise of allowing you to add your own custom macros / your own error handler.

@lalitb
Copy link
Member Author

lalitb commented Jun 25, 2021

I'd rather keep it completely opaque at this point. And invest more thought into how to make it better. As a minimum:

  • accepting variable arguments, as in strongly-typed logging - is better than accepting a string.
  • accepting event and/or call context as an argument when invoking the error handler - is better to diagnose & debug the errors by passing event that contains call context.

Sure, thanks for the comments and references from cpp_client_telementry :) Will see to improve the current implementation without having a performance impact.

@maxgolov
Copy link
Contributor

maxgolov commented Jun 25, 2021

What I am suggesting is - keep your current macros. It looks good. Provide empty macros. We can discuss how to improve it in the next iteration in another PR? I also think it needs some live discussion / one-pager. Maybe something we can discuss during our next community meeting?

@lalitb lalitb changed the title add global error handler add global log handler Jun 28, 2021
@lalitb
Copy link
Member Author

lalitb commented Jun 28, 2021

@maxgolov -
I have added different logs levels, along with it's configuration at compile time as below ( taking inspiration from opencv2 logger )

#define OTEL_INTERNAL_LOG_LEVEL_ERROR 0
#define OTEL_INTERNAL_LOG_LEVEL_WARN 1
#define OTEL_INTERNAL_LOG_LEVEL_INFO 2
#define OTEL_INTERNAL_LOG_LEVEL_DEBUG 3  // to be disabled in release

#ifndef OTEL_INTERNAL_LOG_LEVEL
#  define OTEL_INTERNAL_LOG_LEVEL OTEL_INTERNAL_LOG_LEVEL_DEBUG  // ERROR, WARN , INFO AND DEBUG
#endif

Also created macros for logging at given log level:

#  define OTEL_INTERNAL_LOG_ERROR(...)
#  define OTEL_INTERNAL_LOG_WARN(...)
#  define OTEL_INTERNAL_LOG_DEBUG(...)
#  define OTEL_INTERNAL_LOG_INFO(...)

As of now, it supports logging using stream format, and key-value pair:

OTEL_INTERNAL_LOG_ERROR(" Connection failed. Error string " << error_str << " Error Num: " << errorno);
OTEL_INTERNAL_LOG_ERROR(" Connection failed." , {{"error message: " : error_str},{"error number": errorno}});
OTEL_INTERNAL_LOG_DEBUG(" Connection Established Successfully. Headers:", {{"url", url},{"content-length", len}, {"content-type", type}});

Threading safety, adding timestamp information is proposed to be handled by custom log handlers, and is also handled in the default console ( std::cout ) log-handler.

This may not be still as per your thoughts, we can discuss this further.

@lalitb
Copy link
Member Author

lalitb commented Jul 20, 2021

@maxgolov would it be possible to review it one last time, as it covers most of the requirements you suggested - with exception of using ostream + key-value pair instead of print style for logging. Thanks.

@lalitb lalitb mentioned this pull request Jul 28, 2021
3 tasks
@@ -51,7 +54,7 @@ void TUDPTransport::open()

if (error)
{
// TODO: log error
OTEL_INTERNAL_LOG_ERROR("Jaeger Exporter: getaddrinfo failed with error: " << error);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This syntax looks a little weird. Could it be defined as macro with variadic parameters, then pass the current parameter as 2 parameters?

Copy link
Member Author

@lalitb lalitb Jul 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree, but probably would be more familiar for C++ developers as ostream format. It's used extensively with opencv2 (https://github.com/opencv/opencv/blob/15ba3e123f1efa5444951b638ce6bd200054820e/modules/highgui/src/backend.cpp#L78) so picked it up.
Also, OTEL_INTERNAL_LOG_ERROR is defined as a macro with the variadic format, as it can take a single argument ( ostream) or two arguments ( ostream and k-v attribute list).

@lalitb lalitb merged commit ef16d00 into open-telemetry:main Jul 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Self-Diagnostics ( Error handling ) support
3 participants