
Structured JSON Logging #14

Merged (12 commits) on Jul 20, 2022

Conversation

@CMCDragonkai (Member) commented Jul 18, 2022

Description

This introduces structured JSON logging as an intermediate step towards tracing and machine-parseable data that can be used to construct contexts and traces.

We already have a LogFormatter type: anything that produces a string given a LogRecord is sufficient.

Usually we construct formatters with the formatting.format function, which takes template strings and produces a string.

We can provide something like:

formatting.format`${formatting.json}`

That produces a formatter function. The json symbol simply means the entire data structure of the LogRecord is captured.
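As a minimal sketch of that contract (the field names here are assumptions for illustration, not the final LogRecord shape), any function from a record to a string already satisfies LogFormatter, and the "capture everything" case reduces to serialising the whole record:

// Minimal sketch of the LogFormatter idea; field names are illustrative only.
type LogRecord = {
  key: string;
  date: Date;
  msg: string;
  level: number;
};

type LogFormatter = (record: LogRecord) => string;

// The "entire data structure is captured" case is just full serialisation.
const captureAll: LogFormatter = (record) => JSON.stringify(record);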

The other issue is the schema of the keys of the JSON record; we should align it with some standard, possibly close to the OpenTracing or structlog standards.

Issues Fixed

Tasks

  • 1. Introduced jsonFormatter as a default JSON LogFormatter
  • 2. Changed symbol trace to stack to avoid confusion with tracing in Integrate Tracing (derived from OpenTelemetry) #15
  • 3. Allowed stack and keys to be lazily created if the renderer calls them
  • 4. Enabled users to submit LogData so arbitrary data can be attached to a log
  • 5. The LogData enables lazy values by wrapping values in a lambda (see the sketch after this list)
  • 6. Log level is now checked by each logger in the hierarchy, so each logger's level is applied to filter log records
  • 7. Introduced jest snapshots for testing the JSON formatter (https://jestjs.io/docs/snapshot-testing); the snapshots are committed to source
  • 8. Added jest.useRealTimers(); to close off the timer mocking in tests
  • 9. The msg is no longer mandatory; if it is not passed, the msg will be empty
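Regarding tasks 4 and 5, here is a hypothetical sketch of what lazy LogData values could look like; the actual LogData type in this PR may differ:

// Hypothetical sketch of lazy log data; the real LogData type may differ.
type LogDataValue = string | number | boolean | (() => unknown);
type LogData = Record<string, LogDataValue>;

function expensiveSummary(): string {
  // Imagine this walks a large object graph; we only want to pay for it
  // when a handler actually renders the record.
  return JSON.stringify({ connections: 42, pending: 3 });
}

const data: LogData = {
  requestId: 'abc123',
  summary: () => expensiveSummary(), // wrapped in a lambda, evaluated lazily
};

// A renderer calls the lambdas only when it serialises the record:
const rendered = Object.fromEntries(
  Object.entries(data).map(([k, v]) => [k, typeof v === 'function' ? v() : v]),
);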

Final checklist

  • Domain specific tests
  • Full tests
  • Updated inline-comment documentation
  • Lint fixed
  • Squash and rebased
  • Sanity check the final build


@CMCDragonkai (Member Author):

So the issue is that log data is actually allowed to be an arbitrary object, whereas here info and related methods all take ToString as the type.

It converts them to strings first.

Structured logging benefits from potentially arbitrary keys and nested structure, not just strings.

This means Logger.info and related functions should take a structured object as data instead of just ToString. They can still take a string of course if that's what is needed, but anything else will instead be treated as a structured object.
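As a hedged sketch of that signature change (illustrative names only, not the final API of this PR):

// Illustrative only: info accepts either a printable message or structured data.
type ToString = { toString(): string };
type LogData = Record<string, unknown>;

interface LoggerLike {
  info(messageOrData: ToString | LogData): void;
}

// A plain string still works:
//   logger.info('client connected');
// but nested, structured data becomes possible too:
//   logger.info({ msg: 'client connected', client: { id: 'abc', port: 1314 } });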

@CMCDragonkai (Member Author) commented Jul 18, 2022

Ok, let's not do formatting.json; it should be possible to use just JSON.stringify as the formatter. We can just special-case it.

However, as for the schema, GELF can work: https://docs.graylog.org/docs/gelf#gelf-payload-specification. Not sure about other ones.

Now the only thing left is the fact that data structures should be usable with logger.info as well.

structlog takes a string as the message, plus additional arbitrary fields.

With this concept, it's also possible to build a context for further logging; that just means a prefilled dictionary, somewhat similar to creating child loggers as we do right now.

https://github.com/bitc/structured-logging-schema#message-and-payload

So it does seem all we need to do is to enable the usage of a payload structure.

logger.info('some message', { some: 'structure' }, JSON.stringify)

Then we have a default structure associated with the log... think of it as "global" context, like time and such. But then further context can be created for a subset of loggers...

And "canonical logging" means that the log gets built up and then output at the end. However that doesn't cover a span. You want to know when it started and stopped too.

It's possible that we combine the structure with the message too, but that might be a bit messy. It really depends on whether a log always requires a message or if it can be optional. For example:

logger.info({ message: '...' }, JSON.stringify);

Then a default key can be mandated.

@CMCDragonkai (Member Author):

Another example: https://www.elastic.co/guide/en/ecs/current/ecs-reference.html. In this case, it does seem that special-casing a message is the wrong way to go if an object is used... you just need to specify your own schema, like msg: 'hello world'.

@CMCDragonkai (Member Author) commented Jul 18, 2022

We won't build in any special schemas; these can be dictated by users. The PK application can choose to use something custom, both for the formatter and for the structured data.

Contexts for spans and traces can be done through a prefilled object context. This can also be added to a child logger during construction.
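A hypothetical sketch of such a prefilled context carried by child loggers; the actual construction API may differ:

// Hypothetical: child loggers inherit and extend a prefilled context object.
type LogContext = Record<string, unknown>;

class ContextLogger {
  constructor(
    public readonly key: string,
    public readonly context: LogContext = {},
  ) {}

  getChild(key: string, context: LogContext = {}): ContextLogger {
    // Child inherits the parent's context and can add its own fields
    return new ContextLogger(`${this.key}.${key}`, { ...this.context, ...context });
  }
}

const root = new ContextLogger('root', { service: 'agent' });
const conn = root.getChild('connection', { connectionId: 'abc123' });
// conn.context === { service: 'agent', connectionId: 'abc123' }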

@CMCDragonkai (Member Author):

Well, this was quite easy. All we needed to do was add:

const jsonFormatter: LogFormatter = (record: LogRecord) => {
  // Serialise the core LogRecord fields into a single JSON string
  return JSON.stringify({
    key: record.key,
    date: record.date,
    msg: record.msg,
    level: record.level
  });
};

Then, when constructing the logger:

const logger = new Logger('root', LogLevel.NOTSET, [
  new ConsoleOutHandler(
    formatting.jsonFormatter
  ),
]);

logger.debug('DEBUG MESSAGE');
logger.info('INFO MESSAGE');
logger.warn('WARN MESSAGE');
logger.error('ERROR MESSAGE');

And that's it, you get JSON records being output now.

@CMCDragonkai (Member Author) commented Jul 19, 2022

The main issue now is schema standardisation. There's no single standard for logging; common ones include Graylog's GELF and the ECS fields above. Libraries like structlog don't really specify any standard for the payload. OpenTelemetry does have a standard for its properties.

There are some "base"/core properties that these libraries tend to all use:

  • level - we use numbers from 1 DEBUG, 2 INFO,... etc.
  • msg or message - msg seems more popular, although I'm not sure if it is actually "necessary"
  • time or timestamp - time seems more popular - always uses ISO 8601 format

We have these pieces of information as well in the LogRecord.

To set a base schema, we can follow ECS's core properties:

https://www.elastic.co/guide/en/ecs/current/ecs-guidelines.html

https://www.elastic.co/guide/en/ecs/current/ecs-base.html

https://github.com/elastic/ecs/blob/main/generated/csv/fields.csv


These are the base fields for ECS:

  • @timestamp
  • message
  • labels
  • tags

The log level is placed in a subobject: https://www.elastic.co/guide/en/ecs/current/ecs-log.html, intended to be a keyword, not a number.
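For illustration, a record shaped along those ECS base fields might look like the following; this is not a committed schema for js-logger, just a rough example:

// Rough illustration of an ECS-style record; not a committed schema.
const ecsLikeRecord = {
  '@timestamp': new Date().toISOString(), // ISO 8601
  message: 'INFO MESSAGE',
  labels: { service: 'agent' },           // flat key/value pairs
  tags: ['structured', 'logging'],        // list of keywords
  log: { level: 'info' },                 // ECS keeps the level as a keyword under log
};

console.error(JSON.stringify(ecsLikeRecord));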

@CMCDragonkai (Member Author):

Hmm, to make this elegant, we need to make some changes:

  1. makeRecord should create a lazy dictionary, as in its properties are functions, so they can be called on demand when they are required.

  2. This enables a formatter with special symbols applied to call the relevant function to acquire the value. So the formatter dictates the control flow, and this is set on the handlers.

  3. I noticed that callHandlers propagates to the parent's callHandlers, but the level check isn't done there; it's done in the Logger.log method, which means parent log levels don't filter out any child log records.

  4. Notice that makeRecord is called by log, which means we should not recreate a log record upon going to the parent. Instead the same record should be kept.

  5. Let's suppose the level check occurs in callHandlers, while log pre-creates the LogRecord, and the LogRecord is just a bunch of functions. Then it should work well.

  6. The format function, which takes symbols, can then call the relevant functions against the LogRecord (see the sketch below).
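A sketch of points 1, 2 and 5 combined, with illustrative names only (not the final makeRecord or LogRecord shape):

// Sketch of the "lazy dictionary" idea: record properties are functions,
// so a formatter only pays for the fields it actually renders.
type LazyLogRecord = {
  key: () => string;
  date: () => Date;
  msg: () => string;
  level: () => number;
  stack: () => string | undefined; // only captured if a formatter asks for it
};

function makeRecord(key: string, level: number, msg: string): LazyLogRecord {
  return {
    key: () => key,
    date: () => new Date(),
    msg: () => msg,
    level: () => level,
    stack: () => new Error().stack, // expensive; deferred until called
  };
}

// A JSON formatter pulls only the fields it needs:
const jsonFormat = (r: LazyLogRecord) =>
  JSON.stringify({ key: r.key(), date: r.date(), msg: r.msg(), level: r.level() });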

@CMCDragonkai (Member Author) commented Jul 19, 2022

It appears OpenTelemetry considers logging out to stderr as a telemetry/trace exporter to be used only for "debugging". Production exporters all use TCP, gRPC or HTTP 1.1 to send traces to a centralised trace collector like Zipkin or whatever.

However, this seems to go against how traditional orchestration is supposed to work, where this data goes to STDERR and is then collected by the orchestrator to be sent to centralised loggers. Why did the OpenTelemetry people decide to create their own exporters? Was it to get around STDERR, or because STDERR logging is considered orthogonal?

It's all done via HTTP or gRPC, as if networked sends were better than stderr. Maybe interleaving with stderr isn't right. However, for decentralised systems, does it make sense to assume that there is a collector somewhere else? Obviously we would say that the CLI/GUI is a viable collector in order to debug the system, whereas our hosted testnet may be pointed somewhere else.

It does seem that stderr/stdout is being deprecated for more "network-centric" observability. I guess it provides more flexibility.

@CMCDragonkai (Member Author):

Otel is way too complex. Here's apparently the "basic" example of using Otel and exporting directly to the console: https://github.com/open-telemetry/opentelemetry-js/blob/main/examples/basic-tracer-node/index.js.

Why does it require so many different packages?

const opentelemetry = require('@opentelemetry/api');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');
const { BasicTracerProvider, ConsoleSpanExporter, SimpleSpanProcessor } = require('@opentelemetry/sdk-trace-base');
const { JaegerExporter } = require('@opentelemetry/exporter-jaeger');

It seems over-engineered. The spec however might be useful.

@CMCDragonkai (Member Author):

For most language specific instrumentation libraries you have exporters for popular backends and OTLP. You might wonder,

under what circumstances does one use a collector to send data, as opposed to having each service send directly to the backend?

For trying out and getting started with OpenTelemetry, sending your data directly to a backend is a great way to get value quickly. Also, in a development or small-scale environment you can get decent results without a collector.

However, in general we recommend using a collector alongside your service, since it allows your service to offload data quickly and the collector can take care of additional handling like retries, batching, encryption or even sensitive data filtering.

So in-process sending is only for adoption as I suspected.

However, the community has obviously decided against relying on STDERR, because they view traces not as human-readable information but as information intended for machines to parse, and thus did not bother writing to STDERR except as an afterthought. This makes sense: for them, once the span data is written to stderr, what happens to it afterwards? It's sort of useless as textual information (not really, but one can understand). So there's nothing built to "forward" the stderr logs to a trace receiver or OTLP collector, and even with the OTLP collector it's all designed to send logs over a network connection rather than relying on simple STDERR.

I think what I'll do is examine the spec of OpenTracing with respect to the log data itself, and then just add it to our existing structured logging, which continues to be sent to STDERR. For future sending to other places, that could be done via custom handlers.

For viewing and debugging the traces, if we maintain compatibility with the spec, it should be possible to send the stderr logs directly into a trace collector via HTTP 1.1: https://opentelemetry.io/docs/concepts/sdk-configuration/otlp-exporter-configuration/. Also see https://github.com/opentracing-contrib/java-span-reporter.

@CMCDragonkai (Member Author):

An example of manually sending a trace:

Get a Zipkin payload to test. For example, create a file called trace.json that contains:

[
  {
    "traceId": "5982fe77008310cc80f1da5e10147519",
    "parentId": "90394f6bcffb5d13",
    "id": "67fae42571535f60",
    "kind": "SERVER",
    "name": "/m/n/2.6.1",
    "timestamp": 1516781775726000,
    "duration": 26000,
    "localEndpoint": {
      "serviceName": "api"
    },
    "remoteEndpoint": {
      "serviceName": "apip"
    },
    "tags": {
      "data.http_response_code": "201"
    }
  }
]

With the Collector running, send this payload to the Collector. For example:

$ curl -X POST localhost:9411/api/v2/spans -H'Content-Type: application/json' -d @trace.json

@CMCDragonkai (Member Author):

OpenTelemetry details have been moved into the new issue #15.

Because the telemetry schema doesn't actually require a mandatory message, it makes sense to change our logging methods to take a string or an object. If a string is passed, it's assumed to be a regular msg; if an object, a message is not necessary and can be empty.

It's up to users to provide the necessary message with msg and an appropriate logger formatter for their application use case.

I imagine tracing would be built on top of js-logger, so this library is kept simple. Tracing would be PK-specific.
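A hedged sketch of the string-or-object behaviour described above, illustrative only:

// Illustrative only: a log method that accepts a string or a structured object.
type LogData = Record<string, unknown>;
type Format = (data: LogData) => string;

function info(messageOrData: string | LogData, format: Format = (d) => JSON.stringify(d)): void {
  const data: LogData =
    typeof messageOrData === 'string' ? { msg: messageOrData } : messageOrData;
  // msg is optional for the object form; users pick their own schema
  console.error(format(data));
}

// info('started');                      // becomes { msg: 'started' }
// info({ event: 'started', pid: 123 }); // no msg required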

@CMCDragonkai force-pushed the feature-structured-logging branch 2 times, most recently from 1dd9eac to 879cb4f on July 19, 2022 16:46
@CMCDragonkai self-assigned this Jul 19, 2022
@CMCDragonkai (Member Author):

This is now done. Going to update the README.md with some example usages.


Successfully merging this pull request may close these issues: Integrate Structured Logging