This is an example project to show how to integrate logging in an application using terse-logback and Blacklite.
It is a Play application written using Play's Java API.
There is an example application running at https://terse-logback-showcase.fly.dev/ which has pictures of cats.
You will need sbt installed to run.
$ sbt
This gets you into the Play console.
To start the play application, type run
, and then click on http://localhost:9000/ in a browser when prompted.
[terse-logback-showcase] $ run
--- (Running the application, auto-reloading is enabled) ---
[info] p.c.s.AkkaHttpServer - Listening for HTTP on /0:0:0:0:0:0:0:0:9000
(Server started, use Enter to stop and go back to the console...)
FfRhBqPfTYg6O0Qbm7EAAA 20:41:50.040 [INFO ] p.a.d.DefaultDBApi - Database [logging] initialized at jdbc:h2:mem:terse-logback;DB_CLOSE_DELAY=-1
FfRhBqPfTZU6O0Qbm7EAAA 20:41:50.054 [INFO ] p.a.d.HikariCPConnectionPool - Creating Pool for datasource 'logging'
FfRhBqPfTaE6O0Qbm7EAAA 20:41:50.065 [INFO ] c.z.h.HikariDataSource - HikariPool-1 - Starting...
FfRhBqPfTaE6O0Qbm7EAAB 20:41:50.065 [INFO ] c.z.h.HikariDataSource - jdbc-appender-pool-1582000909357 - Starting...
FfRhBqPfTbE6O0Qbm7EAAA 20:41:50.081 [INFO ] c.z.h.HikariDataSource - HikariPool-1 - Start completed.
FfRhBqPfTbg6O0Qbm7EAAA 20:41:50.089 [INFO ] c.z.h.HikariDataSource - jdbc-appender-pool-1582000909357 - Start completed.
FfRhBqPfTf46O0Qbm7EAAA 20:41:50.159 [INFO ] a.e.s.Slf4jLogger - Slf4jLogger started
FfRhBqPfTkI6O0Qbm7EAAA 20:41:50.227 [INFO ] p.a.h.EnabledFilters - Enabled Filters (see <https://www.playframework.com/documentation/latest/Filters>):
play.filters.csrf.CSRFFilter
play.filters.headers.SecurityHeadersFilter
play.filters.hosts.AllowedHostsFilter
FfRhBqPfTmg6O0Qbm7EAAA 20:41:50.264 [INFO ] play.api.Play - Application started (Dev) (no global state)
You'll be able to see the web page and see the logs page as well.
There are four logging appenders configured.
- Text: this writes out line oriented output to
logs/application.log
. - JSON: this writes out JSON output to
logs/application.json
. - Console: this writes ANSI (colored) line output to stdout.
- Blacklite: This writes out JSON to SQLite through the Blacklite appender.
The root logging level is set to TRACE
, and the Blacklite appender stores diagnostic logging (DEBUG
& TRACE
) events, offloading using a background thread and configuring SQLite with memory-mapped I/O and a write-ahead-log for better write performance.
This means that you get the complete logs for a request on an error. I am calling this feature "backtracing", because you can go back and see the TRACE and DEBUG messages for a request when necessary.
Because SQLite has built in JSON support, you can query logs by correlation id, and the appender will automatically reap older log entries past the default size.
All logging events have a flake id generated and rendered in the appropriate format across different channels. This allows you to get a general overview from console and text files, and be able to drill down to a full view with JSON data.
For example, in application.log
, you'll see a single line that starts with FfwJtsNHYSw6O0Qbm7EAAA
:
FfwJtsNHYSw6O0Qbm7EAAA 2020-03-14T05:30:14.965+0000 [INFO ] play.api.db.HikariCPConnectionPool in play-dev-mode-akka.actor.default-dispatcher-7 - Creating Pool for datasource 'logging'
You can search for this string in application.json
and see more detail on the log record:
{"id":"FfwJtsNHYSw6O0Qbm7EAAA","relative_ns":20921024,"tse_ms":1584163814965,"start_ms":null,"@timestamp":"2020-03-14T05:30:14.965Z","@version":"1","message":"Creating Pool for datasource 'logging'","logger_name":"play.api.db.HikariCPConnectionPool","thread_name":"play-dev-mode-akka.actor.default-dispatcher-7","level":"INFO","level_value":20000}
Flake IDs are also k-ordered, meaning that they are "roughly time-ordered when sorted lexicographically."
All logging events also contain a nanosecond increment, using System.nanoTime
as the base, relative to the JVM start time. This value may be negative to begin with, but always increments.
LoggingEvent
already has a timestamp associated with it, but that timestamp is generated by System.currentTimeMillis
. For most appenders, this would be fine, but appending to Blacklite is so fast that several hundred events can be appended in the same millisecond, exceeding the resolution and making direct time comparison difficult. Adding a relative_ns
field provides resolution down to the nanosecond, sort of.
Technically, sorting on the flake id would also sort correctly, but would not solve the problem of determining how much after the second record was, and so the best thing to do here is add an extra field.
For example, here's two different records with the same millisecond.
{"id":"FfwJtsNLLWo6O0Qbm7EAAA","relative_ns":11808036,"tse_ms":1584163603315,"start_ms":null,"@timestamp":"2020-03-14T05:26:43.315Z","@version":"1","message":"HikariPool-2 - Start completed.","logger_name":"com.zaxxer.hikari.HikariDataSource","thread_name":"play-dev-mode-akka.actor.default-dispatcher-7","level":"INFO","level_value":20000}
{"id":"FfwJtsNLLWo6O0Qbm7EAAB","relative_ns":11981656,"tse_ms":1584163603315,"start_ms":null,"@timestamp":"2020-03-14T05:26:43.315Z","@version":"1","message":"jdbc-appender-pool-1584163602961 - Start completed.","logger_name":"com.zaxxer.hikari.HikariDataSource","thread_name":"logback-appender-ASYNC_JDBC-2","level":"INFO","level_value":20000}
Note that the timestamp is 2020-03-14T05:26:43.315Z
and the time since epoch is 1584163603315
. The flake ids distinguish between log entries by using a counter when millisecond precision is exceeded, so the first record is FfwJtsNLLWo6O0Qbm7EAAA
ending in A
and the second record is FfwJtsNLLWo6O0Qbm7EAAB
ending in B
. The relative time tells us exactly how much time has elapsed between the two events: 11981656 - 11808036
is 0.17362 milliseconds.
One of the more fun things you can do with Terse Logback is to instrument jar files to add logging entry/exit statements at run time. The full documentation is here.
Note that will need to run Play in production mode to load the byte-buddy agent. The easiest way to do this is to run sbt stage
and then run the script:
export PLAY_APP_SECRET=some-long-secret-to-appease-the-entropy-gods
target/universal/stage/bin/terse-logback-showcase -Dconfig.resource=application.prod.conf
See the Procfile
for reference.
So, as an example, let's say that we suspect that there's a bug in the assets builder code base. We instrument the controllers.AssetsBuilder
class by adding the following to the logback.conf
file:
logback.bytebuddy {
service-name = "terse-logback-showcase"
tracing {
# This class doesn't have any tracing built into it, but we can add it using instrumentation
"controllers.AssetsBuilder" = ["*"]
}
}
This means that AssetsBuilder will generate log records at TRACE
level with messages like:
entering: controllers.AssetsBuilder.play$api$mvc$Results$_setter_$Continue_$eq(play.api.mvc.Result) with arguments=[Result(100, Map())] from source Assets.scala:715
exiting: controllers.AssetsBuilder.play$api$mvc$Results$_setter_$Continue_$eq(play.api.mvc.Result) with arguments=[Result(100, Map())] => (return_type=void return_value=null) from source Assets.scala:715
As with the other trace messages, these will all be written out to the blacklite appender. Note that these traces will not have a correlation id identifying them as part of the request, as the correlation id is added in application code.
Error Reporting is done through the error handlers. There are two integrations, Sentry and Honeycomb.
Error reporting makes use of diagnostic log entries tied to the request id, which is treated as the correlation id.
The SentryHandler
uses the out of the box sentry client to send error information to Sentry.
Backtraces are sent by querying the LogEntryFinder
for log entries matching the request id, and mapping them into breadcrumbs. The Breadcrumb API doesn't have a "TRACE" level, so all breadcrumbs must be logged at "DEBUG".
The HoneycombHandler
uses the event API through terse-logback honeycomb client.
All errors are sent in as traces to Honeycomb using manual tracing, with the error as the root span. Backtraces are sent by sending span events, since log entries do not have a duration in themselves.
Make sure you have flyctl installed.
The fly.toml
file is already set up, will need to change HONEYCOMB_DATASET
and the app name to your instance. Fly will handle the HTTPS automatically.
Set the secrets in fly.io appropriately. You can get some random output by running head -c 32 /dev/urandom | base64
.
fly secrets set PLAY_APP_SECRET=<secret>
fly secrets set SENTRY_DSN=$SENTRY_DSN
fly secrets set HONEYCOMB_API_KEY=$HONEYCOMB_API_KEY
To deploy or launch on fly.io:
sbt clean stage docker:publishLocal
cd target/docker/stage
# fly launch with copied fly.toml if not already running...
fly deploy --app terse-logback-showcase
Starting and stopping the machine is a bit odd, because the id is only available from the "Allocations" section on the monitoring page, and you still need to include the --app
flag.
fly machine stop 651b3f43 --app terse-logback-showcase