Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(trace): reimplement Entry to reduce allocations and simplify code #307

Merged
merged 5 commits into from
Oct 15, 2024

Conversation

dnut
Copy link
Contributor

@dnut dnut commented Oct 8, 2024

This changes the Entry type so that it directly contains any logger "fields" within the struct as normal struct fields instead of allocating strings repeatedly along the way. This unlocked some simplifications to the code and allows the code to be more organized. It also reduces allocations which theoretically could have a performance benefit.

There are no meaningful changes to the logger api. You still use it in exactly the same way as before. Note that I haven't changed anything outside the trace module. While it's true the Entry type is different, the entry type is not intended for use outside the logger, so it's not really an api change in my opinion.

Explanation of Change

The basic unchanged idea of the logger is that you call some method like logger.info() and then you're supposed to directly chain that with other method calls. For normal logging, you would just chain that with a call to log or logf as in logger.info().log("hello world"). If you want to include fields in the log, you would put some calls to field in between info and logf, as in logger.info().field("src_addr", src_ip).log("message receive").

The way this already worked is with the builder pattern. The info method returns an item of type Entry which is responsible for tracking all the accumulating state. Then when you finish up by calling Entry.log, this function is responsible for calling back to the logger to actually log the message.

That describes how it was, and how it still is. So, what have I changed?

Previously Entry was a tagged union that kept a bunch of state relevant to the specific logger implementation, with different variants for different logger implementations. The field functions perform allocations to construct the field string in the way that this particular logger wants to have a field string look. Then the log function implements the approach to logging for this particular logger. So basically, each logger implementation had its implementation fractured across two structs, its logger (e.g. ChannelPrintLogger) and its Entry (e.g. ChannelPrintEntry).

With the new change, Entry has only one responsibility. All it does is store the fields. And it stores them exactly as they were passed, instead of formatting them for the specific logger. Likewise, it's just a struct with a single implementation, instead of being a tagged union representing different implementations. When you call log, it just passes the fields directly to the logger. So the loggers now include their own log function instead of that logic being defined in various Entry implementations.

To enable the Entry to store the data as is, without needing to format it or produce any allocations, the Entry struct needs to be generic. The entry contains a generic Fields struct that contains whatever fields were added so far. Each time you add a field to the Entry, it returns a new version of Entry that has a fields struct with one more field.

Reduce Allocations

This reduces allocations to either 0 or 1 allocation per log message.

For the direct print logger, there are zero allocations needed. The fields and the log message are directly formatted by the stderr writer.

For the channel logger, you only need a single allocation. The fields and the log message are written to a single allocated string. That string is sent over the channel, and then written to stderr.

Code Simplification

Several layers of the code were able to be removed. For example I was able to remove some code from each logger implementation that was redundant with the ScopedLogger interface. I was also able to simplify log formatting because there's no longer a need to bundle up all the parameters into a struct and pass that around before formatting it. The data can be directly formatted to a single string.

The main simplification is that the Entry is just a basic struct containing fields. It doesn't need to do anything more than add fields to a struct. The tradeoff here is that you need to understand comptime code to fully grasp how the entry works under the hood. But the comptime code is fully encapsulated in a single place that doesn't concern itself with any other responsibilities, so that knowledge is not needed unless you're interested making an improvement to that very mechanism.

Code Clarity & Organization

The code is clearer and more organized because the logger implementations fully encapsulate what it means to log with that logger. You no longer need to fracture a logger implementation halfway between the logger and halfway in some struct that represents an undefined "entry" concept. Previously the entry was just a spillover of the logger and it was unclear why it needs to exist, and just caused confusion until you fully understood the whole logger paradigm. Now it's very clear. The entry holds the fields, and nothing else. The logger implements logging in its entirety.

The code is also clearer now in what it means to "implement a logger". All you need to do is implement a struct with a log method conforming to the function signature expected by ScopedLogger.private_log. Implement that one method, and you have a fully functional logger.

@Rexicon226 Rexicon226 self-requested a review October 8, 2024 23:46
src/trace/entry.zig Outdated Show resolved Hide resolved
@InKryption
Copy link
Contributor

By the way, interesting approach with the entry names used as the field names directly. It seems like it's the right way to go, and gives us the nice property that entry names are statically verified at compile time to each be unique (I assume we don't need multiple same-named entries anyway).

@dnut dnut marked this pull request as ready for review October 10, 2024 15:26
@dnut dnut requested a review from 0xNineteen October 10, 2024 15:42
Copy link
Contributor

@Rexicon226 Rexicon226 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple points to fix up.

To add to my "removing inline" comments, inline is a very specific keyword that should only be used to 1. propagate comptime-ness in specific functions or 2. there is an obvious and clear benefit from it, either by i.e removing call frames, LLVM failing to inline the call itself, or something else.

Just putting inline everywhere is terrible and will cause slowdowns in runtime and compile time. By using inline, you're making the optimizer optimize every single callsite individually, instead of optimizing the function itself and then inlining the callsites as needed. None of the functions here that used inline would benefit or have a clear usecase for inline.
A small rant to explain for future usages as well :)

src/trace/entry.zig Outdated Show resolved Hide resolved
src/trace/entry.zig Outdated Show resolved Hide resolved
src/trace/entry.zig Outdated Show resolved Hide resolved
src/trace/entry.zig Outdated Show resolved Hide resolved
src/trace/log.zig Show resolved Hide resolved
src/trace/log.zig Outdated Show resolved Hide resolved
@0xNineteen 0xNineteen removed their request for review October 11, 2024 14:17
@0xNineteen
Copy link
Contributor

removing my review on this - seems like theres good discussion on this and plenty of reviews - trust your guys choice

@dnut dnut requested a review from Rexicon226 October 11, 2024 15:22
@dnut dnut self-assigned this Oct 11, 2024
Copy link
Contributor

@Rexicon226 Rexicon226 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks good to me

Copy link
Contributor

@InKryption InKryption left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@dnut dnut merged commit 4c01fb2 into main Oct 15, 2024
6 checks passed
@InKryption InKryption deleted the dnut/new-logger-improvements branch October 15, 2024 16:23
@dnut dnut restored the dnut/new-logger-improvements branch December 20, 2024 12:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants