refactor(trace): reimplement Entry to reduce allocations and simplify code #307

dnut · 2024-10-08T22:18:17Z

This changes the Entry type so that it directly contains any logger "fields" within the struct as normal struct fields instead of allocating strings repeatedly along the way. This unlocked some simplifications to the code and allows the code to be more organized. It also reduces allocations which theoretically could have a performance benefit.

There are no meaningful changes to the logger api. You still use it in exactly the same way as before. Note that I haven't changed anything outside the trace module. While it's true the Entry type is different, the entry type is not intended for use outside the logger, so it's not really an api change in my opinion.

Explanation of Change

The basic unchanged idea of the logger is that you call some method like logger.info() and then you're supposed to directly chain that with other method calls. For normal logging, you would just chain that with a call to log or logf as in logger.info().log("hello world"). If you want to include fields in the log, you would put some calls to field in between info and logf, as in logger.info().field("src_addr", src_ip).log("message receive").

The way this already worked is with the builder pattern. The info method returns an item of type Entry which is responsible for tracking all the accumulating state. Then when you finish up by calling Entry.log, this function is responsible for calling back to the logger to actually log the message.

That describes how it was, and how it still is. So, what have I changed?

Previously Entry was a tagged union that kept a bunch of state relevant to the specific logger implementation, with different variants for different logger implementations. The field functions perform allocations to construct the field string in the way that this particular logger wants to have a field string look. Then the log function implements the approach to logging for this particular logger. So basically, each logger implementation had its implementation fractured across two structs, its logger (e.g. ChannelPrintLogger) and its Entry (e.g. ChannelPrintEntry).

With the new change, Entry has only one responsibility. All it does is store the fields. And it stores them exactly as they were passed, instead of formatting them for the specific logger. Likewise, it's just a struct with a single implementation, instead of being a tagged union representing different implementations. When you call log, it just passes the fields directly to the logger. So the loggers now include their own log function instead of that logic being defined in various Entry implementations.

To enable the Entry to store the data as is, without needing to format it or produce any allocations, the Entry struct needs to be generic. The entry contains a generic Fields struct that contains whatever fields were added so far. Each time you add a field to the Entry, it returns a new version of Entry that has a fields struct with one more field.

Reduce Allocations

This reduces allocations to either 0 or 1 allocation per log message.

For the direct print logger, there are zero allocations needed. The fields and the log message are directly formatted by the stderr writer.

For the channel logger, you only need a single allocation. The fields and the log message are written to a single allocated string. That string is sent over the channel, and then written to stderr.

Code Simplification

Several layers of the code were able to be removed. For example I was able to remove some code from each logger implementation that was redundant with the ScopedLogger interface. I was also able to simplify log formatting because there's no longer a need to bundle up all the parameters into a struct and pass that around before formatting it. The data can be directly formatted to a single string.

The main simplification is that the Entry is just a basic struct containing fields. It doesn't need to do anything more than add fields to a struct. The tradeoff here is that you need to understand comptime code to fully grasp how the entry works under the hood. But the comptime code is fully encapsulated in a single place that doesn't concern itself with any other responsibilities, so that knowledge is not needed unless you're interested making an improvement to that very mechanism.

Code Clarity & Organization

The code is clearer and more organized because the logger implementations fully encapsulate what it means to log with that logger. You no longer need to fracture a logger implementation halfway between the logger and halfway in some struct that represents an undefined "entry" concept. Previously the entry was just a spillover of the logger and it was unclear why it needs to exist, and just caused confusion until you fully understood the whole logger paradigm. Now it's very clear. The entry holds the fields, and nothing else. The logger implements logging in its entirety.

The code is also clearer now in what it means to "implement a logger". All you need to do is implement a struct with a log method conforming to the function signature expected by ScopedLogger.private_log. Implement that one method, and you have a fully functional logger.

src/trace/entry.zig

InKryption · 2024-10-08T23:55:51Z

By the way, interesting approach with the entry names used as the field names directly. It seems like it's the right way to go, and gives us the nice property that entry names are statically verified at compile time to each be unique (I assume we don't need multiple same-named entries anyway).

src/trace/entry.zig

Rexicon226

A couple points to fix up.

To add to my "removing inline" comments, inline is a very specific keyword that should only be used to 1. propagate comptime-ness in specific functions or 2. there is an obvious and clear benefit from it, either by i.e removing call frames, LLVM failing to inline the call itself, or something else.

Just putting inline everywhere is terrible and will cause slowdowns in runtime and compile time. By using inline, you're making the optimizer optimize every single callsite individually, instead of optimizing the function itself and then inlining the callsites as needed. None of the functions here that used inline would benefit or have a clear usecase for inline.
A small rant to explain for future usages as well :)

src/trace/entry.zig

src/trace/log.zig

…lining the init function's code into the only place it is used

0xNineteen · 2024-10-11T14:18:00Z

removing my review on this - seems like theres good discussion on this and plenty of reviews - trust your guys choice

Rexicon226

overall looks good to me

InKryption

lgtm

dnut added 2 commits October 8, 2024 18:14

perf(trace): new Entry to reduce allocations and simplify code

4d3d320

refactor(trace): remove unused imports

03ae91a

Rexicon226 self-requested a review October 8, 2024 23:46

InKryption reviewed Oct 8, 2024

View reviewed changes

src/trace/entry.zig Outdated Show resolved Hide resolved

dadepo reviewed Oct 9, 2024

View reviewed changes

src/trace/entry.zig Show resolved Hide resolved

dadepo reviewed Oct 9, 2024

View reviewed changes

src/trace/entry.zig Show resolved Hide resolved

dadepo reviewed Oct 9, 2024

View reviewed changes

src/trace/entry.zig Show resolved Hide resolved

rafactor(trace): simplify entry field struct definition

80e603c

dnut marked this pull request as ready for review October 10, 2024 15:26

dnut requested a review from 0xNineteen October 10, 2024 15:42

Rexicon226 requested changes Oct 10, 2024

View reviewed changes

dnut added 2 commits October 10, 2024 13:03

fix(trace): use monotonic instead of unordered

d9dc8e7

refactor(trace): uninline Entry functions. remove init by manually in…

f0c6b32

…lining the init function's code into the only place it is used

0xNineteen removed their request for review October 11, 2024 14:17

dnut requested a review from Rexicon226 October 11, 2024 15:22

dnut self-assigned this Oct 11, 2024

Rexicon226 approved these changes Oct 15, 2024

View reviewed changes

InKryption approved these changes Oct 15, 2024

View reviewed changes

dnut merged commit 4c01fb2 into main Oct 15, 2024
6 checks passed

InKryption deleted the dnut/new-logger-improvements branch October 15, 2024 16:23

This was referenced Oct 15, 2024

refactor(core,accountsdb,bincode,net): Fixes, improvements, renames + discuss #315

Merged

feat(accountsdb): improved struct/var naming #320

Merged

Rexicon226 mentioned this pull request Nov 25, 2024

feat(turbine): implement retransmit stage #247

Merged

22 tasks

dnut restored the dnut/new-logger-improvements branch December 20, 2024 12:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(trace): reimplement Entry to reduce allocations and simplify code #307

refactor(trace): reimplement Entry to reduce allocations and simplify code #307

dnut commented Oct 8, 2024 •

edited

Loading

InKryption commented Oct 8, 2024

Rexicon226 left a comment

0xNineteen commented Oct 11, 2024

Rexicon226 left a comment

InKryption left a comment

refactor(trace): reimplement Entry to reduce allocations and simplify code #307

refactor(trace): reimplement Entry to reduce allocations and simplify code #307

Conversation

dnut commented Oct 8, 2024 • edited Loading

Explanation of Change

Reduce Allocations

Code Simplification

Code Clarity & Organization

InKryption commented Oct 8, 2024

Rexicon226 left a comment

Choose a reason for hiding this comment

0xNineteen commented Oct 11, 2024

Rexicon226 left a comment

Choose a reason for hiding this comment

InKryption left a comment

Choose a reason for hiding this comment

dnut commented Oct 8, 2024 •

edited

Loading