-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(trace): improve trace #249
Conversation
} | ||
var fmt_message = std.io.fixedBufferStream(buf); | ||
const writer = fmt_message.writer(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't passing self.fba_bytes
to this function mean that the logger will request an allocation for the entire backing buffer behind the fba? At that point is there any reason to use an allocator instead of passing the entire buffer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the formatted message, used std.fmt.count
to determine how much allocation to ask from the allocator...https://github.com/Syndica/sig/pull/249/files#diff-a5f543f6d9dfe65e2e547cdf7bd2cd49d8e85d4afccc81a3c51568b0208dfeb7R29
For the k/v fields on the other hand, I have not found how to determine the size. Currently hardcoding it to 512 here https://github.com/Syndica/sig/pull/249/files#diff-a5f543f6d9dfe65e2e547cdf7bd2cd49d8e85d4afccc81a3c51568b0208dfeb7R108
src/trace_ng/log.zig
Outdated
std.time.sleep(std.time.ns_per_ms * 5); | ||
const messages = self.channel.drain() orelse { | ||
// channel is closed | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use std.fmt.count
to determine the number of bytes you need to allocate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/trace_ng/log.zig
Outdated
comptime maybe_fmt: ?[]const u8, | ||
args: anytype, | ||
) logfmt.LogMsg { | ||
// obtain a memory to write to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally you should also realloc the buffer to a smaller size.
When you call free in the other thread, it's going to pass in a len that is smaller than the actual size of the allocation. For the RecycleFBA it does not actually make a difference. But with other allocators it may cause a problem. The GPA will complain about this.
test "invalid free" {
const x = try std.testing.allocator.alloc(u8, 100);
std.testing.allocator.free(x[0..50]);
}
src/trace_ng/log.zig
Outdated
pub fn deinit(self: *Self) void { | ||
self.channel.close(); | ||
if (self.handle) |*handle| { | ||
std.time.sleep(std.time.ns_per_ms * 5); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not 100% sure if this is the best way to fix this, but the issue I found was that in the test case where the scope of the logger was switched (ie essentially having two loggers...the one used by the parent struct and the child struct) it often happens that the log from the child does not make it to the std err before the process dies. It seems the handle.join() sometimes does not work...but if I introduce a std.debug.print
(or the sleep above) it consistently shows up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the scope of the logger was switched (ie essentially having two loggers
This wouldn't be an issue if you use only a single instance of the underlying logger.
src/trace_ng/log.zig
Outdated
defer logger.deinit(); | ||
|
||
var stuff = Stuff.init(&logger); | ||
defer stuff.deinit(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One consequence of the design of having the logger as a state in the struct, is that any struct that makes use of the standard error logger would need to have a deinit
method (in other to be able to free deinit the allocator/channel). Not sure if it's a pro or con, although it does feel like extra responsibility to have to call the deinit method? But then again this is a common practice in Zig, so maybe not that bad?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the typical usage pattern would be that you initialize StdErrLogger somewhere near the main function of your application. You defer deinit in that scope only. Then you convert it to a Logger, which contains a pointer to StdErrLogger. The logger gets passed around everywhere by value, but it never needs to be deinited because it's just a pointer to the logger owned by main, which will be deinited when main returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For scoped logger, does this not bring back the possibility for error where a scoped logger is passed and used in a different scope than was intended?
I felt having the logger be part of the state of the struct, ie
const Foo = struct {
logger: ScopedLogger(Foo Scope)
self.logger.info(..)
}
Would ensure that logging from Foo
would always be scoped to Foo Scope
.
And I am not sure that can be accomplished with the approach you described.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Scope is not a part of the state. It's an immutable part of the type. Copying the logger to another place and using a new scope in that place is not going to make any changes to the scope of the original logger. The original logger will continue using the scope as specified in its type.
src/trace_ng/log.zig
Outdated
channel: *Channel(logfmt.LogMsg), | ||
handle: ?std.Thread, | ||
|
||
pub fn init(config: Config) !*Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason why this this returns a pointer to self? It could just return Self
instead of *Self
, which would simplify the function. Then the calling scope has the choice of how to allocate it. Currently it's forced into allocating the logger with the same allocator that's used for log messages, which may not always be a desirable constraint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason why this this returns a pointer to self?
If the union
holds pointers, ie
pub fn ScopedLogger(comptime scope: ?[]const u8) type {
return union(LogKind) {
...
standard: *StandardErrLogger(scope),
testing: *TestingLogger(scope),
...
}
And ScopedLogger.init(...)
needs to be able to create one of the union variants, then it needs to return a a pointer to self.
With the suggestion here #249 (comment) might be possible to remove them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I see your reasoning now. I think I saw this function before I noticed ScopedLogger.init.
But actually, even if you keep ScopedLogger.init, you could still have this return Self
. ScopedLogger.init could allocate the pointer, and set it to the value returned by this function.
src/trace_ng/log.zig
Outdated
pub fn unscoped(self: Self) Logger { | ||
return .{ | ||
.allocator = self.allocator, | ||
.recycle_fba = self.log_allocator_state, | ||
.max_buffer = self.max_buffer, | ||
.max_level = self.max_level, | ||
.exit_sig = self.exit_sig, | ||
.channel = self.channel, | ||
.handle = self.handle, | ||
}; | ||
} | ||
|
||
pub fn withScope(self: Self, comptime new_scope: anytype) ScoppedLogger(new_scope) { | ||
return .{ | ||
.allocator = self.allocator, | ||
.recycle_fba = self.log_allocator_state, | ||
.max_buffer = self.max_buffer, | ||
.max_level = self.max_level, | ||
.exit_sig = self.exit_sig, | ||
.channel = self.channel, | ||
.handle = self.handle, | ||
}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These functions would fail to compile if they were actually used. ScoppedLogger
is a union that holds only a pointer to the logger implementation.
For these to work it just needs to create the union from the pointer to self. Also I might change the function names to clarify that it's returning a different type. This is consistent with the interface pattern used in zig std.
pub fn unscoped(self: Self) Logger { | |
return .{ | |
.allocator = self.allocator, | |
.recycle_fba = self.log_allocator_state, | |
.max_buffer = self.max_buffer, | |
.max_level = self.max_level, | |
.exit_sig = self.exit_sig, | |
.channel = self.channel, | |
.handle = self.handle, | |
}; | |
} | |
pub fn withScope(self: Self, comptime new_scope: anytype) ScoppedLogger(new_scope) { | |
return .{ | |
.allocator = self.allocator, | |
.recycle_fba = self.log_allocator_state, | |
.max_buffer = self.max_buffer, | |
.max_level = self.max_level, | |
.exit_sig = self.exit_sig, | |
.channel = self.channel, | |
.handle = self.handle, | |
}; | |
} | |
pub fn logger(self: *Self) Logger { | |
return .{ .standard = self }; | |
} | |
pub fn scopedLogger(self: Self, comptime new_scope: anytype) ScoppedLogger(new_scope) { | |
return .{ .standard = self }; | |
} |
src/trace_ng/log.zig
Outdated
/// A ScopedLogger could either be: | ||
/// - A StandardErrLogger | ||
/// - A TestingLogger | ||
pub fn ScoppedLogger(comptime scope: ?[]const u8) type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub fn ScoppedLogger(comptime scope: ?[]const u8) type { | |
pub fn ScopedLogger(comptime scope: ?[]const u8) type { |
src/trace_ng/log.zig
Outdated
pub fn init(config: Config) !Self { | ||
switch (config.kind) { | ||
.standard => { | ||
return .{ .standard = try StandardErrLogger(scope).init(.{ | ||
.allocator = config.allocator, | ||
.max_level = config.max_level, | ||
.max_buffer = config.max_buffer, | ||
}) }; | ||
}, | ||
.testing, .noop => { | ||
return .{ .testing = TestingLogger(scope).init(.{ | ||
.allocator = config.allocator, | ||
.max_level = config.max_level, | ||
.max_buffer = config.max_buffer, | ||
}) }; | ||
}, | ||
} | ||
} | ||
|
||
pub fn deinit(self: *const Self) void { | ||
switch (self.*) { | ||
.standard => |*logger| { | ||
logger.*.deinit(); | ||
}, | ||
.testing => |*logger| { | ||
logger.*.deinit(); | ||
}, | ||
.noop => {}, | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need these functions? The caller could instead initialize whatever logger implementation they want, and convert it to the interface.
const std_logger = StdErrLogger.init(...);
const logger = std_logger.logger();
This approach is more flexible because it doesn't couple all the logger init
functions together to have a unified set of dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea behind the current approach is to have the caller call one init function (specifying the kind of logger they want) and they good to go ie:
const std_logger = Logger.init(.{.. .kind = LogKind.standard,})
Your approach would be two calls, but it might indeed be more flexible since it removes the init from the interface. I'll play around with it and see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One reason I mentioned this approach is because in zig this is the typical pattern used by interfaces like Allocator, Random, Reader, Writer, etc. It's also common in other languages like rust and java to initialize a concrete type before casting it as the trait or interface.
I have also seen an approach like yours before, where you init the interface directly. This is basically the factory pattern. But the main difference is that the factory pattern usually doesn't give the caller direct choice over which type is going to be used for the implementation. In my experience, I've seen the factory pattern used to abstract away the logic that chooses which implementation to use. This allows the caller to just say "give me any logger", and the factory method has some special knowledge to make the correct decision about what logger to provide.
src/trace_ng/log.zig
Outdated
pub fn unscoped(self: *const Self) !Logger { | ||
switch (self.*) { | ||
.standard => |logger| { | ||
return Logger.init(.{ | ||
.allocator = logger.*.allocator, | ||
.max_buffer = logger.*.max_buffer, | ||
.kind = LogKind.standard, | ||
}); | ||
}, | ||
.testing => |logger| { | ||
return Logger.init(.{ | ||
.allocator = logger.*.allocator, | ||
.kind = LogKind.testing, | ||
}); | ||
}, | ||
.noop => { | ||
@panic("Cannot scope noop"); | ||
}, | ||
} | ||
} | ||
|
||
pub fn withScope(self: *const Self, comptime new_scope: []const u8) !ScoppedLogger(new_scope) { | ||
switch (self.*) { | ||
.standard => |*logger| { | ||
return ScoppedLogger(new_scope).init(.{ | ||
.allocator = logger.*.allocator, | ||
.max_buffer = logger.*.max_buffer, | ||
.kind = LogKind.standard, | ||
}) catch @panic("message: []const u8"); | ||
}, | ||
.testing => |*logger| { | ||
return ScoppedLogger(new_scope).init(.{ | ||
.allocator = logger.*.allocator, | ||
.kind = LogKind.testing, | ||
}) catch @panic("message: []const u8"); | ||
}, | ||
.noop => { | ||
@panic("Cannot scope noop"); | ||
}, | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use the existing logger instead of creating a new one? Also what's wrong with scoping noop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use the existing logger instead of creating a new one?
That is due to the pattern of passing in a logger, and allowing the struct use that to create a new one that is scoped.
const StuffChild = struct {
const StuffChild = @This();
logger: ScopedLogger(@typeName(StuffChild)), // <- Scoped logger needed by struct
pub fn init(logger: *const Logger) StuffChild {
// ↓ New scoped logger created ↓
return .{ .logger = logger.withScope(@typeName(StuffChild)) catch {
@panic("Init logger failed");
} };
}
Also if this is not done this way, ie, if the struct uses existing one, won't there be the risk of changing the scope externally?
Also what's wrong with scoping noop?
Nothing. It could technically be scoped, just did not pay much attention to it yet and defaulted to panic. But can update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The instance of the underlying logger implementation can remain the same while changing the scope.
pub fn unscoped(self: Self) Logger {
return switch (self) {
.standard => |logger| .{ .standard = logger },
.testing => |logger| .{ .testing = logger },
.noop => .noop,
};
}
pub fn withScope(self: Self, comptime new_scope: []const u8) !ScoppedLogger(new_scope) {
return switch (self) {
.standard => |logger| .{ .standard = logger },
.testing => |logger| .{ .testing = logger },
.noop => .noop,
};
}
src/trace_ng/log.zig
Outdated
pub const Logger = ScoppedLogger(null); | ||
|
||
/// An instance of `ScopedLogger` that logs to the standard err. | ||
pub fn StandardErrLogger(comptime scope: ?[]const u8) type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could make scope
a parameter for the log functions instead of being a parameter for the logger implementation types. The logger interface would still need it, but the implementations of the interface do not need it.
Personally I feel that it simplifies the code because this can be defined as a normal struct instead of a generic struct. It cleans up an unnecessary layer of abstraction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if/how that would work since the implementation contains logic that refers to the scope. For example:
pub fn log(self: *Self, level: Level, message: []const u8) void {
if (@intFromEnum(level) > @intFromEnum(self.max_level)) {
// noop
return;
}
const maybe_scope = if (scope) |s| s else null; // <---- The scope needed here
const log_msg = logfmt.LogMsg{
.level = level,
.maybe_scope = maybe_scope,
.maybe_msg = message,
.maybe_fields = null,
.maybe_fmt = null,
};
self.channel.send(log_msg) catch |err| {
std.debug.print("Send msg through channel failed with err: {any}", .{err});
return;
};
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless the method on the implementation be modified to take scope like this
pub fn log(self: *Self, comptime scope: ?[]const u8, level: Level, message: []const u8) void
and then modify the interface to supply the scope when calling the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 460cf27
if (message.maybe_fields) |fields| { | ||
self.log_allocator.free(fields); | ||
} | ||
if (message.maybe_fmt) |fmt_msg| { | ||
self.log_allocator.free(fmt_msg); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These frees occur while holding the stderr lock. But you don't actually need to hold the lock to execute this. To reduce contention, you could ensure the lock is only held while actually calling writeLog.
src/trace_ng/log.zig
Outdated
}; | ||
defer self.channel.allocator.free(messages); | ||
for (messages) |message| { | ||
const writer = std.io.getStdErr().writer(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it makes any difference, but you could also create this outside the loop. It might improve performance. Is there an advantage of creating it inside the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think same comment here #249 (comment) applies here too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see an advantage of acquiring the lock within the loop, but I think this is different. What's the advantage of instantiating the writer inside the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually misread the comment to be referring to the lock acquisition.
What's the advantage of instantiating the writer inside the loop?
I don't think there is any. Removed in 45c4535
src/trace_ng/log.zig
Outdated
defer self.channel.allocator.free(messages); | ||
for (messages) |message| { | ||
const writer = std.io.getStdErr().writer(); | ||
std.debug.lockStdErr(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This acquires and releases the lock once for every log message. Another approach could acquire the lock once, log all messages, then release the lock.
I'm not sure what's best. Here's the tradeoff that I'm imagining...
Acquiring the lock only once for all messages means other threads are blocked for more time before they can access stderr. This could delay other important messages from being seen.
Acquiring the lock separately for each message uses more computation because it needs to execute atomic operations on every loop iteration.
For that tradeoff, I think the best answer depends on how many threads are writing to stderr, and how responsive you need them to be. If there are multiple threads writing to stderr, and there is likely to be a large number of messages in here that will take a long time to process, then it might be best to acquire the lock separately for each message. But if this is the only thread that you expect to write to stderr, or if you know that message batches will typically be small, then it might be optimal to only acquire the lock once for the entire loop. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the base case is to acquire the lock for each message written. This is what the std lib log does (alongside using buffered writer), and also debug.print. And given that we are offloading the writing out of the log message off to another thread, hopefully the computation needed to acquire/release the std err does not easily affect other working threads.
Also giving that the stderr is a shared resource, i'll err on the side of not holding on to it too much. Even though the likelihood of that happening is low in this case, if there is a bug in the logger, where the loop runs for more time than needed, there is the possibility of hogging unto the shared resource for more than needed.
So maybe acquire the lock for each message (as it is currently) but if in practice, this proves to have a noticeable performance impact, then it can be revisited?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm - we'll want this fully integrated (production ready) before we merge this to main - to make reviewing easier (like you mentioned previously), could you create another PR which branches off this PR which integrates the new logger?
once integration is reviewed and approved well merge both into main
Since the API has been updated in #277 I'll close this one and have that target main intead. |
Key points:
Make formatter pluggableFor laterMake output (writer) pluggableFor laterOutput currently