-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Adopts *Writable* BaggageContext (Better) #167
Conversation
Whoop I PRed from master based branc... fixing |
e0e97f9
to
0d6d56b
Compare
0d6d56b
to
9ecf61a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this change – and it is totally in line with what I expected the Baggage change to look like! There are two questions about the use of var _logger
?
var baggage: Baggage | ||
|
||
let invokedFunctionARN: String | ||
let deadline: DispatchWallTime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deadlines we'll eventually be able to handle in a distributed way -- so this will eventually move into the baggage.
See: Future: Deadlines and Cancelation patterns slashmo/gsoc-swift-baggage-context#24
// utility | ||
_logger: logger, | ||
eventLoop: eventLoop, | ||
allocator: allocator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iirc evenloop and allocator are immutable and dont need to be at the storage level imo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or is this or allow them to be as part of the by-ref trick?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is to keep the outer Context type small -- just a single pointer, otherwise we'd grow the struct and each time it gets let
or passed somewhere those references would be copied -- this way, everything in storage, nothing gets copied other than the single pointer just when passing the context around.
cognitoIdentity: cognitoIdentity, | ||
clientContext: clientContext, | ||
// utility | ||
_logger: logger, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why we need underscore for logger or Storage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For logger because it's "underlying logger, don't use this, use logger
on context" for Storage happy to remove the underscore (done)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can relate to doing so for the member, but not for the ctor argument
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah i see, got it will do
/// | ||
/// - note: The `LogLevel` can be configured using the `LOG_LEVEL` environment variable. | ||
public let logger: Logger | ||
public var logger: Logger { | ||
self.storage._logger.with(self.baggage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't remember what with() does... does it mean every logger access is a new instance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, every context.logger
is a "new logger" but a logger is a cheap value type so that's not terrible.
This has to be like that because we need the logger to point to the "latest" baggage values. As everything here has value semantics, we can't just pass a reference to the context or baggage into the handler, and have to pass it when we're about to use it.
The with
created LogHandler is lazy though -- it does not create Logger.Metadata yet unless a log statement is actually triggered.
Alternatively, this could be done on a baggage { didSet { ...
if you'd prefer that but I think that's actually hard to implement correctly without growing memory forever... we'd need to notice "aha, this logger already has a baggage logger, so we need to unwrap it and then wrap again with a baggage aware one" to prevent it growing infinitely with Logger(BaggageLogHandler(BaggageLogHandler(BaggageLogHandler(BaggageLogHandler(BaggageLogHandler(BaggageLogHandler(...
if you know what I mean... This is hard to implement because handler
is private in logging, so we can't check for it... Implementing it that way may require making the handler public in SwiftLog
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I think this could actually add overhead for cases that log allot. I think keeping state on when baggage changed otherwise returning the existing logger could be a good optimization. maybe even coded into BaggageContext so users dont have to reinvent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I revisited this and will try to implement via an "materialize once" into logging metadata. You're right that this is painful if there's a lot of logging, and we should not make that painful :\
It won't be doable as a log handler I guess, so that goes out the window; really I guess what would solve the needless copying is to make Logging depend on Baggage, and then we can remove its storage for metadata and replace it with baggage, avoiding the copying alltogether.
Not sure when to pitch this, it'd be a logging 2.0 I guess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed this -- metadata is now set eagerly and we won't get into crazy copying anymore.
Thanks for reminding me about this, it's kind of obvious in hindsight but I missed it.
Some numbers how the copying "eagerly" amortizes (that's for 10k logs); it's just bad for "never logs" cases but one can assume those don't really happen in a large "actual" system (vs a library) tbh.
The logger.with
lazy-pattern remains available when one wants to use it instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(orange/red are the logger.with(baggage)
)
deadline: DispatchWallTime, | ||
cognitoIdentity: String?, | ||
clientContext: String?, | ||
_logger: Logger, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove underscore from here
…s-lambda-runtime into wip-writable-safe-context
self.storage.baggage | ||
} | ||
set { | ||
if isKnownUniquelyReferenced(&self.storage) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please correct me if i am wrong, as far as I understand it:
-
everything (new storage object) is copied every time baggage is updated, which will happen every time new span is created = often
-
this does implement copy on write however changing of the storage is not thread safe which I think is not a nice API for user who would need to sync the changes himself
see discussion here https://gist.github.com/drewmccormack/b1c4487935cf3c3e0a5feaf488a95ebd#gistcomment-2826755
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'm aware of that rule, it could be that it's misapplied here after all... I'll dig into it again.
With the latest changes in context perhaps actually we can get away without the Storage completely after all actually. Meh we'd want to keep CoW here to keep the lambda context small. I'll look into this shortly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pokryfka, so no. This is implementation correct. (You had me worried there for a second! 😉)
This is as thread safe as an Int
or any other value semantics type is.
isKnownUniquelyReferenced
is commonly misunderstood, so I understand where your confusion is coming from. However such implementation is safe by itself. Now how someone uses this type may or may not be thread-safe -- and that is what the example you're linking is doing wrong. The same kind of race can be exhibited on even a plain integer value. This type provides value semantics guarantees, and no more.
everything (new storage object) is copied every time baggage is updated, which will happen every time new span is created = often
I disagree about the "often" quantification used here. It is by far not as often as a raw struct would be copying -- each passing around and reassigning to a new variable even would technically then be copying.
We're getting the best of both worlds; and simply following the known CoW semantics that are frequent in Swift.
this does implement copy on write however changing of the storage is not thread safe which I think is not a nice API for user who would need to sync the changes himself
The operations in this piece of code are safe. If they weren't then Array and all of Swift's collections are screwed as well 😉 We are providing the same semantics as those types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
everything (new storage object) is copied every time baggage is updated, which will happen every time new span is created = often
I disagree about the "often" quantification used here. It is by far not as often as a raw struct would be copying -- each passing around and reassigning to a new variable even would technically then be copying.
We're getting the best of both worlds; and simply following the known CoW semantics that are frequent in Swift.
it used to be a "const reference" and so only the reference (pointer) was copied.
I dont know how expensive is copying of a small struct, and how much more expensive is allocating the space for it on heap rather than on stack; perhaps its not a real issue at all.
if the goal was to avoid copying of the lambda context, perhaps Context
should have a reference to "static storage" and a reference to CoW "mutable storage" (the baggage). That way 2 references would always be copied but the content of the "static storage" would not need to be copied every time the baggage changes.
this does implement copy on write however changing of the storage is not thread safe which I think is not a nice API for user who would need to sync the changes himself
The operations in this piece of code are safe. If they weren't then Array and all of Swift's collections are screwed as well 😉 We are providing the same semantics as those types.
Thank you for rechecking it.
I do find it a bit confusing...
if I understand that it will be safe because set
is mutating
.
struct Context {
func notSafeUpdateBaggage(_ baggage: Baggage) {
// update storage
}
// need to make a copy which will increase the storage reference counter
mutating func updateBaggage(_ baggage: Baggage) {
// update storage
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the goal was to avoid copying of the lambda context, perhaps Context should have a reference to "static storage" and a reference to CoW "mutable storage" (the baggage). That way 2 references would always be copied but the content of the "static storage" would not need to be copied every time the baggage changes.
Yeah if it becomes a problem I think we could do that 🤔 👍
I do find it a bit confusing...
if I understand that it will be safe because set is mutating.
Maybe this example can illustrate: Because this is a struct, in order to change it the function must be mutating (or set {}
which implicitly is).
Then if anyone has:
let context = ...
they can't mutate it unless a second reference is created:
let context = ...
var changeMe = context
which ensures that we'll have the +1 count there and will copy on mutation.
It still is possible to make bugs like:
class Me {
var onlyReference: Lambda.Context = ...
for t in threads {
t.run { onlyReference.mutate() }
}
that remains illegal as usual, but not because of the CoW or Context type being wrong but how the user uses it being unsafe. The same example is true for e.g. var int: Int
which you'd try to mutate +=1
from many threads. There's nothing wrong about the Int type, but about the unsafe modifications to where we store it.
closing in inactive PRs, feel free to re-open if still relevant |
This adopts the baggage API in the shape of slashmo/gsoc-swift-baggage-context#34
I have discussed with @fabianfett and @tomerd a little bit about this.
This adoption is preferable I believe, because of comments that @adam-fowler raised in slashmo/gsoc-swift-baggage-context#34 (comment) In general, this means is also ready for the
Context
requiring aset/get
on the baggage, making lifes easier for other folks.Specifically, we talked with @fabianfett why the
Lambda.Context
context type was a class -- since it was immutable, it avoided copying around things a lot. We can achieve the same niceness by making it a CoW type, rather than just a struct. It's a bit of work but worth it in the long run as we get the best of both worlds.