Update Visitor traversal logic to use statically defined Field descriptors #1504
base: main
Conversation
Before I dive into the details of the PR, can we please split the `any` changes out into a separate PR? These changes generate a ton of noise.
Force-pushed from 24a323f to 04f0eaf.
Re the size changes: it's been a while since I looked into this, but at one point, I believe for readability and maybe for (debug?) code sizes, we were special casing generation for when a …
Do you see any impact to runtime memory use?
I'm debating whether your approach is preferable to changing the existing traverse methods to visit every field unconditionally. For example, I wonder if duration.pb.swift should change from the conditional form to an unconditional one.
The latter form would likely be similar in performance to yours, with similar code size and memory usage to the existing code. Specifically, your code introduces several new generic types, which might need to be instantiated at runtime, and it adds some new large data structures that might get lazily allocated at runtime. For large projects with lots of protos, those could be problems. (I'm not certain how these different impacts will play out, but it's something I'd like to measure before committing to a particular approach.)
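To make the two shapes concrete, here's a minimal sketch (hypothetical message and visitor callback, not the actual generated code) of conditional traversal versus unconditional traversal where the visitor decides whether to skip defaults:

```swift
// Hypothetical two-field message; names are illustrative only.
struct ExampleMessage {
    var seconds: Int64 = 0
    var nanos: Int32 = 0

    // Current generated style: check for default values before visiting.
    func traverseConditionally(visit: (String, Int64) -> Void) {
        if seconds != 0 { visit("seconds", seconds) }
        if nanos != 0 { visit("nanos", Int64(nanos)) }
    }

    // Alternative style: visit every field unconditionally and let the
    // visitor itself decide whether a default value should be skipped.
    func traverseUnconditionally(visit: (String, Int64) -> Void) {
        visit("seconds", seconds)
        visit("nanos", Int64(nanos))
    }
}
```

The unconditional form trades a few extra calls per message for smaller, branch-free generated code.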
I haven't done any memory performance benchmarking yet. I can work on that tonight and tomorrow. I expect the addition of these static fields will cause an increase in memory usage, although I don't expect the impact to be too high, since I think your outlined approach would be comparable. The main downside of that approach is that it makes it impossible to make changes to the implementation of …
I'm not sure yet; that's why I'd like to get more information about the overhead of different approaches. I recall experimenting with a lot of different variations before settling on the current design. The future possibilities you list don't really sound very compelling to me: visiting default/unset fields is something we can implement for 2.0 as a breaking change in a couple of different ways. (And note that your change does break the API between the generated code and the library, so we can only adopt it for 2.0.) No other new options have been suggested in a few years now, so I'm not too worried about trying to allow for future expansion there. Parallelizing also doesn't seem very appealing -- I doubt the overhead would be worth it. The most interesting aspect of your proposal to me is that it could let us re-implement `==`.
I never really thought about it until looking at this snippet: we fetch the value twice in these calls. What does the compiler do in these cases? Does it optimize it down to a single fetch, or are they really two property accesses? Is it worth generating different code to use a local to only fetch the value once? Is that smaller or larger? Faster?
I do like this idea; not having to generate equality would be nice. Having fields as a list also makes it that much more likely that we can revisit the binary serialization and do it in reverse to remove the sizing pass.
Since all the code here is in the same source file and the getters are pretty trivial, the compiler should routinely inline the getter, and the usual compiler optimizations should do the right thing from that point on. If anyone sees otherwise (in Release builds), the compiler team would appreciate a bug report.
I recently managed to convince myself that reversing the order of fields is not a problem. So I'm just waiting for some spare time to sit down and actually implement this. |
I did some memory testing, and the proposed solution appears to use ~1.1x more memory. Testing strategy: … The existing implementation used ~15.3 MB; the proposed implementation used ~16.7 MB.
I want to go forward with this, since it gives us a lot of nice things: …
The tricky work will be experimenting with different ways to implement the list of fields to properly balance code size, performance, and memory usage. A 10% growth in memory usage is probably more of a concern than the minor performance loss described above, so I think that's where we should focus the most attention.
Here are a bunch more questions for you. These are mostly to help me better understand what options you've already considered and why you decided this particular approach was best. I also hope to find some time to do more of my own testing of this PR soon.
Generally speaking, I'm most interested in reducing code size and memory usage at this point. Our performance is already pretty good, so I'm willing to trade some of that for reduced code size and/or memory usage. Code in the library itself is relatively cheap; if 10 copies of a function in the library will let us drop one function from the generated code or save a few kilobytes for each message in memory, it's almost certainly worth it. Our library is relatively small compared to the volume of generated code for people who have hundreds of protobuf messages.
As I mentioned, I think the really big win here will come from reimplementing `==` using the field information to traverse both objects together.
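As a sketch of that idea (with a deliberately simplified, hypothetical descriptor type, not the PR's actual `Field<M>`), equality driven by a static field list could look like:

```swift
// Hypothetical simplified field descriptor: each entry knows how to compare
// its field across two instances of the message.
struct EqField<M> {
    let isEqual: (M, M) -> Bool

    static func make<V: Equatable>(_ get: @escaping (M) -> V) -> EqField<M> {
        EqField(isEqual: { get($0) == get($1) })
    }
}

// Hypothetical message type; in the PR this table would be generated.
struct DemoMessage {
    var id: Int = 0
    var name: String = ""

    static let eqFields: [EqField<DemoMessage>] = [
        .make { $0.id },
        .make { $0.name },
    ]

    static func == (lhs: DemoMessage, rhs: DemoMessage) -> Bool {
        // Walk the static field list instead of generating a bespoke ==.
        Self.eqFields.allSatisfy { $0.isEqual(lhs, rhs) }
    }
}
```

The generated code then only needs the table; the comparison loop lives once in the library.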
```swift
        }
    },
    .singularEnum({ $0.requestedOutputFormat }, fieldNumber: 3, defaultValue: .unspecified),
    .singularString({ $0.messageType }, fieldNumber: 4),
```
How does performance and memory usage change if you use a key path here (`\.messageType`) instead of an explicit closure?
My original approach used key paths, but they seemed to have slightly worse runtime performance than closures, which I found surprising. I haven't looked at memory performance though.
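For anyone comparing the two, here's a minimal illustration (hypothetical `Msg` type) of the closure-based getter next to the key-path form; both read the same property, but the key path goes through the key-path runtime machinery, which is the suspected source of the performance difference:

```swift
// Hypothetical message with a single field.
struct Msg { var messageType: String = "" }

// Closure-based getter, as in the PR's generated field descriptors.
let closureGetter: (Msg) -> String = { $0.messageType }

// Key-path-based getter; applied via the keyPath subscript at the call site.
let keyPathGetter: KeyPath<Msg, String> = \.messageType

let sample = Msg(messageType: "demo")
let fromClosure = closureGetter(sample)
let fromKeyPath = sample[keyPath: keyPathGetter]
```

Functionally interchangeable, so the choice comes down to measured speed, code size, and memory.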
Speculatively, a lot of folks use key paths, so there's pressure on the compiler folks to speed those up. Using key paths would let us benefit from those improvements. Hmmm.... That assumes they actually happen, of course, and we don't know how much improvement might come about someday, so it's not obvious that we should go down that path right now.
Yeah, I'm indifferent; I think both are fine, I just picked the one that was faster at runtime.
```swift
            return nil
        }
    },
    .singularEnum({ $0.requestedOutputFormat }, fieldNumber: 3, defaultValue: .unspecified),
```
For Proto3, the default enum value is always the zero value. Does it help to split out the proto2 and proto3 enum cases?
Don't special case based on syntax alone. We can likely reduce code size by making the cases with presence explicit, and then for the ones that don't have presence, special casing zero likely also makes sense.
```swift
            return nil
        }
    },
    .singularBool({ $0.printUnknownFields }, fieldNumber: 9),
```
Would it help to combine this with the `_protobuf_nameMap` information? Superficially, it seems like it would only require adding name information to these items. But the nameMap is a dictionary in order to speed up field number => field name conversions, and it's not clear whether it would make any sense to use a dictionary here.
A dictionary that used field numbers for keys and mapped to enums that carried field name and lookup functions could provide much of what's needed here. That would not obviously encode oneOf data, but I wonder if we really need oneOf to be encoded here? (We could query each item of the oneOf separately and take whichever one was present.) That feels like it might be slower for some things, but I wonder what the real impact would be?
> Would it help to combine this with the `_protobuf_nameMap` information? Superficially, it seems like it would only require adding name information to these items
Yes, absolutely. I think this design allows us to remove the `_protobuf_nameMap`, and probably speed up JSON encoding as a result since we can get rid of the dictionary lookup. I didn't include it in this PR to keep it focused, but I think that absolutely makes sense.
We could have `_fields` be an OrderedDictionary where the keys are the field numbers, which would completely remove the need for `_protobuf_nameMap`, allowing us to traverse fields in an ordered way while still being able to look up a field name from a given field number for decoding purposes.
Hmmm.... So many trade-offs! OrderedDictionary would have additional runtime memory overhead, of course.
I know the compiler folks are working on techniques to make the compiler better at coding constant arrays in TEXT at compile-time rather than building them in dirty memory at runtime. But to take advantage of that, we'd need to simplify that array construction a lot. That's somewhat speculative, but would suggest that it would be worth building the Fields array as a simple list of constant enum cases, eliminating as much indirection as possible.
The lookup is just for parsing, right? With a low number of fields, walking a list likely costs about the same as searching a dictionary. And we might even be able to tweak things to optimize for an ordered input as well, i.e. a table-driven parse might be something to also look at doing, reducing code size even more.
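A sketch of that linear-walk lookup, with hypothetical field numbers and names (not the PR's actual table):

```swift
// Sketch: field-number lookup by linear scan over a small sorted array
// instead of a Dictionary. Entries are illustrative only.
let fields: [(number: Int, name: String)] = [
    (number: 1, name: "seconds"),
    (number: 3, name: "requestedOutputFormat"),
    (number: 4, name: "messageType"),
]

func lookup(_ fieldNumber: Int) -> String? {
    // For the small field counts typical of a message, a linear walk is
    // competitive with hashing, with no Dictionary allocation at all.
    for f in fields where f.number == fieldNumber { return f.name }
    return nil
}
```

A binary search over the same sorted array would be the natural upgrade once the field count crosses some threshold.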
> OrderedDictionary would have additional runtime memory overhead, of course
Yeah, and I'm honestly not very familiar with it, I just know that it exists. There are other ways to do this, like lazily constructing a static dictionary that we store alongside the fields array that maps field numbers to `Field<>`, or even just maps to an index in the field array to save space, but basically something similar to `_NameMap` could be done.
> but would suggest that it would be worth building the Fields array as a simple list of constant enum cases, eliminating as much indirection as possible.
This was my original design!
```swift
enum Field<M: Message> {
    case singularFloat(getter: (M) -> Float, isDefault: (M) -> Bool)
    // ...
}
```
However, modeling sub-messages, enums, and maps proved to be difficult, which required creating abstractions for indirection using protocols:
```swift
case singularMessage(getter: (M) -> any Message, isUnset: (M) -> Bool)
// ...

private extension Message {
    func traverseAsField<V: Visitor>(visitor: inout V, fieldNumber: Int) throws {
        try visitor.visitSingularMessage(message: self, fieldNumber: fieldNumber)
    }
}
```
Also, the switch statement over the enum turned out to be more expensive than I thought (the switch itself was slow, and the compiler was deciding to create copies of the enum case and its content during the switch for some reason) -- more expensive than the protocol dispatching I'm doing now with `FieldItem`. My original enum-based implementation was on the order of 10x-30x slower than the implementation of `traverse` on the main branch, due to paying both the cost of the slow enum switch statement and a dynamic dispatch when calling `traverseAsField` for enums, messages, and maps.
> ... the compiler was deciding to create copies of the enum case and its content during the switch ...
Ah. Yeah, I know that problem well. 😞
Sources/SwiftProtobuf/Message.swift (outdated)
```swift
for field in Self._fields {
    try field.traverse(message: self, visitor: &visitor)
}
try unknownFields.traverse(visitor: &visitor)
```
How would that work? Would we just not visit unknown fields at all in the implementation of traverse then? Or would this maybe be an argument to `traverse` to decide the order of traversal?
My idea: `traverse` wouldn't visit unknown fields at all. The encoders would be changed to use `traverse` to visit the regular fields and request the unknown fields directly from the message.
@tbkka rather than visitors having to collect it, does it make sense to change up the visit API at the same time so a visitor can ask to go forward or backwards? Then unknowns can always be "last" independent of the field order traversed. We could even support flags to say if they want extensions included/skipped and/or unknowns first/last/skipped, to provide much more customization.
> traverse wouldn't visit unknown fields at all
Could that break existing implementations of `Visitor` that are expecting a call to `visitUnknown`? The compiler will not complain even if `visitUnknown` is removed from the protocol, which could cause bugs for those who are expecting it to be called on their visitor implementations.
Here's one rough idea of how `Visitor` might control the behavior:
```swift
// Note: the mode enums live outside the protocol, since Swift doesn't allow
// nested type declarations inside protocols.
public enum UnknownMode {
    // Visit any unknowns before the fields/extensions.
    case first
    // Visit any unknowns after the fields/extensions.
    case last
    // Don't visit any unknowns.
    case skip
}

public enum FieldOrder {
    // Visit fields/extensions in increasing number order.
    case forward
    // Visit fields/extensions in decreasing number order.
    case reverse
}

public protocol Visitor {
    static var unknownMode: UnknownMode { get }
    static var fieldOrder: FieldOrder { get }
    // ...
}

// Defaults
extension Visitor {
    public static var unknownMode: UnknownMode { return .last }
    public static var fieldOrder: FieldOrder { return .forward }
}
```
And then the default `traverse()` for `Message` could be something like:
```swift
public func traverse<V: Visitor>(visitor: inout V) throws {
    let unknownMode = V.unknownMode
    if unknownMode == .first {
        try unknownFields.traverse(visitor: &visitor)
    }
    switch V.fieldOrder {
    case .forward:
        for field in Self._fields {
            try field.traverse(message: self, visitor: &visitor)
        }
    case .reverse:
        for field in Self._fields.reversed() {
            try field.traverse(message: self, visitor: &visitor)
        }
    }
    if unknownMode == .last {
        try unknownFields.traverse(visitor: &visitor)
    }
}
```
Or something like that.
Should this be controlled by the visitor, or just be an argument to `traverse`?
I was imagining this:
```swift
struct TraversalOptions {
    public enum UnknownMode {
        // Visit any unknowns before the fields/extensions.
        case first
        // Visit any unknowns after the fields/extensions.
        case last
        // Don't visit any unknowns.
        case skip
    }

    public enum FieldOrder {
        // Visit fields/extensions in increasing number order.
        case forward
        // Visit fields/extensions in decreasing number order.
        case reverse
    }

    struct AdditionalVisits: OptionSet { // needs a better name
        let rawValue: Int
        static let visitDefaultFields = AdditionalVisits(rawValue: 1 << 0)
        static let visitUnsetFields = AdditionalVisits(rawValue: 1 << 1)
    }

    var unknownMode: UnknownMode = .last
    var fieldOrder: FieldOrder = .forward
    var additionalVisits: AdditionalVisits = []

    static let defaultOptions = TraversalOptions()
}

func traverse<V: Visitor>(visitor: inout V, options: TraversalOptions = .defaultOptions) throws
```
This is sorta what I was originally thinking, but since the `visit*Message()` impls usually have to recurse, if they forget to forward the `options`, it would change to the `.defaultOptions`, which could be a subtle bug.
I also wondered if a given Visitor will always use the same options, in which case making them properties of the Visitor might make sense.
But I'm not tied to one vs. the other; the potential danger of having to pass through the options does worry me a little.
In practice, visitors always traverse in the same way, and call the traverse method from just one of a very few places in their source. So I would suggest using either arguments or separate functions for the different traversal styles. In practice, I'm pretty sure that reverse traversal will always visit unknown fields first, and forward traversal will always visit last. The only exception is -- if I remember correctly -- JSON doesn't actually support unknown fields at all.
If I'm remembering correctly, I think Go might have support to traverse fields within a message in random order so folks can't bake assumptions into the binary encoding (they might also space-pad JSON for the same sort of reasons). That could be exposed in the public EncodingOptions to let folks pick whether they want the default (non-deterministic) or deterministic ordering, like we have for `map<>`s. So, if that's something we ever thought we might also want to support, that might be a reason not to go with two different traversal methods, but instead stick with some way to control it via options.
Sources/SwiftProtobuf/Message.swift (outdated)
```swift
    /// normal `Equatable`. `Equatable` is provided with specific generated
    /// types.
    func isEqualTo(message: Message) -> Bool

    /// Provides `Field` information for this `Message` type used to provide a default implementation of `traverse`
    static var _fields: [Field<Self>] { get }
```
In working with the Swift standard library team, we've learned that generic types are not free at runtime -- in particular, the runtime typically ends up having to actually create a runtime representation for each generic type. So `Field<M>` will end up producing a separate runtime type representation for every message type `M` (much the same as if you had created a type `Field_MessageName` instead). So it can make sense to combine or otherwise minimize the number of different generic types in a library where memory usage might be an issue.
I'm not sure there's a good alternative in this case: `Field<>` does seem like it needs to be generic over the message type.
Agreed, I think the reified generics have a negative impact on binary size. I don't know what kind of impact they have on memory, but it's probably a negative one too.
I did have a different approach I explored where I created a non-generic and less safe version of `Field`:
```swift
public struct UnsafeField {
    private let item: UnsafeFieldItem

    func unsafeTraverse<M: Message, V: Visitor>(message: M, visitor: inout V) throws {
        try item.unsafeTraverse(message: message, visitor: &visitor)
    }

    public static func singularFloat<M: Message>(_ message: M.Type, _ getValue: @escaping (M) -> Float, fieldNumber: Int, defaultValue: Float = 0) -> Self {
        Self(item: UnsafeSingularFloatField(fieldNumber: fieldNumber, getValue: getValue, isDefault: { getValue($0) == defaultValue }))
    }
}

private protocol UnsafeFieldItem {
    func unsafeTraverse<M: Message, V: Visitor>(message: M, visitor: inout V) throws
}

private protocol UnsafeGetterProtocol<ReturnValue> {
    associatedtype ReturnValue
    func invoke<M: Message>(message: M, then action: (ReturnValue) throws -> Void) throws
}

private struct UnsafeGetter<Msg: Message, ReturnValue>: UnsafeGetterProtocol {
    let getter: (Msg) -> ReturnValue
    let shouldSkip: (Msg) -> Bool

    func invoke<M: Message>(message: M, then action: (ReturnValue) throws -> Void) throws {
        let msg = unsafeBitCast(message, to: Msg.self)
        if !shouldSkip(msg) {
            return try action(getter(msg))
        }
    }
}

private struct UnsafeSingularFloatField: UnsafeFieldItem {
    private let fieldNumber: Int
    private let getter: any UnsafeGetterProtocol<Float>

    fileprivate init<M: Message>(fieldNumber: Int, getValue: @escaping (M) -> Float, isDefault: @escaping (M) -> Bool) {
        self.fieldNumber = fieldNumber
        self.getter = UnsafeGetter(getter: getValue, shouldSkip: isDefault)
    }

    func unsafeTraverse<M: Message, V: Visitor>(message: M, visitor: inout V) throws {
        try getter.invoke(message: message) {
            try visitor.visitSingularFloatField(value: $0, fieldNumber: fieldNumber)
        }
    }
}
```
This had identical runtime performance to the `Field<>` solution. However, I think this ultimately still created the same number of generic types and just pushed the cost to `UnsafeGetter`, and I didn't see a noticeable difference in binary size impact.
The `UnsafeGetter` design was done this way to allow me to do `unsafeBitCast`, which doesn't have a runtime penalty compared to doing an `as!`. However, if the goal is to remove as many generics as possible, `UnsafeGetter` could be done like this:
```swift
private struct UnsafeGetter<ReturnValue>: UnsafeGetterProtocol {
    let invokeIfShouldNotSkip: (any Message, (ReturnValue) throws -> Void) throws -> Void

    init<M: Message>(getter: @escaping (M) -> ReturnValue, shouldSkip: @escaping (M) -> Bool) {
        invokeIfShouldNotSkip = { msg, callback in
            let message = msg as! M
            if !shouldSkip(message) {
                try callback(getter(message))
            }
        }
    }

    func invoke<M: Message>(message: M, then action: (ReturnValue) throws -> Void) throws {
        try invokeIfShouldNotSkip(message, action)
    }
}
```
```swift
private struct UnsafeSingularFloatField: UnsafeFieldItem {
    fileprivate init<M: Message>(...) { ... }
    func unsafeTraverse<M: Message, V: Visitor>(message: M, visitor: inout V) throws { ... }
}
```
This kind of design is very tempting, actually. This avoids the generic type (avoiding the need to reify a lot of runtime type information) by pushing the generic parameters down to the functions. The perf impact here is from looking up the type arguments for those functions. If the function is called from a separately-compiled module, the call will end up looking like this:
```
M = runtime_type_lookup("name of message type")
V = runtime_type_lookup("name of visitor type")
unsafeTraverse(self, message, visitor, M, V)
```
Of course, if the `M` and `V` arguments are available already, then repeat lookups can be avoided. For example, in your code above, it looks like there's a sequence of `unsafeTraverse()` functions that all take the same `M` and `V` type arguments, so it should be possible for the compiler to just pass those through.
The lookups can also be avoided if the function can be inlined into a place where the type arguments are constants. (This is one reason the standard library is so aggressive about inlining everything, but the cost can be high, since heavy inlining can prevent you from changing the code in the future.)
> let msg = unsafeBitCast(message, to: Msg.self)

Does this actually work? I somehow thought `unsafeBitCast` was only for pointer/object types?
> Does this actually work

Yes, it works, but only if the types have the exact same size in memory. It does not work if `message` is an `Any`, which is why `UnsafeGetter` captured `Msg: Message` and the function captures `M: Message`. Otherwise, the only way to do the cast is with an `as?`/`as!`.
Have you tested this for cases where the message has indirect storage? In that case, the message has a reference-counted pointer to the class that actually stores the data, and I wasn't aware it was safe to use `unsafeBitCast` to copy a struct that has reference-counted pointers.
I'm not sure, I'd have to dig up my old branch and see. Either way, the approach that used `unsafeBitCast` is probably not the right way to go, since in the end it created the same number of generic classes due to `UnsafeGetter<Msg: Message, ReturnValue>` still capturing the message type. If the goal is reducing the number of generics, then the second approach I posted with `struct UnsafeGetter<ReturnValue>` would be better, even though the `as!` is less performant.
I want to reduce memory usage without hurting performance "too much." How much slower is your second approach?
The difference is significant: around 2x slower with `as!`. I gave this a try last night and it was pretty straightforward to build. The end result was a ~10% binary size reduction in my test app that contained 100 messages with 50 fields each.
Force-pushed from 04f0eaf to 07aec96.
Just pushed 2 updates: …
@tbkka Do you have any outstanding feedback for me? Anything I can do to help push this through?
Compared your branch to … What would we need to do to also support decoding with this approach? At minimum, we'd need a way to quickly look up field info from field number and/or name. And we'd need to support setting a field, not just reading it.
Encoding performance is still pretty proportional to the number of fields, so we haven't significantly increased our fixed costs.
@mrabiciu Could you merge main into your branch? (Or rebase, whichever you're happier with right now.) That will make some of my experiments a little simpler.
Hmmm... TestJSONPerformance/testEncoding_manyEnumMapsEncoding_shouldBeUnder1s is almost 4s on my machine right now.
Ah. Binary encoding does two traversals -- one to measure the total size and another to actually serialize things -- which explains this. Eliminating this would require a resizable buffer that we could expand as necessary during the regular serialization. I've long thought such expansion would actually be slower than the second traversal, but I'm starting to wonder if that's actually true or not. Hmmm.... That may require some future experimentation.
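For reference, a rough sketch of the one-pass alternative: varint-encode directly into a growable buffer, next to the separate size computation the two-pass scheme needs. This is illustrative only, not SwiftProtobuf's actual encoder:

```swift
// Two-pass scheme needs a sizing function run over every field first.
func varintSize(_ value: UInt64) -> Int {
    var v = value
    var n = 1
    while v >= 0x80 { v >>= 7; n += 1 }
    return n
}

// One-pass scheme: append bytes as we go; Array's doubling growth makes
// each append amortized O(1), which is what the "resizable buffer" bets on.
struct OnePassEncoder {
    var out: [UInt8] = []

    mutating func putVarint(_ value: UInt64) {
        var v = value
        while v >= 0x80 {
            out.append(UInt8(v & 0x7F) | 0x80)   // low 7 bits + continuation bit
            v >>= 7
        }
        out.append(UInt8(v))
    }
}
```

The open question in the thread is exactly whether the occasional buffer regrowth ends up cheaper than the full second traversal.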
I'm seeing some retain/release traffic during the traversals. I'm not sure where that's coming from. I'm also seeing copy and destroy of the Field objects, which might be due to some local changes I'm experimenting with.
Yes, I saw the increase too; however, after experimenting with implementing …
I haven't familiarized myself with decoding yet, but I think having …
Yep, I can do that!
Do you have any instructions for how I can run and profile this?
I'm still not sure we need to take this all the way to a Dictionary. I'd think most messages have few enough fields that a linear search (or maybe a binary search if the list is over a certain size) wouldn't be that bad (and might be better in performance and memory tradeoffs compared to having to make the Dictionary). We might be able to provide an option to indicate whether the input is expected to be in order. When in order, you could keep a current index, and after each field, just walk in order from that index to find the next field. For Binary and TextFormat, where things should be well ordered, simply walking in order to the next field is likely going to be much better. It would take a really sparse message with lots of fields for that to be bad. We also might be able to store some extra flags about the table; say, if there are few gaps in the field numbers, we could null-pad things so that we could just index the field list by the field number and make all those lookups O(1).
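A sketch of that "expected in order" idea: remember where the last match was and scan forward from there, wrapping around once for out-of-order fields (hypothetical table, not the PR's code). For inputs whose fields arrive in ascending number order, as binary protobuf output usually does, each lookup is amortized O(1):

```swift
// Stateful lookup over a sorted field-number table, optimized for inputs
// that arrive in ascending field-number order.
struct OrderedFieldTable {
    let numbers: [Int]       // sorted field numbers
    private var cursor = 0   // index just past the last match

    init(numbers: [Int]) { self.numbers = numbers }

    mutating func index(of fieldNumber: Int) -> Int? {
        var scanned = 0
        while scanned < numbers.count {
            // Start at the cursor and wrap, so ordered input never rescans
            // the whole table, while out-of-order input still succeeds.
            let i = (cursor + scanned) % numbers.count
            if numbers[i] == fieldNumber {
                cursor = i + 1
                return i
            }
            scanned += 1
        }
        return nil
    }
}
```

The null-padded variant mentioned above would instead index `numbers[fieldNumber - 1]` directly, trading a little memory for unconditional O(1) lookups.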
You can use …
Force-pushed from 07aec96 to 0ee2fba.
I've been profiling. I noticed we do …
That seems like a reasonable guess. Accessing data on the storage class does require retaining that class object. Another approach might be to combine the fetch and default handling, so that you make one function call and get both pieces of information at once, with one round-trip down to the storage class instead of two.
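A sketch of that combined fetch-and-default-check shape (hypothetical names; `nil` meaning "default, skip this field"), so the storage class is only reached once per field:

```swift
// Hypothetical descriptor whose getter fuses "fetch" and "is default?":
// nil means the field holds its default and should be skipped.
struct CombinedField<M> {
    let valueIfSet: (M) -> Int64?
}

// Hypothetical message backed by an indirect storage class.
struct StorageBackedMessage {
    final class Storage { var count: Int64 = 0 }
    var storage = Storage()

    static let countField = CombinedField<StorageBackedMessage> { m in
        // Single trip into the storage class: one retain, one fetch,
        // and the default comparison happens on the local copy.
        let v = m.storage.count
        return v == 0 ? nil : v
    }
}
```

Compared with separate `shouldSkip` and `getter` closures, the caller makes one call and the storage object is retained once instead of twice.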
The functional tests are passing now, which is promising. Failures above seem to only be because this branch has not re-generated the reference files, so the plugin tests are not generating the expected output. Please don't re-generate until we're ready to land this; that creates a huge number of differences that make it a little hard to see the actual code differences.
Good point. Dictionary has the drawback that it's hard to compile it down to static text. Where possible, it would be nice to avoid Dictionary and stick with simple Arrays. A few options:
The C++ generation these days use some tables for a bunch of things also, so there might be things to draw them. One thing I believe they do is have a header on the table, we could model that as a second static, or maybe make the static an enum with two cases: (a) packed array/table (b) sparse array/table. If for (a) any unused field number gets a marker that is is unused, and we can do direct field number to array indexing (minus one since there is no zero field number). For (b), we do the binary search over the list. And if we decided the do the map for field to offset, we could always compute that at runtime the first time we need it. |
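The packed/sparse header idea could be modeled as an enum with two cases, roughly like the sketch below. The names are invented for illustration: a packed table direct-indexes by `fieldNumber - 1` with `nil` marking unused numbers, while a sparse table keeps a sorted list and binary-searches it.

```swift
// Invented sketch of the two-case field table described above.
enum FieldTable {
    case packed([String?])                       // entry at fieldNumber - 1
    case sparse([(number: Int, name: String)])   // sorted by field number

    func name(ofField number: Int) -> String? {
        switch self {
        case .packed(let entries):
            // O(1): direct index, nil for gaps or out-of-range numbers.
            let idx = number - 1
            guard idx >= 0 && idx < entries.count else { return nil }
            return entries[idx]
        case .sparse(let entries):
            // O(log n): binary search over the sorted list.
            var lo = 0
            var hi = entries.count - 1
            while lo <= hi {
                let mid = (lo + hi) / 2
                if entries[mid].number == number {
                    return entries[mid].name
                } else if entries[mid].number < number {
                    lo = mid + 1
                } else {
                    hi = mid - 1
                }
            }
            return nil
        }
    }
}
```

The generator could pick the case per message based on how dense the field numbers are, and any field-to-offset map could still be computed lazily at runtime on first use.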
Haven't thought this through, but I wonder if it would help at all if the visitor could visit the indirect storage class directly when there was one, instead of always going through the façade struct. I'm not sure what that would entail... Hmmm... (I've experimentally combined my reverse binary encoder with this and it seems to be working pretty well.)
I've long wondered about making our generated traverse/decode move into the Storage class (when it exists); it means moving the unknowns/extensions there also, and then we could skip the need for the extended-lifetime things. But if we move to table driven, I'm not sure how that would work out. I guess we could have a default impl for messages without Storage that directly uses the table there, but we'd generate a stub for storage that forwards to the Storage class, and provide a
Yeah I had the same idea, I'm in the middle of prototyping that now.
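The "move traverse onto the Storage class" shape being discussed could look roughly like this. All names are invented for illustration, and the real version would be table driven rather than hand-written per field; the key point is that the struct's traverse becomes a one-line forward, so field access happens on the class instance itself.

```swift
// Loose sketch: the generated struct forwards traversal into its
// storage class, keeping the class alive for the whole walk without
// per-field round trips through the facade struct.
struct ExampleMessage {
    final class _StorageClass {
        var id: Int32 = 0
        var name: String = ""

        // The real table-driven walk would live here, next to the fields.
        func traverse(_ visit: (_ fieldNumber: Int, _ value: Any) -> Void) {
            if id != 0 { visit(1, id) }
            if !name.isEmpty { visit(2, name) }
        }
    }

    var _storage = _StorageClass()

    // Generated stub: one call, storage stays alive for the duration.
    func traverse(_ visit: (_ fieldNumber: Int, _ value: Any) -> Void) {
        _storage.traverse(visit)
    }
}
```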
I was thinking the same thing. The dictionary could also be generated at runtime on first access of
I think this could be possible. The Storage class may need to become public in order for us to have public
I've tried all of the above, but it did not appear to improve the performance of the tests. I'm still seeing a lot of retain/release traffic despite doing something like this:

```swift
struct SomeMessage {
  class _StorageClass {
    var _singularInt32: Int32 = 0
    // ...

    static let _fields: [Field<Unmanaged<_StorageClass>>] = [
      .singularInt32({ $0.takeUnretainedValue()._singularInt32 }, fieldNumber: 1),
      // ...
    ]
  }

  private var _storage = _StorageClass()

  func traverse<V: Visitor>(visitor: inout V) throws {
    let unmanagedStorage = Unmanaged.passUnretained(_storage)
    for field in _StorageClass._fields {
      try field.traverse(unmanagedStorage, visitor: &visitor)
    }
  }
}
```

I'm not really sure why, since I thought
As a side note, my original goal with this PR was to enable traversal of default and unset fields for the v2 release. Would you be opposed to me opening a second PR where I add a more naive version of default and unset field traversal that uses the existing traverse implementation? I'm happy to continue experimenting with this refactor, but I would also like to unblock my project, which depends on having default fields in JSON.
Not at all. It's possible we would have objections to the specific implementation, but I'd have to see it first. Regardless, it's always worth discussing incremental improvements -- we don't have to get it all perfect in a single shot. 😁
Do you all use
Once my PR is merged, my intention is to either pin to the specific commit hash that includes my PR, or, if main happens to be unstable at that commit, apply my PR as a Bazel patch on top of whatever the latest stable 1.x version is until 2.x is released.
I hope there's still activity on this; I'm working with a server that requires a request to have all of its options filled out, even if their values are
This change is somewhat larger, and got paused for some issues around it. But it sounds like what you are interested in is in #1526. However, due to how the library has to change, it won't be in a 1.x release when done; you'll likely have to wait for a 2.0 release.
Summary
This diff changes how `traverse` is implemented to allow the implementation to be internal to SwiftProtobuf. This allows the mechanism of traversal to be modified in the future without making breaking changes to the API. For example, we can add `VisitorOptions` to the API, or introduce a `NamedVisitor` type, while remaining backwards compatible. Additionally, this can be expanded upon to get rid of `_ProtoNameProviding`, since name information can be included in `Field`, which can in turn be used to speed up JSON serialization.

This comes with a slight hit to runtime performance, and little impact to bundle size.
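The `VisitorOptions` mentioned here is not yet defined in this PR; a hypothetical shape for it (invented names, not actual SwiftProtobuf API) might be a simple option set that traversal consults before skipping a field:

```swift
// Sketch of what a VisitorOptions argument to traverse might carry.
struct VisitorOptions: OptionSet {
    let rawValue: Int

    /// Visit fields even when they hold their default value.
    static let visitDefaultValuedFields = VisitorOptions(rawValue: 1 << 0)
    /// Visit unset optional fields as explicitly absent entries.
    static let visitUnsetFields = VisitorOptions(rawValue: 1 << 1)
}
```

A generated `traverse(visitor:options:)` could then check `options.contains(.visitDefaultValuedFields)` before eliding a default-valued field.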
I intend to use this change as the basis for adding a `VisitorOptions` argument to `traverse`, which will allow the visitor to visit fields that are set to the default value.

This is a pretty massive change, but mostly because it required me to run `make regenerate`, and because `Message` now has to be expressed as `any Message` due to the Self constraints.

The important files to look at are:

- `Sources/SwiftProtobuf/Field.swift`
- `Sources/SwiftProtobuf/Message.swift`
- `Sources/protoc-gen-swift/MessageGenerator.swift`
- `Sources/protoc-gen-swift/MessageFieldGenerator.swift`
- `Sources/protoc-gen-swift/OneOfGenerator.swift`
Testing
Unit tests
All unit tests pass
Runtime Performance
I tested with various different Message configurations. Overall, this solution does have a slight runtime performance hit. I also played with using enums with associated values and keypaths instead; however, that produced worse performance than what I have in this request.
Release Builds
Debug Builds
Bundle size
I saw little to no impact on the bundle size with this change. In some cases the bundle size increased slightly, and in some other cases the bundle size decreased. It seems to be dependent on the specific shape of the Message types being built.