Switch back to protobuf.js; add and integrate CompactPlayerFrames support #1
After some investigation, I was able to create a new data representation for PlayerMove messages that compacts the data by about 85%:
(data taken from the 11GB replay recorded from the large lobby test we ran on Dunk's stream)
During integration, I decided to move back to protobuf.js for proto serialization/deserialization, given its significant performance benefits over protobuf-es. This makes the code more awkward, but it seems a worthwhile tradeoff. This choice was driven mostly by observing that the last test lobby's CPU usage was significantly higher than I had hoped, which is likely attributable to the serialization/deserialization overhead of "all the messages except movement updates". The difference between the actual CPU usage and that of the synthetic load test (which only generated PlayerMove messages) was significant, so I'm hoping to see a further reduction without low-level optimization for other message types.
Encoding details
This is accomplished by selecting very specific representations of the data that take advantage of its inherent structure:
https://github.com/Noita-Together/lobby-server/compare/myndzi/compact-player-moves#diff-8c4147a83129b7bb2fd2b0dd8bd3cfeba7f150820c07f9b7b911c43b993e6f87R171-R177
The NT app sends PlayerMove updates as a list of ~15 (one per 2 frames, once every half-second), which consist of:
Of these fields, `x` and `y` are the player's position as a fractional value of "pixels"; `arm_r` is the arm's rotational position in radians; `arm_scale_y` and `scale_x` are signed integers indicating the direction (and magnitude) of the arm and player sprite (-1 for left, 1 for right; they can take values other than 1, but we don't currently use that functionality, so support is not included here; it is possible this will break if some mods make player sprites "fat", though); `anim` and `held` are unsigned integers representing the currently playing animation and the slot of the held wand respectively. (NB: it should be possible to represent held flasks too, but the NT mod does not currently send that information.)

To optimize each of these kinds of data, I've done the following:
- `x`/`y`: these values will commonly be very similar to each other. I encode a float for the first value in the list, and then a sint32 giving the delta from the prior value to the current value. The delta is multiplied (on encoding) and divided (on decoding) by a factor to retain a configurable amount of precision, which I've selected here to be 1 decimal place. Values are rounded to the nearest integer, and each delta is calculated from the "rounded" result so that the error does not accumulate across a sequence, leading to a maximum error in position of about 0.05 pixels (verified empirically against the test data). By using a `repeated sint`, this will be serialized as a packed list of zigzag-encoded varints, which are represented by a single byte when the absolute value of the encoded number is small (fits in 6 bits), two bytes when it fits in 13 bits, etc. Packed encoding frames the list with a single field id and a length value, followed by a list of raw varints.
- `arm_r`: since these are floating point values representing radians, they can be converted to an integer fraction with an implied denominator. This lets us bound the integer size according to how many bytes we want it to take on the wire. For example, if we want rotations to take exactly one byte, we can represent up to 128 (0-127) values. The larger the denominator, the more fine-grained the rotational positions we can encode. Since this is a very subtle visual difference, I've chosen to encode as a single byte (rather than the 4 bytes taken by encoding a float32); this leads to an observed maximum error of around 0.05 as well. Decoding is again the inverse of encoding: we just divide the value by the denominator and multiply it by 2pi to get radians.
- `arm_scale_y`/`scale_x`: since these are essentially booleans, I've encoded them as a single 32-bit value / bitfield. This means that encoding will fail if we try to encode more than 32 bits (the JavaScript max safe integer is 53 bits, but when you use bitwise operations, everything is cast to 32-bit ints). We're fine here since we're only encoding 15 values at the moment. Encoded values will tend to be 3 bytes for all 15 frames (rather than 15 bytes).
- `anim`/`held`: since these don't change frequently, they're often going to be either empty (id = 0) or change once per half-second if at all. I've encoded them as a repeated `[index, value]` tuple, where the index is the position in the decoded frame list at which the change occurred and the value is the new value.

In addition to changing how the data is represented, I've also flattened the message representation itself into a single proto message, allowing me to take advantage of packed repeated scalar values and avoid a bunch of extra message nesting overhead. cPlayerMove encodes a CompactPlayerFrames message directly with no wrapping. sPlayerMoves encodes a new message type, ServerPlayerMoves, which serves as a wrapper for
`repeated CompactPlayerFrames`, since you can't have a repeated message in a oneof. This is to allow for server-side batching of PlayerMove updates by concatenation of the underlying Buffer objects (not yet implemented, but will be implemented before this branch is complete).

Requires Noita-Together/noita-together#140
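
As a rough illustration of the four encodings described under "Encoding details", here is a minimal TypeScript sketch. It is not the code from this branch: every function name and constant is hypothetical, the 128-step denominator for `arm_r` is taken from the one-byte example above, angles are assumed to be normalized into [0, 2pi), a set bit is assumed to mean -1 (left), and the protobuf serialization itself is left out entirely.

```typescript
// Illustrative sketch only: every name and constant below is hypothetical.
const POSITION_SCALE = 10; // 1 decimal place of precision
const ARM_R_STEPS = 128; // assumed denominator: values 0-127 fit in one varint byte

// x / y: first value as-is, then scaled integer deltas. Each delta is taken
// against the value the decoder will have reconstructed, so error stays bounded.
function encodeDeltas(values: number[]): { first: number; deltas: number[] } {
  if (values.length === 0) throw new Error("need at least one frame");
  const deltas: number[] = [];
  let prev = values[0];
  for (let i = 1; i < values.length; i++) {
    const delta = Math.round((values[i] - prev) * POSITION_SCALE);
    deltas.push(delta);
    prev += delta / POSITION_SCALE;
  }
  return { first: values[0], deltas };
}

function decodeDeltas(first: number, deltas: number[]): number[] {
  const out = [first];
  let prev = first;
  for (const delta of deltas) {
    prev += delta / POSITION_SCALE;
    out.push(prev);
  }
  return out;
}

// arm_r: radians -> integer numerator over an implied denominator.
function encodeArmR(radians: number): number {
  const TWO_PI = 2 * Math.PI;
  const turns = (((radians % TWO_PI) + TWO_PI) % TWO_PI) / TWO_PI; // in [0, 1)
  return Math.round(turns * ARM_R_STEPS) % ARM_R_STEPS;
}

function decodeArmR(step: number): number {
  return (step / ARM_R_STEPS) * 2 * Math.PI;
}

// arm_scale_y / scale_x: one bit per frame (assumed: bit set means -1, i.e. left).
function encodeSigns(signs: number[]): number {
  if (signs.length > 32) throw new Error("at most 32 frames fit in the bitfield");
  let bits = 0;
  signs.forEach((sign, i) => {
    if (sign < 0) bits |= 1 << i;
  });
  return bits >>> 0; // reinterpret the 32-bit result as unsigned
}

function decodeSigns(bits: number, count: number): number[] {
  const out: number[] = [];
  for (let i = 0; i < count; i++) out.push((bits >>> i) & 1 ? -1 : 1);
  return out;
}

// anim / held: record only [index, value] where the value changes (starting from 0).
type Change = [number, number];

function encodeChanges(values: number[]): Change[] {
  const changes: Change[] = [];
  let current = 0;
  values.forEach((value, i) => {
    if (value !== current) {
      changes.push([i, value]);
      current = value;
    }
  });
  return changes;
}

function decodeChanges(changes: Change[], count: number): number[] {
  const out: number[] = [];
  let current = 0;
  let next = 0;
  for (let i = 0; i < count; i++) {
    if (next < changes.length && changes[next][0] === i) {
      current = changes[next][1];
      next++;
    }
    out.push(current);
  }
  return out;
}
```

In this sketch each delta is computed against the value the decoder will reconstruct rather than against the previous raw input, so the position error stays within half a step (0.05 at one decimal place) per frame instead of growing along the list. The change list and the sign bitfield both need the frame count on decode, which in practice could be derived from the length of the position list.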