Removing bounds checks when serializing commands/frames. #1030
Might want to take a look at this @bollhals :)
Thanks! I will. Short question:
While I agree on the writing part (client -> server), I'm not sold on the reading part (server -> client). I do fancy performance, but for the receiving side this sounds risky.
[Benchmark]
- public int ShortstrWriteEmpty() => WireFormatting.WriteShortstr(_buffer.Span, string.Empty);
+ public int ShortstrWriteEmpty() => WireFormatting.WriteShortstr(ref _buffer.Span.GetStart(), string.Empty, _buffer.Length);
Also, we should pass the benchmark arguments from fields rather than as constants, so the benchmark does not benefit from inlining. That measures more accurately how the methods are used in the actual code.
(My assumption here is that, due to inlining, the JIT sees that the parameter is string.Empty and therefore throws away most of the WriteShortstr method. Whereas in, e.g., the BasicPublish.WriteArgumentsTo() method you access the field, which was set from the ctor. So even if the value is string.Empty, it cannot throw away parts of the method.)
At least this should be done for all constants. In newer .NET (5.0+), even static readonly fields can be treated as invariant, which lets the JIT generate better code than it should for this performance test.
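To illustrate the concern, here is a minimal sketch of the two benchmark shapes. The class, the `_value` field and the stand-in `WriteShortstr` are hypothetical and are not the library's code. The `[Benchmark]` attributes are left out so the sketch compiles without the BenchmarkDotNet package.

```csharp
using System;
using System.Text;

public class ShortstrBenchmarkSketch
{
    private readonly byte[] _buffer = new byte[256];

    // Hypothetical mutable instance field: the JIT cannot assume its value,
    // which mirrors a command field that is set from a constructor.
    private string _value = string.Empty;

    // Constant argument: once WriteShortstr is inlined, the JIT may
    // specialise the body for string.Empty and skew the measurement.
    public int ShortstrWriteEmptyConstant() => WriteShortstr(_buffer, string.Empty);

    // Field argument: same result, but the value is only known at run time.
    public int ShortstrWriteEmptyField() => WriteShortstr(_buffer, _value);

    // Stand-in for WireFormatting.WriteShortstr (an assumption, not the real
    // method): one length byte followed by the UTF-8 bytes of the string.
    private static int WriteShortstr(Span<byte> span, string val)
    {
        int count = Encoding.UTF8.GetBytes(val.AsSpan(), span.Slice(1));
        if (count > 255)
        {
            throw new ArgumentOutOfRangeException(nameof(val), "A short string holds at most 255 bytes.");
        }

        span[0] = (byte)count;
        return count + 1;
    }
}
```

Both methods return the same value for an empty string. The difference only shows up in the generated code, which is what the benchmark is supposed to measure.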
Currently it doesn't fail, but it might read garbage data (in the checks I've done, all bytes came back as 0; I haven't been able to see anything else come up). I can add simple checks when reading to make sure we have enough remaining bytes when trying to read frames, in case we receive malformed frames from the server. It should still be faster than all the slicing.
After doing some thinking, doing all the checks on the reads is pretty complicated, so I'll change this PR to apply to just writes. That'll simplify the PR a bit and limit it to safer code. I'll revisit the reads later.
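For reference, the "simple checks when reading" idea could be sketched as a single length check up front, followed by unchecked reads. This is only an illustration under assumptions: the class and method names are hypothetical, the 9-byte layout (8-byte delivery tag plus one flags byte) is taken from the BasicAck example in this thread, and the PR itself leaves the reads unchanged.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class CheckedReadSketch
{
    // Assumed layout: 8-byte big-endian delivery tag + 1 flags byte.
    private const int BasicAckArgumentsLength = 9;

    public static bool TryReadBasicAckArguments(ReadOnlySpan<byte> span, out ulong deliveryTag, out bool multiple)
    {
        // One explicit check replaces the per-read bounds checks below.
        if (span.Length < BasicAckArgumentsLength)
        {
            deliveryTag = 0;
            multiple = false;
            return false;
        }

        ref byte start = ref MemoryMarshal.GetReference(span);

        // Unchecked read; safe only because of the length check above.
        ulong raw = Unsafe.ReadUnaligned<ulong>(ref start);
        deliveryTag = BitConverter.IsLittleEndian ? BinaryPrimitives.ReverseEndianness(raw) : raw;

        // The flag is assumed to be the lowest bit of the ninth byte.
        multiple = (Unsafe.Add(ref start, 8) & 1) != 0;
        return true;
    }
}
```

A malformed (too short) frame then yields `false` instead of reading past the end of the buffer.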
@bollhals @michaelklishin Reverted the
Generally it is looking good, left a few remarks.
Would you mind sharing the improvements of a simple method? (Disassembly file)
Will do.
Example method improvements from
;BasicAck.WriteArgumentsToBefore(System.Span`1<Byte>)
L0000: push ebp
L0001: mov ebp, esp
L0003: push edi
L0004: push esi
L0005: push ebx
L0006: lea eax, [ebp+8]
L0009: mov edx, [eax]
L000b: mov eax, [eax+4]
L000e: lea esi, [ecx+4]
L0011: mov edi, [esi]
L0013: mov esi, [esi+4]
L0016: bswap edi
L0018: xor ebx, ebx
L001a: bswap esi
L001c: add esi, 0
L001f: adc edi, 0
L0022: cmp eax, 8
L0025: jb short L0062
L0027: mov [edx], esi
L0029: mov [edx+4], edi
L002c: cmp dword ptr [ebp+0xc], 8
L0030: jb short L006d
L0032: mov eax, [ebp+8]
L0035: mov edx, [ebp+0xc]
L0038: sub edx, 8
L003b: add eax, 8
L003e: movzx ecx, byte ptr [ecx+0xc]
L0042: cmp edx, 0
L0045: jbe short L0073
L0047: test ecx, ecx
L0049: jne short L004f
L004b: xor ecx, ecx
L004d: jmp short L0054
L004f: mov ecx, 1
L0054: mov [eax], cl
L0056: mov eax, 9
L005b: pop ebx
L005c: pop esi
L005d: pop edi
L005e: pop ebp
L005f: ret 8
L0062: mov ecx, 0x28
L0067: call System.ThrowHelper.ThrowArgumentOutOfRangeException(System.ExceptionArgument)
L006c: int3
L006d: call System.ThrowHelper.ThrowArgumentOutOfRangeException()
L0072: int3
L0073: call 0x068175d0
L0078: int3
;BasicAck.WriteArgumentsToAfter(System.Span`1<Byte>)
L0000: push edi
L0001: push esi
L0002: lea eax, [esp+0xc]
L0006: mov eax, [eax]
L0008: lea edx, [ecx+4]
L000b: mov esi, [edx]
L000d: mov edx, [edx+4]
L0010: bswap esi
L0012: xor edi, edi
L0014: bswap edx
L0016: add edx, 0
L0019: adc esi, 0
L001c: mov [eax], edx
L001e: mov [eax+4], esi
L0021: lea eax, [esp+0xc]
L0025: mov eax, [eax]
L0027: movzx edx, byte ptr [ecx+0xc]
L002b: mov [eax+8], dl
L002e: mov eax, 9
L0033: pop esi
L0034: pop edi
L0035: ret 8
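As a rough guide to reading the two listings, the C# below is a sketch of code that could produce similar shapes. It is an assumption, not the PR's actual source: the class name and the parameters are hypothetical, and the real methods read the values from fields. The "before" variant pays for three range checks (the 8-byte write, the slice and the index), which match the three throw paths in the first listing. The "after" variant has none and relies on the caller to supply at least 9 bytes.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class BasicAckWriteSketch
{
    // Bounds-checked variant: every span operation validates the length.
    public static int WriteArgumentsToBefore(Span<byte> span, ulong deliveryTag, bool multiple)
    {
        BinaryPrimitives.WriteUInt64BigEndian(span, deliveryTag); // checks span.Length >= 8
        span.Slice(8)[0] = multiple ? (byte)1 : (byte)0;          // checks the slice, then the index
        return 9;
    }

    // Unchecked variant: the caller must guarantee span.Length >= 9.
    public static int WriteArgumentsToAfter(Span<byte> span, ulong deliveryTag, bool multiple)
    {
        ref byte start = ref MemoryMarshal.GetReference(span);

        // AMQP uses network (big-endian) byte order.
        ulong networkOrder = BitConverter.IsLittleEndian
            ? BinaryPrimitives.ReverseEndianness(deliveryTag)
            : deliveryTag;

        Unsafe.WriteUnaligned(ref start, networkOrder);
        Unsafe.Add(ref start, 8) = multiple ? (byte)1 : (byte)0;
        return 9;
    }
}
```

Both variants write the same 9 bytes. Only the unchecked one can corrupt memory if the buffer is too small, which is why the required size has to be calculated before writing.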
Splitting WireFormatting into separate files for clarity.
eb097d2 to 846e590
Hardcoding redundant buffer-length checks since we precalculate the required buffer sizes before writing.
Thank you!
Proposed Changes
Simplified version of my previous PR. This is the biggest change. It removes bounds-checking on spans by using Unsafe.Write methods in the NetworkOrderSerializer classes.
Also split up WireFormatting into different files to make it easier to work with (Shared, Read, Write).
Types of Changes
Checklist
CONTRIBUTING.md document
Further Comments
Created extension methods for most cases to make it easier to work with byte offsets rather than having Span.Slice calls all over the place. We can allow ourselves to do this because we can guarantee that we have the required number of bytes to read/write the frames: we request a certain number of bytes when serializing, and we read the required number of bytes and check frame-end markers when deserializing.
Most span.Slice calls were replaced with ref span.GetOffset calls, and references to the spans themselves were replaced with ref span.GetStart() calls.
This uses ref parameters heavily and results in a hefty speed increase in several cases. A handful of cases turn out slower, but overall performance still improves. There is more room for improvement by simplifying the read/write methods for frames, as I did as an example in BasicAck, and more that can be done in WireFormatting.
.NET Framework 4.8 improvements:
.NET Core 3.1 improvements:
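The GetStart and GetOffset extension methods mentioned under Further Comments might look roughly like the sketch below. Only the two method names come from this PR. Their bodies, the class name and the WriteUInt16 helper are assumptions made for illustration.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class SpanOffsetExtensionsSketch
{
    // Reference to the first byte of the span; no bounds check is involved.
    public static ref byte GetStart(this Span<byte> span)
        => ref MemoryMarshal.GetReference(span);

    // Reference to the byte at 'offset'. Unlike span.Slice(offset), this does
    // not validate the offset, so the caller must know it is in range.
    public static ref byte GetOffset(this Span<byte> span, int offset)
        => ref Unsafe.Add(ref MemoryMarshal.GetReference(span), offset);

    // Hypothetical helper: writes a 16-bit value in network (big-endian) order.
    public static void WriteUInt16(ref byte destination, ushort value)
    {
        ushort networkOrder = BitConverter.IsLittleEndian
            ? BinaryPrimitives.ReverseEndianness(value)
            : value;
        Unsafe.WriteUnaligned(ref destination, networkOrder);
    }
}
```

A caller that has already reserved enough bytes can then write consecutive fields without slicing, for example `SpanOffsetExtensionsSketch.WriteUInt16(ref span.GetStart(), 60);` followed by `SpanOffsetExtensionsSketch.WriteUInt16(ref span.GetOffset(2), 80);`.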