Improve benchmarks and errors #421

puellanivis · 2021-03-17T11:28:03Z

Ahead of some other improvements, I wanted to narrow the benchmarks to only relevant code, and this allow necessitated fixing some tests to go along with it.

Fixes:

errors.Errorf("constant string") have been replaced with errors.New("constant string")
errors.Errorf("constant string: %v", err) have been replaced with errors.Wrap(err, "constant string")
sftp.clientConn.conn.sendPacketTest has been replaced with a timeBombWriter, this removes unnecessary test code from conn
all err != errFakeNet have been replaced with !errors.Is(err, errFakeNet), this works because we are now properly wrapping errors.
Error messages starting with a capital letter have been fixed to use a lower-case character.
%v formatting codes that can be more type-safe have been made to be more type specific.
errors from runSftpClient’s cmd.Wait have been put into an execErr type that wraps the underlying error.

packet_test.go has a lot of changes:

test tables have been moved into their individual functions, rather than globals
more marshaling and unmarshaling tests have been added to verify full byte ordering
where a byte value in a byte-slice is semantically an ASCII character, use 'c' to make the value clear.
marshal/unmarshal string can contain NUL bytes, which can always be a sticky point, so add some tests for that
byte-slices that are marshaled output have semantic line-breaks on each field to make the semantics of the binary data clearer
TestRecvPacket now tests a the short-packet behavior, and the long-packet behavior.
benchmarks on marshaling now marshal to ioutil.Discard as we do not want to benchmark bytes.Buffer, we want to benchmark our marshaling code.

packet_test.go

client_integration_test.go

…as elsewhere

puellanivis · 2021-03-17T13:40:28Z

Benchmarks between this and the current main branch are discontinuous, i.e. the tests have changed enough that any change in values is (mostly†) purely noise.

†: There is a small change here that has a surprising impact on performance. Removing the if sendTestPacket != nil { … } is pretty significant considering its part of every single packet sent. And since this was only ever for test code, reworking the tests to remove it is a straight and simple win.

drakkan · 2021-03-17T16:57:29Z

Benchmarks between this and the current main branch are discontinuous, i.e. the tests have changed enough that any change in values is (mostly†) purely noise.

†: There is a small change here that has a surprising impact on performance. Removing the if sendTestPacket != nil { … } is pretty significant considering its part of every single packet sent. And since this was only ever for test code, reworking the tests to remove it is a straight and simple win.

I have noticed the same thing in other projects in the past, it is precisely for this reason that in #407 I wrote:

If we want to change them we should do it in a way that does not require to checking if an interface is implemented for each packet (for example for writes)

Makefile

greatroar · 2021-03-17T17:45:23Z

Removing the if sendTestPacket != nil { … } is pretty significant considering its part of every single packet sent.

Just out of curiosity, do you have a figure on that? I just took a diff of the asm, and the only differences are a memory access that should go to the L1 cache and a predictable jump.

sftp.go

puellanivis · 2021-03-17T17:56:53Z

Just out of curiosity, do you have a figure on that? I just took a diff of the asm, and the only differences are a memory access that should go to the L1 cache and a predictable jump.

Generating benchmarks now…

client_integration_test.go

puellanivis · 2021-03-17T20:14:58Z

old is with the if, new is without.

name                          old time/op    new time/op    delta
Read1k-12                       52.1ms ± 3%    51.2ms ± 1%  -1.76%  (p=0.027 n=10+8)
Read16k-12                      6.75ms ± 2%    6.54ms ± 1%  -3.12%  (p=0.000 n=8+9)
Read32k-12                      7.08ms ±32%    5.10ms ±10%    ~     (p=0.105 n=10+10)
Read128k-12                     3.49ms ± 4%    3.64ms ± 4%  +4.37%  (p=0.001 n=10+10)
Read512k-12                     3.32ms ± 2%    3.40ms ± 1%  +2.40%  (p=0.000 n=9+8)
Read1MiB-12                     3.70ms ± 2%    3.76ms ± 3%  +1.59%  (p=0.011 n=10+10)
Read4MiB-12                     4.34ms ± 1%    4.41ms ± 3%    ~     (p=0.079 n=9+10)
Read4MiBDelay10Msec-12          69.5ms ± 1%    69.6ms ± 1%    ~     (p=0.436 n=10+10)
Read4MiBDelay50Msec-12           311ms ± 0%     310ms ± 0%    ~     (p=0.218 n=10+10)
Read4MiBDelay150Msec-12          912ms ± 0%     911ms ± 0%  -0.06%  (p=0.005 n=7+10)
Write1k-12                      98.1ms ± 7%    98.8ms ± 6%    ~     (p=0.579 n=10+10)
Write16k-12                     12.2ms ± 6%    12.1ms ± 3%    ~     (p=0.631 n=10+10)
Write32k-12                     9.03ms ± 5%    9.20ms ±10%    ~     (p=0.661 n=9+10)
Write128k-12                    9.05ms ± 7%    9.16ms ± 8%    ~     (p=0.549 n=9+10)
Write512k-12                    9.30ms ± 5%    9.51ms ± 9%    ~     (p=0.156 n=9+10)
Write1MiB-12                    9.30ms ± 6%    9.17ms ± 5%    ~     (p=0.604 n=10+9)
Write4MiB-12                    9.41ms ± 7%    9.45ms ±12%    ~     (p=0.739 n=10+10)
Write4MiBDelay10Msec-12          3.37s ± 0%     3.37s ± 0%    ~     (p=0.549 n=10+9)
Write4MiBDelay50Msec-12          16.3s ± 0%     16.3s ± 0%    ~     (p=0.400 n=10+9)
Write4MiBDelay150Msec-12         48.6s ± 0%     48.6s ± 0%    ~     (p=0.393 n=10+10)

I‘ve cut out most of the metrics, because they’re largely the same. Actually, things are looking generally so fast that the difference that are there look to be mostly just noise.

I would be more inclined to trust the smaller buffer ones, as they have more time/op (more transfers means more of this one tiny code change having any effect). But then also maybe confirmation bias.

I do think the most interesting thing here is the massive variance at the 32k buffer for both the read and write. Even though it’s a massive 2ms to the new one, the variance is so massive there’s no way to say for sure it’s because of this code change.

So, nothing spectacular, but I still think “nocode is best code” is still controlling anyways. Nothing massive, but still a better choice anyways. (More interesting, removing this opens up the possibility of moving the mutex later to just the writing, meaning the packet marshal times can potentially overlap. That might be where I saw the big boost that I spoke of above.)

…to gitignore

puellanivis · 2021-03-17T20:21:31Z

If we want to change them we should do it in a way that does not require to checking if an interface is implemented for each packet (for example for writes)

🤔 A good idea. Have all of them implement marshalPacket rather than MarshalBinary and then drop the test. Good idea. I will probably work it into my “rewrite all the unmarshal code”. Since I’m there futzing with all the packet marshal/unmarshal code already.

puellanivis · 2021-03-19T18:10:19Z

With no objections, let’s merge this?

Improve benchmarks and errors

f1e28f8

puellanivis commented Mar 17, 2021

View reviewed changes

packet_test.go Show resolved Hide resolved

packet_test.go Outdated Show resolved Hide resolved

packet_test.go Show resolved Hide resolved

client_integration_test.go Outdated Show resolved Hide resolved

address my own code review comments

39e1161

puellanivis commented Mar 17, 2021

View reviewed changes

client_integration_test.go Outdated Show resolved Hide resolved

remove writeToBuffer, the bytes.Buffer.Grow I saw in the memprofile w…

b22b9e4

…as elsewhere

Add convenient benchmark Makefile rule

325cdac

greatroar reviewed Mar 17, 2021

View reviewed changes

Makefile Show resolved Hide resolved

greatroar reviewed Mar 17, 2021

View reviewed changes

sftp.go Outdated Show resolved Hide resolved

greatroar reviewed Mar 17, 2021

View reviewed changes

client_integration_test.go Outdated Show resolved Hide resolved

puellanivis added 3 commits March 17, 2021 20:17

split benchmark and benchmark_w_memprofile, include memprofile files …

32f98f3

…to gitignore

errors.Errorf over fmt.Errorf

91163e4

give the bytes.Buffer a preallocated slice to use for less variance

addaabd

puellanivis merged commit d26c4bc into master Mar 20, 2021

puellanivis deleted the benchmarks-and-errors branch March 20, 2021 03:11

puellanivis restored the benchmarks-and-errors branch March 20, 2021 03:11

puellanivis deleted the benchmarks-and-errors branch March 20, 2021 03:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve benchmarks and errors #421

Improve benchmarks and errors #421

puellanivis commented Mar 17, 2021 •

edited

Loading

puellanivis commented Mar 17, 2021

drakkan commented Mar 17, 2021 •

edited

Loading

greatroar commented Mar 17, 2021

puellanivis commented Mar 17, 2021

puellanivis commented Mar 17, 2021

puellanivis commented Mar 17, 2021

puellanivis commented Mar 19, 2021

Improve benchmarks and errors #421

Improve benchmarks and errors #421

Conversation

puellanivis commented Mar 17, 2021 • edited Loading

puellanivis commented Mar 17, 2021

drakkan commented Mar 17, 2021 • edited Loading

greatroar commented Mar 17, 2021

puellanivis commented Mar 17, 2021

puellanivis commented Mar 17, 2021

puellanivis commented Mar 17, 2021

puellanivis commented Mar 19, 2021

puellanivis commented Mar 17, 2021 •

edited

Loading

drakkan commented Mar 17, 2021 •

edited

Loading