Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve benchmarks and errors #421

Merged
merged 7 commits into from
Mar 20, 2021
Merged

Improve benchmarks and errors #421

merged 7 commits into from
Mar 20, 2021

Conversation

puellanivis
Copy link
Collaborator

@puellanivis puellanivis commented Mar 17, 2021

Ahead of some other improvements, I wanted to narrow the benchmarks to only relevant code, and this allow necessitated fixing some tests to go along with it.

Fixes:

  • errors.Errorf("constant string") have been replaced with errors.New("constant string")
  • errors.Errorf("constant string: %v", err) have been replaced with errors.Wrap(err, "constant string")
  • sftp.clientConn.conn.sendPacketTest has been replaced with a timeBombWriter, this removes unnecessary test code from conn
  • all err != errFakeNet have been replaced with !errors.Is(err, errFakeNet), this works because we are now properly wrapping errors.
  • Error messages starting with a capital letter have been fixed to use a lower-case character.
  • %v formatting codes that can be more type-safe have been made to be more type specific.
  • errors from runSftpClient’s cmd.Wait have been put into an execErr type that wraps the underlying error.

packet_test.go has a lot of changes:

  • test tables have been moved into their individual functions, rather than globals
  • more marshaling and unmarshaling tests have been added to verify full byte ordering
  • where a byte value in a byte-slice is semantically an ASCII character, use 'c' to make the value clear.
  • marshal/unmarshal string can contain NUL bytes, which can always be a sticky point, so add some tests for that
  • byte-slices that are marshaled output have semantic line-breaks on each field to make the semantics of the binary data clearer
  • TestRecvPacket now tests a the short-packet behavior, and the long-packet behavior.
  • benchmarks on marshaling now marshal to ioutil.Discard as we do not want to benchmark bytes.Buffer, we want to benchmark our marshaling code.

packet_test.go Show resolved Hide resolved
packet_test.go Outdated Show resolved Hide resolved
packet_test.go Show resolved Hide resolved
client_integration_test.go Outdated Show resolved Hide resolved
@puellanivis
Copy link
Collaborator Author

Benchmarks between this and the current main branch are discontinuous, i.e. the tests have changed enough that any change in values is (mostly†) purely noise.

†: There is a small change here that has a surprising impact on performance. Removing the if sendTestPacket != nil { … } is pretty significant considering its part of every single packet sent. And since this was only ever for test code, reworking the tests to remove it is a straight and simple win.

@drakkan
Copy link
Collaborator

drakkan commented Mar 17, 2021

Benchmarks between this and the current main branch are discontinuous, i.e. the tests have changed enough that any change in values is (mostly†) purely noise.

†: There is a small change here that has a surprising impact on performance. Removing the if sendTestPacket != nil { … } is pretty significant considering its part of every single packet sent. And since this was only ever for test code, reworking the tests to remove it is a straight and simple win.

I have noticed the same thing in other projects in the past, it is precisely for this reason that in #407 I wrote:

If we want to change them we should do it in a way that does not require to checking if an interface is implemented for each packet (for example for writes)

Makefile Show resolved Hide resolved
@greatroar
Copy link
Contributor

Removing the if sendTestPacket != nil { … } is pretty significant considering its part of every single packet sent.

Just out of curiosity, do you have a figure on that? I just took a diff of the asm, and the only differences are a memory access that should go to the L1 cache and a predictable jump.

sftp.go Outdated Show resolved Hide resolved
@puellanivis
Copy link
Collaborator Author

Just out of curiosity, do you have a figure on that? I just took a diff of the asm, and the only differences are a memory access that should go to the L1 cache and a predictable jump.

Generating benchmarks now…

@puellanivis
Copy link
Collaborator Author

old is with the if, new is without.

name                          old time/op    new time/op    delta
Read1k-12                       52.1ms ± 3%    51.2ms ± 1%  -1.76%  (p=0.027 n=10+8)
Read16k-12                      6.75ms ± 2%    6.54ms ± 1%  -3.12%  (p=0.000 n=8+9)
Read32k-12                      7.08ms ±32%    5.10ms ±10%    ~     (p=0.105 n=10+10)
Read128k-12                     3.49ms ± 4%    3.64ms ± 4%  +4.37%  (p=0.001 n=10+10)
Read512k-12                     3.32ms ± 2%    3.40ms ± 1%  +2.40%  (p=0.000 n=9+8)
Read1MiB-12                     3.70ms ± 2%    3.76ms ± 3%  +1.59%  (p=0.011 n=10+10)
Read4MiB-12                     4.34ms ± 1%    4.41ms ± 3%    ~     (p=0.079 n=9+10)
Read4MiBDelay10Msec-12          69.5ms ± 1%    69.6ms ± 1%    ~     (p=0.436 n=10+10)
Read4MiBDelay50Msec-12           311ms ± 0%     310ms ± 0%    ~     (p=0.218 n=10+10)
Read4MiBDelay150Msec-12          912ms ± 0%     911ms ± 0%  -0.06%  (p=0.005 n=7+10)
Write1k-12                      98.1ms ± 7%    98.8ms ± 6%    ~     (p=0.579 n=10+10)
Write16k-12                     12.2ms ± 6%    12.1ms ± 3%    ~     (p=0.631 n=10+10)
Write32k-12                     9.03ms ± 5%    9.20ms ±10%    ~     (p=0.661 n=9+10)
Write128k-12                    9.05ms ± 7%    9.16ms ± 8%    ~     (p=0.549 n=9+10)
Write512k-12                    9.30ms ± 5%    9.51ms ± 9%    ~     (p=0.156 n=9+10)
Write1MiB-12                    9.30ms ± 6%    9.17ms ± 5%    ~     (p=0.604 n=10+9)
Write4MiB-12                    9.41ms ± 7%    9.45ms ±12%    ~     (p=0.739 n=10+10)
Write4MiBDelay10Msec-12          3.37s ± 0%     3.37s ± 0%    ~     (p=0.549 n=10+9)
Write4MiBDelay50Msec-12          16.3s ± 0%     16.3s ± 0%    ~     (p=0.400 n=10+9)
Write4MiBDelay150Msec-12         48.6s ± 0%     48.6s ± 0%    ~     (p=0.393 n=10+10)

I‘ve cut out most of the metrics, because they’re largely the same. Actually, things are looking generally so fast that the difference that are there look to be mostly just noise.

I would be more inclined to trust the smaller buffer ones, as they have more time/op (more transfers means more of this one tiny code change having any effect). But then also maybe confirmation bias.

I do think the most interesting thing here is the massive variance at the 32k buffer for both the read and write. Even though it’s a massive 2ms to the new one, the variance is so massive there’s no way to say for sure it’s because of this code change.

So, nothing spectacular, but I still think “nocode is best code” is still controlling anyways. Nothing massive, but still a better choice anyways. (More interesting, removing this opens up the possibility of moving the mutex later to just the writing, meaning the packet marshal times can potentially overlap. That might be where I saw the big boost that I spoke of above.)

@puellanivis
Copy link
Collaborator Author

If we want to change them we should do it in a way that does not require to checking if an interface is implemented for each packet (for example for writes)

🤔 A good idea. Have all of them implement marshalPacket rather than MarshalBinary and then drop the test. Good idea. I will probably work it into my “rewrite all the unmarshal code”. Since I’m there futzing with all the packet marshal/unmarshal code already.

@puellanivis
Copy link
Collaborator Author

With no objections, let’s merge this?

@puellanivis puellanivis merged commit d26c4bc into master Mar 20, 2021
@puellanivis puellanivis deleted the benchmarks-and-errors branch March 20, 2021 03:11
@puellanivis puellanivis restored the benchmarks-and-errors branch March 20, 2021 03:11
@puellanivis puellanivis deleted the benchmarks-and-errors branch March 20, 2021 03:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants