Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unnecessary allocations and Write calls. #215

Merged
merged 1 commit into from
Jun 27, 2023

Conversation

NullHypothesis
Copy link
Contributor

This commit improves performance by 1) moving an unnecessary allocation to outside the hot loop and by 2) eliminating an unnecessary Write call.

NullHypothesis pushed a commit to brave/nitriding-daemon that referenced this pull request May 8, 2023
This commit improves the performance of the rx and tx loop by 1) moving
allocations to outside the loop and by 2) eliminating an unnecessary
Write call.

As long as gvproxy implements the same improvements, this should result
in ~20% more reqs/sec.

For posterity, here's the upstream PR for the gvproxy side of things:
containers/gvisor-tap-vsock#215
NullHypothesis pushed a commit to brave/nitriding-daemon that referenced this pull request May 8, 2023
This commit improves the performance of the rx and tx loop by 1) moving
allocations to outside the loop and by 2) eliminating an unnecessary
Write call.

As long as gvproxy implements the same improvements, this should result
in ~20% more reqs/sec.

For posterity, here's the upstream PR for the gvproxy side of things:
containers/gvisor-tap-vsock#215
NullHypothesis pushed a commit to brave/nitriding-daemon that referenced this pull request May 8, 2023
This commit improves the performance of the rx and tx loop by 1) moving
allocations to outside the loop and by 2) eliminating an unnecessary
Write call.

As long as gvproxy implements the same improvements, this should result
in ~20% more reqs/sec.

For posterity, here's the upstream PR for the gvproxy side of things:
containers/gvisor-tap-vsock#215
@baude
Copy link
Member

baude commented May 9, 2023

/approve

@baude
Copy link
Member

baude commented May 9, 2023

please sign your commit. i.e.

git commit --amend -s

@openshift-ci openshift-ci bot added the approved label May 9, 2023
}
if _, err := conn.Write(frame); err != nil {
errCh <- errors.Wrap(err, "cannot write packet to socket")
if _, err := conn.Write(append(size, frame...)); err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly puzzled by this one. The append involves a memory copy and possibly a memory allocation, while at the same time you move another allocation out of the loop. Is conn.Write() a lot more expensive than a memory allocation?

Copy link
Contributor Author

@NullHypothesis NullHypothesis May 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. Over here, the overhead of the append call is less than the overhead of two Write calls: I saw a ~7% increase in both throughput (measured via iperf3) and requests per second (measured via a custom test bed). Also, I noticed that the buffer pre-allocation in rxStream made things worse, so I removed that code. That said, we could get rid of the append by adding a re-usable buffer but that requires a more involved fix.

Regarding the changes to main_linux.go: I haven't tested these changes directly. Instead, I tested our modified fork: https://github.com/brave/nitriding-daemon/blob/master/proxy.go#L124-L185 I'm happy to refactor main_linux.go's rx/tx loop to resemble what we've done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other performant alternate for append might be https://pkg.go.dev/net#Buffers.WriteTo

This makes use of writev syscall to send multiple buffers over a single write.

We can give a try with this and see if it improves. I noticed that there is a point in document that buffers are optimised with writev for some connections so need to verify if it actually improves for us

This commit improves networking performance by 1) moving an unnecessary
allocation to outside the hot loop and by 2) eliminating an unnecessary
Write call.

This result in ~7% faster throughput as measured via iperf3 and a custom
Web service to measure requests per second.  Note that this is *despite*
the append call.

Signed-off-by: Philipp Winter <phw@brave.com>
@cfergeau
Copy link
Collaborator

I've ran some testing over this, iperf3 was improved/unchanged, and this does not appear to break networking, so let's merge this!
/approve
/lgtm

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 27, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: baude, cfergeau, NullHypothesis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cfergeau cfergeau merged commit 9a6dda3 into containers:main Jun 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants