netlink: performance bottleneck due to single goroutine in high performance applications #154
Are you setting DisableNSLockThread? If so, I would be interested in seeing what optimizations can be made on that path and would welcome PRs. I haven't had a need for higher performance myself yet. If you truly need maximum performance, you could also use the SyscallConn method and deal with the socket directly.
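For reference, a minimal sketch of both suggestions, assuming a host-namespace consumer; Config.DisableNSLockThread and Conn.SyscallConn are the API surface mentioned above, while the protocol choice and the read loop are purely illustrative:

```go
package main

import (
	"log"

	"github.com/mdlayher/netlink"
	"golang.org/x/sys/unix"
)

func main() {
	// Skip the locked worker thread; only appropriate when the process stays
	// in its original network namespace (Config.NetNS left unset).
	c, err := netlink.Dial(unix.NETLINK_GENERIC, &netlink.Config{
		DisableNSLockThread: true,
	})
	if err != nil {
		log.Fatalf("failed to dial netlink: %v", err)
	}
	defer c.Close()

	// For maximum performance, bypass Conn's Receive path and work directly
	// against the file descriptor via the raw connection.
	rc, err := c.SyscallConn()
	if err != nil {
		log.Fatalf("failed to get raw conn: %v", err)
	}

	buf := make([]byte, 64*1024)
	var (
		n       int
		readErr error
	)
	err = rc.Read(func(fd uintptr) bool {
		n, _, readErr = unix.Recvfrom(int(fd), buf, unix.MSG_DONTWAIT)
		// Returning false asks the runtime poller to wait for readability
		// and invoke this function again.
		return !(readErr == unix.EAGAIN || readErr == unix.EWOULDBLOCK)
	})
	if err != nil || readErr != nil {
		log.Fatalf("raw read failed: %v / %v", err, readErr)
	}
	log.Printf("read %d bytes directly from the netlink socket", n)
}
```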
In this case, yes. This is a privileged per-host conntrack monitor which runs in the root net namespace.
I'm back on my computer today and had a chance to think about this more. First, is your code open source? That would help me understand what it is you're doing. Second, the original reasons for the goroutine locking are here: lines 655 to 674 in 99e7bba.
I suspect that at the time, I may have had another bug that caused issues with routing messages between sockets. If the caller disables thread locking via the config, perhaps we can allow multiple goroutines to directly access the socket simultaneously without any issues. That said, I haven't bothered to challenge the status quo as my needs are a lot simpler than many of the nfnetlink use cases that have popped up with folks like yourself, @ti-mo, and @florianl. If we can prove that it's safe to do so when network namespace manipulation is not required, I'd love to see a free performance boost for folks who don't need that functionality.
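For readers without the source handy, the pattern those lines implement is roughly the one sketched below: a single goroutine with a locked OS thread that serializes every operation. This is a paraphrase of the idea, not the library's actual code:

```go
package locked

import "runtime"

// lockedSocket funnels every syscall through one goroutine whose OS thread is
// locked, so any namespace state tied to that thread stays with the socket.
type lockedSocket struct {
	funcs chan func()
	done  chan struct{}
}

func newLockedSocket() *lockedSocket {
	s := &lockedSocket{
		funcs: make(chan func()),
		done:  make(chan struct{}),
	}
	go func() {
		// Pin the OS thread for the lifetime of the socket.
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()
		for {
			select {
			case f := <-s.funcs:
				f()
			case <-s.done:
				return
			}
		}
	}()
	return s
}

// do runs fn on the locked thread and blocks until it finishes. The channel
// round-trip and closure allocation per call are the overhead that shows up
// in the profiles discussed in this issue.
func (s *lockedSocket) do(fn func()) {
	ret := make(chan struct{})
	s.funcs <- func() {
		fn()
		close(ret)
	}
	<-ret
}

func (s *lockedSocket) close() { close(s.done) }
```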
I've never personally encountered the issue @mdlayher describes about messages being routed to the wrong sockets; I've always wondered what the exact symptoms would be. I suspect it would look like the situation I'm describing below if network namespaces were involved. The reason behind locking a worker thread to a network namespace is simple, but requires some context (which I assume @smarterclayton has 🙂). For a deep-dive, read https://www.weave.works/blog/linux-namespaces-golang-followup and related articles. tl;dr: setns(2) only switches the network namespace of the calling OS thread, not the whole process.
Because of the way Go schedules goroutines onto OS threads, there is absolutely no guarantee that a goroutine keeps running on the same OS thread unless runtime.LockOSThread() is called. Imagine your program made the runtime create 10 OS threads and your code switches only one of them into another namespace: the goroutine that did so can later be scheduled onto any of the other nine threads, which are still in the original namespace.

That said, it's no surprise to me that shoving a closure down a channel and waiting for results carries a CPU overhead, because it simply needs to chase more pointers. Perhaps we could extend the configuration so this machinery can be bypassed when namespace handling isn't needed.

@mdlayher Could you somehow dig up or reproduce your findings from back in the day around sockets getting mixed up?
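To make that concrete, here is the kind of namespace switch the locking guards against going wrong; a sketch using golang.org/x/sys/unix, with the namespace path purely illustrative:

```go
package nsutil

import (
	"os"
	"runtime"

	"golang.org/x/sys/unix"
)

// inNetNS runs fn with the calling OS thread switched into the network
// namespace at nsPath (for example "/var/run/netns/foo"; illustrative only).
func inNetNS(nsPath string, fn func() error) error {
	// Without this lock, the scheduler may move the goroutine to another OS
	// thread at any point: one that never called setns() and is therefore
	// still in the host namespace.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	host, err := os.Open("/proc/self/ns/net")
	if err != nil {
		return err
	}
	defer host.Close()

	target, err := os.Open(nsPath)
	if err != nil {
		return err
	}
	defer target.Close()

	// setns(2) affects only the calling thread, not the whole process.
	if err := unix.Setns(int(target.Fd()), unix.CLONE_NEWNET); err != nil {
		return err
	}
	// Return this thread to the host namespace before it is unlocked.
	defer unix.Setns(int(host.Fd()), unix.CLONE_NEWNET)

	return fn()
}
```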
@mdlayher Instead of opening a new issue as you suggested over the Slack channel, I think my issue is in the same area as the one reported by @smarterclayton. Here is the repo with my code: https://github.com/sbezverk/nfproxy
Here is the pprof data I collected.
I have a couple of ideas, but ultimately I really need a program that can reproduce these types of profiles to verify whether any of them actually pan out.
Ideally I need an example program which runs on a typical Linux machine without something like Kubernetes running.
@mdlayher Here is the e2e test I use to exercise nftableslib functionality: https://github.com/sbezverk/nftableslib/blob/master/cmd/e2e/e2e.go. It is basic, but it does test connectivity as well. Please let me know if it works for you.
@mdlayher Reminder that you implemented this thread locking because of https://lists.infradead.org/pipermail/libnl/2017-February/002293.html. The behaviour you reported in this email thread is what we need a reproducer for. If we have a reproducer, we can simply try removing the single thread-locked executor goroutine and see what breaks. I wouldn't touch the actual syscall logic for now.
The thread locking logic remains necessary due to network namespace support. When that is disabled via config, we can explore our options.
@mdlayher +1 for the ability to disable it via a parameter. Some applications, for example nfproxy, operate in the host namespace, so hopefully disabling "network namespace support" might give an extra performance boost.
@sbezverk That config option exists today in the netlink.Config struct. Have you tried it? There's probably still more work to be done on optimizing that path, though.
@mdlayher I was not aware; I will give it a try and let you know. Thank you.
@mdlayher Tested, and I see on average a 2x improvement; see the latest profile.
Here is what I did in the nftables library to activate it (a sketch of the idea is included below).
What do you think: is it a safe assumption that if netns == 0, it is safe to disable NSLockThread?
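For context, a minimal sketch of what such activation could look like; the netlink.Config fields are real, but the helper itself is hypothetical and not the actual nftableslib change. Since Dial rejects setting NetNS and DisableNSLockThread together, the two are kept mutually exclusive here:

```go
package nft

import (
	"github.com/mdlayher/netlink"
	"golang.org/x/sys/unix"
)

// dialNetlink is a hypothetical helper: when no target namespace is requested
// (netns == 0), skip the locked worker thread; otherwise dial into the given
// namespace and keep the default locking behaviour.
func dialNetlink(netns int) (*netlink.Conn, error) {
	cfg := &netlink.Config{}
	if netns == 0 {
		cfg.DisableNSLockThread = true
	} else {
		cfg.NetNS = netns
	}
	return netlink.Dial(unix.NETLINK_NETFILTER, cfg)
}
```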
Dial will return an error if NetNS and DisableNSLockThread are both set. See the docs: https://godoc.org/github.com/mdlayher/netlink#Config.
@mdlayher For network namespaces, only socket creation needs to take place on a locked thread in another namespace. Once the socket has been created, the thread can be safely reassigned to the main namespace. As long as you hold onto the fd, the socket can be written to from the main namespace. You only implemented thread locking to work around https://lists.infradead.org/pipermail/libnl/2017-February/002293.html.
This is news to me. Do you have documentation I can read?
I don't think this is explicitly documented anywhere, but I remember talking about this with you a couple of months ago after I implemented these e2e integration tests in conntracct: https://github.com/ti-mo/conntracct/blob/286c127/pkg/bpf/integration_test.go#L397-L401. This test locks the OS thread, creates and enters a new netns with the thread, brings up interfaces, sets up some nft rules, opens two UDP sockets in the netns, returns the thread to the main namespace, unlocks the thread, and returns. If you read the rest of the test, those sockets are still used to generate traffic after the thread has returned to the main namespace. That makes me conclude it's possible to read/write to sockets in other namespaces as long as the fd is held onto. Otherwise, any socket operations would fail as soon as the creating thread returns to the main namespace. This should be no different for netlink sockets; someone just needs to dive in and confirm this. So: https://lists.infradead.org/pipermail/libnl/2017-February/002293.html should be the only reason the thread locking exists, and I'm sure it was already there when I contributed netns support. 🙂 We should try to reproduce the phenomenon you described in that thread and try to narrow down the root cause, since it might mean that we could remove a lot of code and improve efficiency by a ton.
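A sketch of the conclusion drawn from that test (an illustration of the pattern, not the conntracct code): create the socket while the locked thread is inside the target namespace, move the thread back, and keep using the socket through its fd afterwards.

```go
package nsutil

import (
	"net"
	"os"
	"runtime"

	"golang.org/x/sys/unix"
)

// listenUDPInNS opens a UDP socket inside the netns at nsPath. After this
// function returns, any goroutine can read from and write to the socket even
// though every OS thread is back in the host namespace: the namespace is
// bound to the socket at creation time and travels with its fd.
func listenUDPInNS(nsPath string) (*net.UDPConn, error) {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	host, err := os.Open("/proc/self/ns/net")
	if err != nil {
		return nil, err
	}
	defer host.Close()

	target, err := os.Open(nsPath)
	if err != nil {
		return nil, err
	}
	defer target.Close()

	if err := unix.Setns(int(target.Fd()), unix.CLONE_NEWNET); err != nil {
		return nil, err
	}
	// Move the thread back to the host namespace before unlocking it,
	// regardless of whether the listen below succeeds.
	defer unix.Setns(int(host.Fd()), unix.CLONE_NEWNET)

	// The socket is created in, and therefore stays in, the target namespace.
	return net.ListenUDP("udp", &net.UDPAddr{IP: net.IPv4zero, Port: 0})
}
```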
FWIW, I'm working on this in #171. With the release of a possibly-breaking new version on the horizon, it's a good time to revisit this behaviour.
I'm pretty sure #171 and the newly released v1.2.0 will make a significant difference for this issue.
Thinking about it more, this issue is really old and so much has changed, so I'm just going to close it out. We can always open a new one if there are more issues to be handled.
In scenarios where a locked OS thread isn't necessary, is the goroutine in sysSocket.write/read still required? Perhaps I'm missing a subtlety of M:N scheduling with the use of Recvmsg and netlink, but my expectation was that, with the proper descriptor, any goroutine could read correctly from that socket if the netns is consistent. Is that not the case?
I ask because in high-traffic systems I'm seeing a fair amount of Go runtime scheduler time in profiles when GOMAXPROCS>1 on Go 1.12, due to the forced goroutine switch, and in testing lightly loaded setups with a rate of 100-200 netlink multicast events per second, I saw a 5-10% CPU improvement if the goroutine is skipped (some from not needing to close the channel or defer). I am not a socket expert, of course, so I figured it was easier to ask.