Conntrack XML parsing is consuming a lot of CPU #1991
Comments
@alban suggests replacing the conntrack command (and, in turn, the parsing of its XML output) with talking to the kernel directly through a Netlink socket. |
Found https://github.com/typetypetype/conntrack but the status is not really reassuring ... |
Using netlink is going to be a bit more work than I originally expected due to the incomplete Golang support of NETLINK_NETFILTER and my almost total ignorance of netlink until yesterday.
My plan is to fork https://github.com/typetypetype/conntrack/ to reuse all the parsing but it may be more sensible to bite the bullet and implement vishvananda/netlink#171 (which I think would take a much longer time). @awh Thoughts? |
After an offline discussion with @awh I've confirmed that supporting a new netlink protocol is a non-trivial task and that we should instead try improving parsing first. I will proceed in the following order, reevaluating performance after each step:
|
I have failed to find an alternative XML library at all. Moving on to parse the conntrack text manually. |
Huh, I thought codec did it, but it doesn't. https://godoc.org/github.com/ugorji/go/codec
|
Parsing the line-based output of conntrack is a pain in the ass because some (at least 4) fields are optional. This means I cannot simply use Scanf. I started coding a manual parser for it, but I am not sure it's worth the effort compared to spending time on https://github.com/vishvananda/netlink |
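To illustrate why Scanf does not fit and what a manual parser might look like, here is a minimal sketch that tokenizes a line and classifies each token instead of relying on fixed positions. The line layout it assumes (protocol name, protocol number, timeout, an optional bare state token, `key=value` pairs, and bracketed flags such as `[ASSURED]`) is an assumption based on typical `conntrack -L` output, and the `Flow`, `Tuple` and `parseLine` names are hypothetical, not part of any existing code.

```go
package main

import (
	"fmt"
	"strings"
)

// Tuple holds one direction of a flow (hypothetical type for this sketch).
type Tuple struct {
	Src, Dst, Sport, Dport string
}

// Flow is a simplified, hypothetical view of one conntrack entry.
type Flow struct {
	Proto    string
	State    string            // empty when the line has no state (e.g. UDP)
	Flags    []string          // bracketed tokens such as ASSURED or UNREPLIED
	Original Tuple             // first src/dst/sport/dport group
	Reply    Tuple             // second group, started by the second src=
	Extra    map[string]string // any other key=value pair (last one wins)
}

// parseLine classifies tokens by shape, so optional fields can be
// present or absent without breaking the parse.
func parseLine(line string) (Flow, error) {
	fields := strings.Fields(line)
	if len(fields) < 3 {
		return Flow{}, fmt.Errorf("short line: %q", line)
	}
	// Assumed layout: fields[0] protocol name, fields[1] protocol number,
	// fields[2] timeout; everything after that is optional or keyed.
	f := Flow{Proto: fields[0], Extra: map[string]string{}}
	cur := &f.Original
	seenSrc := false
	for _, tok := range fields[3:] {
		switch {
		case strings.HasPrefix(tok, "[") && strings.HasSuffix(tok, "]"):
			f.Flags = append(f.Flags, strings.Trim(tok, "[]"))
		case strings.Contains(tok, "="):
			kv := strings.SplitN(tok, "=", 2)
			switch kv[0] {
			case "src":
				if seenSrc {
					cur = &f.Reply // a second src= starts the reply tuple
				}
				seenSrc = true
				cur.Src = kv[1]
			case "dst":
				cur.Dst = kv[1]
			case "sport":
				cur.Sport = kv[1]
			case "dport":
				cur.Dport = kv[1]
			default:
				f.Extra[kv[0]] = kv[1]
			}
		default:
			f.State = tok // bare token: treated as the connection state
		}
	}
	return f, nil
}

func main() {
	f, err := parseLine("udp      17 29 src=10.0.0.1 dst=10.0.0.3 sport=5353 dport=53 [UNREPLIED] src=10.0.0.3 dst=10.0.0.1 sport=53 dport=5353 mark=0 use=1")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", f)
}
```

This still allocates one string per token, so it addresses the optional-field problem only; whether it is actually cheaper than the XML path would need measuring.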
Does https://github.com/lucsky/go-exml help? Do you know what bit of the xml parsing is the overhead?
|
IIRC string allocation/gc is the issue, due to golang's lack of string interning / symbols and of a proper generational GC. |
If that is the case, you could parse the IPs straight into uint32s (or net.IPs), and the Proto, Direction and State into an enum etc
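A rough sketch of that idea, assuming IPv4-only addresses and a hand-written dotted-quad parser so that no intermediate string or slice is allocated on the success path. The `Proto` type, its constants, and the `parseProto` and `parseIPv4` helpers are hypothetical names made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Proto is a small enum replacing the protocol string (hypothetical).
type Proto uint8

const (
	ProtoUnknown Proto = iota
	ProtoTCP
	ProtoUDP
)

// parseProto maps the protocol name to its enum value.
func parseProto(s string) Proto {
	switch s {
	case "tcp":
		return ProtoTCP
	case "udp":
		return ProtoUDP
	}
	return ProtoUnknown
}

var errBadIPv4 = errors.New("not a dotted-quad IPv4 address")

// parseIPv4 converts "a.b.c.d" into a big-endian uint32 without
// building any intermediate string or byte slice.
func parseIPv4(s string) (uint32, error) {
	var ip uint32
	var octet, digits, dots int
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c >= '0' && c <= '9':
			octet = octet*10 + int(c-'0')
			digits++
			if octet > 255 || digits > 3 {
				return 0, errBadIPv4
			}
		case c == '.':
			if digits == 0 || dots == 3 {
				return 0, errBadIPv4
			}
			ip = ip<<8 | uint32(octet)
			octet, digits = 0, 0
			dots++
		default:
			return 0, errBadIPv4
		}
	}
	if digits == 0 || dots != 3 {
		return 0, errBadIPv4
	}
	return ip<<8 | uint32(octet), nil
}

func main() {
	ip, err := parseIPv4("10.0.0.1")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%#08x %d\n", ip, parseProto("tcp"))
}
```

IPv6 flows would need a wider representation (for example a 16-byte array), which this sketch leaves out.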
|
Note that in previous cases of encoding / decoding overhead, it's been the reflection that's the problem, not allocations (cf report structure).
|
As I understand it, for the xml parsing the string allocation/gc happens mainly due to tags and attribute keys, e.g. if you have a document containing 500 <meta> tags, that's 500 allocations, and 500 items of garbage, not one. |
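For comparison, this is roughly what manual interning could look like in a hand-written parser that works on byte slices: a sketch only, with a hypothetical `interner` type. It relies on the Go compiler's optimization that a map lookup keyed by `string(b)` does not allocate, so only the first occurrence of each name costs an allocation. Nothing in this thread establishes that encoding/xml offers a hook for plugging this in.

```go
package main

import "fmt"

// interner maps a string to its single canonical copy (hypothetical helper).
type interner map[string]string

// intern returns the canonical string for b, allocating only the
// first time a given byte sequence is seen.
func (in interner) intern(b []byte) string {
	// The gc compiler avoids allocating for string(b) when it is used
	// directly as a map lookup key.
	if s, ok := in[string(b)]; ok {
		return s
	}
	s := string(b) // first occurrence: one allocation
	in[s] = s
	return s
}

func main() {
	in := interner{}
	tag := []byte("meta")
	for i := 0; i < 500; i++ {
		_ = in.intern(tag) // 500 tags, one stored string
	}
	fmt.Println(len(in))
}
```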
|
I could create a parser with something like https://github.com/prataprc/goparsec. Maybe I could also give https://github.com/moovweb/gokogiri a try. |
It doesn't promise any performance improvements, and it sits on top of the standard Go encoding/xml Decoder, so I am doubtful.
See #1985 (comment)