Conntrack XML parsing is consuming a lot of CPU #1991

Closed
2opremio opened this issue Nov 7, 2016 · 15 comments

Comments

@2opremio
Contributor

2opremio commented Nov 7, 2016

See #1985 (comment)

@2opremio 2opremio added the performance Excessive resource usage and latency; usually a bug or chore label Nov 7, 2016
@2opremio 2opremio added this to the re:Invent milestone Nov 7, 2016
@rade rade added the chore Related to fix/refinement/improvement of end user or new/existing developer functionality label Nov 7, 2016
@2opremio
Contributor Author

2opremio commented Nov 7, 2016

@alban suggests replacing the conntrack command (and, in turn, the parsing of its XML output) by talking to the kernel directly through a Netlink socket.

@2opremio 2opremio mentioned this issue Nov 7, 2016
17 tasks
@2opremio
Contributor Author

2opremio commented Nov 9, 2016

Found https://github.com/typetypetype/conntrack but its status is not really reassuring ...

@2opremio
Contributor Author

2opremio commented Nov 10, 2016

Using netlink is going to be a bit more work than I originally expected, due to the incomplete Golang support of NETLINK_NETFILTER and my almost total ignorance about netlink until yesterday.

My plan is to fork https://github.com/typetypetype/conntrack/ to reuse all the parsing, but it may be more sensible to bite the bullet and implement vishvananda/netlink#171 (which I think would take much longer). @awh Thoughts?

@2opremio
Contributor Author

2opremio commented Nov 15, 2016

After an offline discussion with @awh I've confirmed that supporting a new netlink protocol is a non-trivial task and that we should instead try improving parsing first. I will proceed in the following order, reevaluating performance after each step:

  • Try to find a better performing XML library
  • Try parsing the textual output of conntrack (i.e. without -o xml)
  • Use netlink

@2opremio
Contributor Author

Try to find a better performing XML library

I have failed to find an alternative XML library at all. Moving on to parse the conntrack text manually.

@tomwilkie
Contributor

Huh, I thought codec did it, but it doesn't.

https://godoc.org/github.com/ugorji/go/codec


@2opremio 2opremio modified the milestones: December2016, EOY 2016 Dec 13, 2016
@2opremio
Contributor Author

Parsing the line-based output of conntrack is a pain in the ass because some (at least 4) fields are optional. This means I cannot simply use Scanf.

I started coding a manual parser for it, but I am not sure it's worth the effort compared to spending time on https://github.com/vishvananda/netlink
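For illustration only, here is a minimal sketch of the kind of manual parser described above. It splits the line on whitespace and classifies each token by its shape, so a missing optional field does not shift the ones after it (which is what breaks Scanf). The Flow type, the parseLine name, and the sample line layout are assumptions made for this example, not code from Scope or from conntrack itself.

```go
package main

import (
	"fmt"
	"strings"
)

// Flow is a hypothetical, simplified view of one conntrack entry.
type Flow struct {
	Proto    string
	State    string            // optional, e.g. ESTABLISHED for tcp
	Flags    []string          // optional bracketed markers, e.g. ASSURED
	Original map[string]string // src/dst/sport/dport of the original direction
	Reply    map[string]string // src/dst/sport/dport of the reply direction
	Extra    map[string]string // any other key=value field
}

// parseLine tokenizes on whitespace and classifies each token by shape,
// so absent optional fields do not affect the fields that follow.
func parseLine(line string) (Flow, error) {
	fields := strings.Fields(line)
	if len(fields) < 4 {
		return Flow{}, fmt.Errorf("too few fields in %q", line)
	}
	f := Flow{
		Proto:    fields[0],
		Original: map[string]string{},
		Reply:    map[string]string{},
		Extra:    map[string]string{},
	}
	tuple := f.Original
	// Assumed layout: protocol name, protocol number, timeout, then the rest.
	for _, tok := range fields[3:] {
		if strings.HasPrefix(tok, "[") && strings.HasSuffix(tok, "]") {
			f.Flags = append(f.Flags, strings.Trim(tok, "[]"))
			continue
		}
		i := strings.IndexByte(tok, '=')
		if i < 0 {
			f.State = tok // a bare word is treated as the connection state
			continue
		}
		key, val := tok[:i], tok[i+1:]
		switch key {
		case "src", "dst", "sport", "dport":
			if _, seen := tuple[key]; seen {
				tuple = f.Reply // a repeated key starts the reply tuple
			}
			tuple[key] = val
		default:
			f.Extra[key] = val
		}
	}
	return f, nil
}

func main() {
	// Sample line written for this example; real output may differ.
	line := "tcp      6 431999 ESTABLISHED src=10.0.0.1 dst=10.0.0.2 sport=34567 dport=80 " +
		"src=10.0.0.2 dst=10.0.0.1 sport=80 dport=34567 [ASSURED] mark=0 use=1"
	f, err := parseLine(line)
	if err != nil {
		panic(err)
	}
	fmt.Println(f.Proto, f.State, f.Original["src"], f.Reply["src"], f.Flags)
}
```

This only covers the common tuple keys. Per-direction counters or ICMP-specific fields would land in Extra and need more care in a real parser.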

@tomwilkie
Contributor

tomwilkie commented Dec 15, 2016 via email

@rade
Member

rade commented Dec 15, 2016

Do you know what bit of the xml parsing is the overhead?

IIRC string allocation/gc is the issue, due to golang's lack of string interning / symbols, or a proper generational GC.

@tomwilkie
Contributor

tomwilkie commented Dec 15, 2016 via email

@tomwilkie
Contributor

tomwilkie commented Dec 15, 2016 via email

@rade
Member

rade commented Dec 15, 2016

As I understand it, for the xml parsing the string allocation/gc happens mainly due to tags and attribute keys, e.g. if you have a document containing 500 <meta> tags, that's 500 allocations, and 500 items of garbage, not one.
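To make the interning idea concrete, here is a rough sketch of a string table that hands back one shared copy per distinct name instead of allocating a new string for every occurrence. The interner type and its intern method are hypothetical helpers written for this example. They are not part of encoding/xml, and nothing here measures how the standard decoder actually behaves.

```go
package main

import "fmt"

// interner is a hypothetical string table: one stored copy per distinct name.
type interner map[string]string

// intern returns the shared string equal to b, storing it on first sight.
func (in interner) intern(b []byte) string {
	// The Go compiler avoids copying b for a map lookup written as in[string(b)],
	// so a hit costs no string allocation.
	if s, ok := in[string(b)]; ok {
		return s
	}
	s := string(b) // a miss allocates once per distinct name
	in[s] = s
	return s
}

func main() {
	in := interner{}
	var last string
	for i := 0; i < 500; i++ {
		// Each iteration mimics a parser handing over the raw bytes of a tag name.
		last = in.intern([]byte("meta"))
	}
	fmt.Println(last, len(in))
}
```

With a table like this, 500 occurrences of the same tag name would share a single stored string. Whether that helps in practice depends on where the parser actually spends its time.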

@tomwilkie
Contributor

tomwilkie commented Dec 15, 2016 via email

@2opremio
Contributor Author

I could create a parser with something like https://github.com/prataprc/goparsec

Maybe I could also give https://github.com/moovweb/gokogiri a try

@2opremio
Contributor Author

Does https://github.com/lucsky/go-exml help?

It doesn't promise any performance improvements, and it sits on top of the standard Go encoding/xml Decoder, so I am doubtful.
