Graphite TCP input -- questions about performance #3741
I can take a look at this; it should be very easy.
@pauldix and I took a look at this code, and there doesn't actually appear to be an issue. Whether the code loops reading the buffer and injects a point into the batcher every read-loop, or splits the buffer, loops on that, and injects a point into the batcher, there should be no difference. For high performance, batching must be enabled, i.e. batch sizes greater than 1, ideally in the 100s or 1000s.
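For illustration, here is a minimal sketch of the batching technique being described -- accumulate parsed points and flush whenever the batch fills or a timeout fires. This is an assumed, simplified stand-in (the `Point` type and channel shapes are hypothetical), not InfluxDB's actual batcher API:

```go
package main

import (
	"fmt"
	"time"
)

// Point stands in for a parsed metric; the real code uses InfluxDB's
// own point type, so this is an assumption for illustration.
type Point struct {
	Name  string
	Value float64
}

// batcher collects points from in and emits slices of up to size points,
// flushing early if timeout elapses with a partial batch pending.
func batcher(in <-chan Point, out chan<- []Point, size int, timeout time.Duration) {
	batch := make([]Point, 0, size)
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	for {
		select {
		case p, ok := <-in:
			if !ok { // input closed: flush what's left and stop
				if len(batch) > 0 {
					out <- batch
				}
				close(out)
				return
			}
			batch = append(batch, p)
			if len(batch) == size {
				out <- batch
				batch = make([]Point, 0, size)
			}
		case <-timer.C:
			if len(batch) > 0 {
				out <- batch
				batch = make([]Point, 0, size)
			}
			timer.Reset(timeout)
		}
	}
}

func main() {
	in := make(chan Point)
	out := make(chan []Point)
	go batcher(in, out, 1000, time.Second)

	go func() {
		for i := 0; i < 2500; i++ {
			in <- Point{Name: "cpu", Value: float64(i)}
		}
		close(in)
	}()

	for b := range out {
		fmt.Printf("flushed batch of %d points\n", len(b))
	}
}
```

With a batch size of 1 every point pays the full write-path cost; with 100s or 1000s that cost is amortized across the whole batch.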
#3772 is to help understand how the system is behaving.
We don't believe there is an issue here, and I have confirmed that batching is at 1,000 by default.
@otoolep, @pauldix: it looks like the root cause is the way TCP connections are handled in the graphite plugin, not batching. If the TCP handler loops over a message line by line, and there is a large number of lines in the message, won't the TCP connection be blocked for the duration of line parsing? The batch size in the plugin does not matter; connection handling happens before messages get put into the batcher.

The blocked connections will time out and get put in a [fin/close]_wait state by the kernel, causing client connection failures, and, if clients auto-reconnect, eventually influxdb runs out of sockets, effectively DOSing itself until the kernel cleans up the waiting sockets. This does not happen on UDP, as there is only one UDP socket, which does not need closing.

To replicate, with the default batch size, the graphite plugin listening on TCP port 2013, a relay on port 2003 forwarding to influxdb (so you can see the connection failures), and the load test client:

$ python graphite_test_client.py -c 200 -d0 --port 2003 127.0.0.1 3600
# On another terminal, check connections to influxdb listener
$ netstat -an | grep 2013 | wc -l
30
# Connections in wait state. This increases as more connections are opened to replace the timed out ones.
$ netstat -an | grep 2013 | grep '_WAIT' | wc -l
81
$ netstat -an | grep 2013 | grep '_WAIT' | wc -l
97

The client sees connection failures, and more sockets go into wait states. Strace also shows connection timeouts on the influxdb process.
Hope this helps. I can raise a new ticket for this if you'd like.
Ok, thanks for the detailed analysis - we will take a look at this.
Much appreciated :)
We should have seen this earlier, as we hit this before with collectd. Thanks again @pkittenis.
Technically, decouple parsing from handling the socket, since there is no…
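For illustration, a rough sketch of what decoupling parsing from the socket read loop might look like -- one goroutine drains the connection line by line while another parses. `parseLine` is a hypothetical stand-in, not the plugin's actual function:

```go
package main

import (
	"bufio"
	"log"
	"net"
)

// parseLine is a hypothetical stand-in for the graphite line parser.
func parseLine(s string) { /* parse the line and hand it to the batcher */ }

// handleConn reads lines off the TCP connection as fast as possible and
// pushes them onto a buffered channel, so slow parsing does not block the
// socket read loop (and thus the client) while there is spare capacity.
func handleConn(conn net.Conn) {
	lines := make(chan string, 1024)

	// Parser goroutine: drains the channel independently of the socket.
	go func() {
		for line := range lines {
			parseLine(line)
		}
	}()

	scanner := bufio.NewScanner(conn)
	for scanner.Scan() {
		lines <- scanner.Text()
	}
	close(lines)
	conn.Close()
}

func main() {
	ln, err := net.Listen("tcp", ":2013")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Println(err)
			continue
		}
		go handleConn(conn)
	}
}
```

Note that this only helps while the channel has spare capacity; once the parser falls behind by more than the buffer, the read loop blocks again, which is consistent with decoupling alone yielding only a modest speed-up.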
@pkittenis -- is there any special way the relay needs to be run? Default config?
Related issues: …

I decoupled parsing from reading from the TCP connection and saw about a 10% speed-up. We can implement this, but after further analysis and discussion we're still not sure if this is really solving the problem, or if there actually is one. It should be noted that the code wraps the connection in…

My understanding of TCP is as follows; please correct me if I am wrong. If the server-side code processing a given TCP connection cannot keep up with the incoming traffic, then the client will see slow performance. But why would the system disconnect the connection just because upstream is not accepting the data as fast as the client would like to send it? The client will simply block. Connection failures have nothing to do with existing connections being serviced by the system. Furthermore, acceptance of new connections is completely decoupled by the code, because…

@pkittenis -- you mention "blocked connections will time out and get put in a [fin/close]_wait state" -- can you provide more references about this?

Also, the performance of TCP was contrasted with the performance of UDP. How was this measured? Are you sure every single UDP packet made it into the system? The kernel has buffers that may overflow if the UDP socket is not serviced sufficiently quickly. These drops can be detected using…

No matter what we do, there will be a limit to how fast the system can service a given TCP connection. At some point that limit will be reached. We can make some improvements (perhaps 10%), but they will not be huge. If I am missing something here, please let me know. The smoking gun here would be demonstrable evidence that UDP is working way faster -- sending exactly the same amount of data over both protocols, and having that data actually returned in queries so we know it all made it. @pkittenis -- did you see that?
| you mention "blocked connections will time out and get put in a [fin/close]_wait state" -- can you provide more references about this?

From Wikipedia:

| FIN-WAIT-1 …

Basically, the client sees the unresponsive connection as a sign that the server is… unresponsive, and closes it on the client side (a timeout setting on the client side). Since the connection is still open and being processed on the server side, it goes into a half-closed state. The kernel will forcefully close such connections after some time -- that time in seconds being the value of…

I agree this is probably just an effect of slow write throughput and large batches of metrics being sent by the client. There is not much the server can do if it's getting that much data and can't process it fast enough.

I did demonstrate UDP working faster than TCP on the mailing list, with a test client and relay. That was on 0.9.3-rc1; performance has improved across the board since then, however. You will find a full test setup with relay config at https://groups.google.com/forum/#!topic/influxdb/T6mF4_WeNkM

Still, a 10% speed-up is quite significant. Thanks for the investigation.
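As a small illustration of the mechanism described above (the client code and address are assumptions, not the actual test setup): a client that enforces its own write deadline gives up and closes its end while the server is still busy, leaving the server-side socket to linger in a wait state until the kernel reaps it:

```go
package main

import (
	"log"
	"net"
	"time"
)

func main() {
	conn, err := net.Dial("tcp", "127.0.0.1:2013")
	if err != nil {
		log.Fatal(err)
	}

	// If the server's read loop is stuck parsing, our writes stall once
	// the TCP windows fill; the deadline then fires and we bail out,
	// closing our end while the server still thinks it is mid-stream.
	conn.SetWriteDeadline(time.Now().Add(2 * time.Second))

	line := []byte("cpu.load 1.23 1440000000\n")
	for {
		if _, err := conn.Write(line); err != nil {
			log.Printf("write failed, giving up: %v", err) // e.g. i/o timeout
			break
		}
	}
	conn.Close() // the server side may now sit in a [fin/close]_wait state
}
```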
FWIW, I am still working on this issue. Right now I am adding stats and diags to the system, and specifically instrumenting the Graphite input first. See #3916 for more details. Once this is in place, I hope it will bring more quantitative details to the discussion about Graphite performance.
The UDP listener will take a batch of lines: https://github.com/influxdb/influxdb/blob/master/services/graphite/service.go#L213-L220

The TCP listener should do the same; otherwise its performance will be much poorer.
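A possible shape for that, mirroring the UDP path's split-and-parse behavior -- collect several lines from the connection before handing them off, rather than parsing one line per read. `handleBatch` is a hypothetical hand-off, not the actual service API:

```go
package main

import (
	"bufio"
	"net"
)

// handleBatch is a hypothetical hand-off to the parser/batcher; the real
// service would convert the lines to points here. It must finish with
// (or copy) the slice before readTCP reuses it.
func handleBatch(lines []string) { /* parse and batch the lines */ }

// readTCP accumulates up to batchSize lines from the connection before
// handing them off, mirroring how the UDP listener splits one datagram
// into many lines and processes them together.
func readTCP(conn net.Conn, batchSize int) {
	defer conn.Close()

	scanner := bufio.NewScanner(conn)
	batch := make([]string, 0, batchSize)
	for scanner.Scan() {
		batch = append(batch, scanner.Text())
		if len(batch) == batchSize {
			handleBatch(batch)
			batch = batch[:0] // reuse the backing array for the next batch
		}
	}
	if len(batch) > 0 {
		handleBatch(batch) // flush the remainder on EOF/close
	}
}
```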