-
Notifications
You must be signed in to change notification settings - Fork 3
/
README
44 lines (27 loc) · 2.29 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
Update: we stopped using this. varnishncsa sucks.
This is better: https://github.com/jib/libvmod-statsd
This is how we (at Krux) get varnish requests/s, hit / miss, and response time metrics in graphite.
The script reads from a logfile (argument #1), using pyinotify.
Via inotify, it supports both truncation and log rotation by means of rename + re-creation of the file.
For every file IN_MODIFY event, the script reads the varnishncsa log, and sends a metric.
FAQ: Q: why not use Etsy's Logster? A: it doesn't use inotify, and this was very important to us, due to dealing with > 15K req/s.
Some things are pretty specific, like the use of kruxstatsd and our metrics layout - adjust to taste.
Options:
-h, --help show this help message and exit
--cluster=CLUSTER_NAME
The cluster_name stats will be grouped in
--environment=ENVIRONMENT
The (krux) environment the script is running in
--stdin Instead of tailing a log file, read from stdin
Usage:
Run a varnishncsa daemon outputting the data this script expects:
/usr/bin/varnishncsa -a -w /data/var/log/varnish/statsd_requests.log -F %U %{Varnish:time_firstbyte}x %{X-Request-Backend}o %{Varnish:hitmiss}x
Run the script (via supervisor):
python varnish_statsd_send.py --cluster=apiservices-a --environment=prod /data/var/log/varnish/statsd_requests.log
Requires:
pyinotify
kruxstatsd (or implement your own mechanism for sending to statsd/graphite in the run() method.. see send_graphite.py for an example)
Notes:
varnishncsa won't log every request, if they are served by pipelining (in varnish, I mean, where it supresses multiple backend requests for the same HTTP pipelined request). This sucks, when using 'ab' with a high concurrency to test.. but in production, we never see the same exact request from the same HTTP client pipelined, so it's really not a big deal. Something to be aware of, though.
TODO:
currently at log rotation, we (correctly) catch-up with missed events based upon the file handle cursor. But this leads to big spikes in graphs, at high traffic rates. Once varnish supports strftime() formatting (support is in master now, but not in a release), we should start logging that too, and send catch-up events directly to graphite with the correct timestamp.