tailer (tail ‐f)
One of the basic components of `grok_exporter` is a file tailer, which is something like `tail -f`, except that it can also deal with `logrotate`. It took a while to get this right, so I would like to share my findings here.
### Logrotate
Logrotate is a tool that automatically archives old log data so that log files don't grow indefinitely. Depending on its configuration, logrotate handles the files in different ways. The following shell commands simulate two possible logrotate configurations:
Move the old file and create a new one.

    mv logfile logfile.1 && echo > logfile
    echo 'next log line' >> logfile

Copy the old file and truncate the original. This has the effect that the original file is never deleted, which is good if programs keep the logfile open while logging.

    cp logfile logfile.1 && :> logfile
    echo 'next log line' >> logfile
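From the tailer's point of view, the two configurations look different: after a move, the path `logfile` refers to a new file, while after a copy-and-truncate it is still the same file, only shorter. The following Go sketch illustrates this with `os.SameFile` and a size comparison. The names `classify`, `simulate`, `moveAndCreate`, and `copyAndTruncate` are made up for this illustration and are not taken from grok_exporter, and the sketch assumes a Unix-like file system.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// rotationKind describes what happened to the file at a path since it was last inspected.
type rotationKind string

const (
	unchanged rotationKind = "unchanged"
	moved     rotationKind = "moved"     // the path now refers to a different file
	truncated rotationKind = "truncated" // same file, but shorter than before
)

// classify compares an earlier os.FileInfo with the current state of path.
func classify(before os.FileInfo, path string) (rotationKind, error) {
	now, err := os.Stat(path)
	if err != nil {
		return "", err
	}
	if !os.SameFile(before, now) {
		return moved, nil
	}
	if now.Size() < before.Size() {
		return truncated, nil
	}
	return unchanged, nil
}

// moveAndCreate mimics: mv logfile logfile.1 && echo > logfile
func moveAndCreate(path string) error {
	if err := os.Rename(path, path+".1"); err != nil {
		return err
	}
	return os.WriteFile(path, []byte("\n"), 0o644)
}

// copyAndTruncate mimics: cp logfile logfile.1 && :> logfile
func copyAndTruncate(path string) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	if err := os.WriteFile(path+".1", data, 0o644); err != nil {
		return err
	}
	return os.Truncate(path, 0)
}

// simulate writes a logfile in a fresh temporary directory, applies rotate,
// and reports how the rotation looks when inspecting the original path.
func simulate(rotate func(path string) error) (rotationKind, error) {
	dir, err := os.MkdirTemp("", "rotation")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "logfile")
	if err := os.WriteFile(path, []byte("first log line\n"), 0o644); err != nil {
		return "", err
	}
	before, err := os.Stat(path)
	if err != nil {
		return "", err
	}
	if err := rotate(path); err != nil {
		return "", err
	}
	return classify(before, path)
}

func main() {
	for _, rotate := range []func(string) error{moveAndCreate, copyAndTruncate} {
		kind, err := simulate(rotate)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(kind)
	}
}
```

On Linux or macOS this should report `moved` for the first simulation and `truncated` for the second. A real tailer cannot rely on such after-the-fact checks alone, which is what the options below are about.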
Option 1 for implementing a logrotate-aware file tailer is to continuously poll for new log lines. This is what filebeat does. Filebeat's polling interval is configurable with the `backoff` and `max_backoff` configuration options.
However, if a line is written and the logfile is rotated before the next poll, the line will be lost. Therefore, `grok_exporter` does not implement Option 1.
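The lost-line problem can be reproduced with a very small poller. The Go sketch below is only an illustration under simplifying assumptions: it reopens the file by path on every poll and treats a shrinking file size as a truncation. It is not how filebeat is implemented, and the names `pollOnce`, `appendLine`, `copyTruncate`, and `lostLineDemo` are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// pollOnce returns the complete lines found in path after offset, plus the new offset.
// If the file is shorter than offset, it assumes a truncation and starts from the beginning.
func pollOnce(path string, offset int64) ([]string, int64, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, offset, err
	}
	defer f.Close()

	info, err := f.Stat()
	if err != nil {
		return nil, offset, err
	}
	if info.Size() < offset {
		offset = 0
	}
	if _, err := f.Seek(offset, io.SeekStart); err != nil {
		return nil, offset, err
	}

	reader := bufio.NewReader(f)
	var lines []string
	for {
		line, err := reader.ReadString('\n')
		if err == io.EOF {
			// A trailing partial line is left for the next poll.
			return lines, offset, nil
		}
		if err != nil {
			return lines, offset, err
		}
		lines = append(lines, strings.TrimSuffix(line, "\n"))
		offset += int64(len(line))
	}
}

// appendLine mimics: echo 'line' >> path
func appendLine(path, line string) error {
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = f.WriteString(line + "\n")
	return err
}

// copyTruncate mimics: cp path path.1 && :> path
func copyTruncate(path string) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	if err := os.WriteFile(path+".1", data, 0o644); err != nil {
		return err
	}
	return os.Truncate(path, 0)
}

// lostLineDemo returns the lines a poller sees when a rotation happens between two polls.
func lostLineDemo() ([]string, error) {
	dir, err := os.MkdirTemp("", "polling")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "logfile")

	var seen []string
	var offset int64
	poll := func() error {
		lines, next, err := pollOnce(path, offset)
		seen, offset = append(seen, lines...), next
		return err
	}
	steps := []func() error{
		func() error { return appendLine(path, "first log line") },
		poll,
		func() error { return appendLine(path, "second log line") }, // written between two polls
		func() error { return copyTruncate(path) },                  // rotated before the next poll
		func() error { return appendLine(path, "next log line") },
		poll,
	}
	for _, step := range steps {
		if err := step(); err != nil {
			return nil, err
		}
	}
	return seen, nil
}

func main() {
	seen, err := lostLineDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", seen)
}
```

The poller sees `first log line` and `next log line`, but never `second log line`, which only exists in `logfile.1` by the time of the second poll. A shorter polling interval makes this less likely, but does not rule it out.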
All operating systems provide some way for programs to subscribe to file system events. Using file system events, we can avoid unnecessary polling, and we can be sure that we don't miss anything if `logrotate` runs immediately after a line has been logged.
Option 2 is to use fsnotify, a Go library providing a unified event API across the most common operating systems (Linux, BSD/macOS, Windows). This is what mtail uses.
However, it turns out that the sequence of events provided by fsnotify is not independent of the operating system when it comes to corner cases:
- When logrotate does something like `mv logfile logfile.1 && echo 'next log line' > logfile`, the WRITE event may be lost on BSD operating systems (this is due to a race condition in the way kqueue is used to simulate recursive directory watches).
- When logrotate truncates a file instead of removing it, the truncation results in different fsnotify events on different operating systems.
- Some underlying implementations keep the logfile open when monitoring the file. However, we cannot use the open file handles, because other underlying implementations don't keep open files. As a result, we open the watched logfile twice on some operating systems (like BSD), which works, but is not nice.
After a lot of debugging, we concluded that interpreting the fsnotify events correctly for each operating system is about as hard as going with Option 3. Therefore, `grok_exporter` does not use fsnotify.
BTW: mtail seems to have its focus on Linux, in which case it doesn't really have these problems.
While debugging the behaviour of fsnotify, we found out that using the underlying system calls directly is actually not as hard as it sounds, especially as we only want to watch a single file. This is Option 3: the current implementation has one file tailer for each operating system:
- fileTailer_darwin.go is the BSD / macOS implementation, using the kqueue system call.
- fileTailer_linux.go is the Linux implementation, using the inotify system calls.
- fileTailer_windows.go is the Windows implementation using Windows system calls that are fortunately wrapped by the nice winfsnotify library.
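To give an impression of what working directly with these system calls looks like, here is a minimal Linux-only sketch using the inotify wrappers in Go's standard `syscall` package. It is not the code from fileTailer_linux.go. It assumes that watching the parent directory and filtering by file name is acceptable, and the names `eventsFor`, `rotateDemo`, `contains`, and `maskNames` are made up for this example.

```go
//go:build linux

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"syscall"
	"unsafe"
)

// maskNames maps the inotify bits this sketch subscribes to onto readable labels.
var maskNames = []struct {
	bit  uint32
	name string
}{
	{syscall.IN_CREATE, "CREATE"},
	{syscall.IN_MODIFY, "MODIFY"},
	{syscall.IN_MOVED_FROM, "MOVED_FROM"},
	{syscall.IN_DELETE, "DELETE"},
}

// eventsFor watches dir, runs action, and returns the labels of the events
// that inotify queued for the directory entry called name.
func eventsFor(dir, name string, action func() error) ([]string, error) {
	fd, err := syscall.InotifyInit()
	if err != nil {
		return nil, err
	}
	defer syscall.Close(fd)

	var mask uint32
	for _, m := range maskNames {
		mask |= m.bit
	}
	if _, err := syscall.InotifyAddWatch(fd, dir, mask); err != nil {
		return nil, err
	}
	if err := action(); err != nil {
		return nil, err
	}

	// A single read returns the queued events that fit into the buffer.
	// Note that it would block if action had not caused any event.
	buf := make([]byte, 4096)
	n, err := syscall.Read(fd, buf)
	if err != nil {
		return nil, err
	}

	var labels []string
	for off := 0; off+syscall.SizeofInotifyEvent <= n; {
		ev := (*syscall.InotifyEvent)(unsafe.Pointer(&buf[off]))
		end := off + syscall.SizeofInotifyEvent + int(ev.Len)
		// The entry name follows the fixed-size header and is padded with NUL bytes.
		entry := strings.TrimRight(string(buf[off+syscall.SizeofInotifyEvent:end]), "\x00")
		if entry == name {
			for _, m := range maskNames {
				if ev.Mask&m.bit != 0 {
					labels = append(labels, m.name)
				}
			}
		}
		off = end
	}
	return labels, nil
}

// rotateDemo simulates one of the two logrotate styles in a temporary directory
// and returns the events observed for "logfile".
func rotateDemo(copyTruncate bool) ([]string, error) {
	dir, err := os.MkdirTemp("", "inotify")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "logfile")
	if err := os.WriteFile(path, []byte("first log line\n"), 0o644); err != nil {
		return nil, err
	}
	return eventsFor(dir, "logfile", func() error {
		if copyTruncate {
			data, err := os.ReadFile(path)
			if err != nil {
				return err
			}
			if err := os.WriteFile(path+".1", data, 0o644); err != nil {
				return err
			}
			return os.Truncate(path, 0)
		}
		if err := os.Rename(path, path+".1"); err != nil {
			return err
		}
		return os.WriteFile(path, []byte("next log line\n"), 0o644)
	})
}

// contains reports whether labels includes want.
func contains(labels []string, want string) bool {
	for _, l := range labels {
		if l == want {
			return true
		}
	}
	return false
}

func main() {
	for _, copyTruncate := range []bool{false, true} {
		labels, err := rotateDemo(copyTruncate)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("copyTruncate=%v contains MODIFY=%v: %v\n", copyTruncate, contains(labels, "MODIFY"), labels)
	}
}
```

In the move case, the events for `logfile` should include `MOVED_FROM` and `CREATE`, whereas the copy-and-truncate case should only show modifications. A tailer has to treat the two sequences differently: reopen the file in the first case, seek back to the start in the second.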
`grok_exporter` implements Option 3.
Implementing `tail -f` is a lot harder than it seems, and if you want to do it correctly, you need to implement it in an operating-system-specific way.