Collects logs from servers, parses them into a common schema, and stores them in a given storage engine.
- We expect users to follow best practices for logging in their applications.
- Most importantly, do structured logging, since parsing/formatting logs is much easier that way.
- `files`: Log files which are being tracked by logagg.
- `node`: The server(s) where the log `files` reside.
- `collector`: A program that runs on each `node` to collect and parse log lines in the `files`.
- `formatters`: The parser functions that the `collector` uses to format the log lines into the common format.
- `nsq`: The central location where logs are sent by `collector`(s), as messages, after formatting.
- `forwarder`: The program that runs on the central node which receives messages from `nsq` and passes them on to the `forwarders`.
- `forwarders`: The parsers that take messages and format them for storing in the `target`(s) databases.
- `targets`: The databases that finally store the logs, so that we can query them easily.
- Guaranteed delivery of each log line from `files` to `targets`.
- Reduced latency between a log being generated and it being present in the `targets`.
- Options to add custom `formatters` & `target` databases.
- File poll if log file not yet generated.
- Works on rotational log files.
- Custom `formatters` to support parsing of any log file.
- Custom `forwarders` to support usage of any database/storage engine.

Output format of processed log lines (a dictionary; an illustrative example follows the list below):
- `id` (str) - A unique id per log with time ordering. Useful to avoid storing duplicates.
- `timestamp` (str) - ISO format time, e.g. 2017-08-01T07:32:24.183981Z.
- `data` (dict) - Parsed log data.
- `raw` (str) - Raw log line read from the log file.
- `host` (str) - Hostname of the node where this log was generated.
- `formatter` (str) - Name of the formatter that processed the raw log line.
- `file` (str) - Full path of the file on the host where this log line came from.
- `type` (str) - One of "log", "metric".
- `level` (str) - Log level of the log line.
- `event` (str) - Log event.
- `error` (bool) - True if the collection handler failed during processing.
- `error_tb` (str) - Error traceback.
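For illustration, a processed log line might look like this (the values here are made up, shaped after the record format shown later in this document):
```
{
    "id": "20180301T065838_f7e042841d1d11e8bcf1000c2925b24d",
    "timestamp": "2018-03-01T06:58:38.183981Z",
    "data": {"event": "system_metrics"},
    "raw": "2018-03-01T06:58:38.183981Z [info ] system_metrics ...",
    "host": "x.com",
    "formatter": "logagg.formatters.basescript",
    "file": "/var/log/serverstats.log",
    "type": "metric",
    "level": "info",
    "event": "system_metrics",
    "error": false,
    "error_tb": ""
}
```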
Prerequisites: Python 2.7

Install the `nsq` package on the node where we need to bring up the `nsq` server.
- Run the following commands to install `nsq`:
```
$ sudo apt-get install libsnappy-dev
$ wget https://s3.amazonaws.com/bitly-downloads/nsq/nsq-1.0.0-compat.linux-amd64.go1.8.tar.gz
$ tar zxvf nsq-1.0.0-compat.linux-amd64.go1.8.tar.gz
$ sudo cp nsq-1.0.0-compat.linux-amd64.go1.8/bin/* /usr/local/bin
```
Install the Docker package on both the `forwarder` and `collector` nodes (if you will be using the Docker image to run logagg).
- Run the following commands to install Docker:
```
$ sudo apt-get update
$ sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    software-properties-common
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
$ sudo apt-get update
$ sudo apt-get install docker-ce
```
- Check that the Docker version is >= 17.12.1:
```
$ sudo docker -v
Docker version 18.03.1-ce, build 9ee9f40
```
- Run the following commands to pip install `logagg`:
```
$ sudo pip install https://github.com/deep-compute/pygtail/tarball/master/#egg=pygtail-0.6.1
$ sudo pip install logagg
```
- Run the following command to pull the docker image of `logagg`:
```
$ sudo docker pull deepcompute/logagg
```
- NOTE: Run each command in a separate terminal window.
- `nsqlookupd`:
```
$ nsqlookupd
```
- `nsqd`:
```
$ nsqd -lookupd-tcp-address localhost:4160
```
- `nsqadmin`:
```
$ nsqadmin -lookupd-http-address localhost:4161
```
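Optionally, you can check that the daemons came up by hitting their HTTP `/ping` endpoints (assuming the default ports used above):
```
$ curl http://localhost:4151/ping    # nsqd
OK
$ curl http://localhost:4161/ping    # nsqlookupd
OK
```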
NOTE: Run each command in a separate terminal window.
- We will use `serverstats` as an example application whose logs we will collect.
  - Install `serverstats`:
```
$ sudo pip install serverstats
```
  - Run `serverstats` to write logs to a file which will be tracked by `logagg collect`:
```
$ sudo serverstats --log-file /var/log/serverstats.log run
2018-03-01T06:57:00.709472Z [info ] system_metrics _={'ln': 113, 'file': '/usr/local/lib/python2.7/dist-packages/serverstats/serverstats.py', 'name': 'serverstats.serverstats', 'fn': '_log_system_metrics'} cpu={'avg_load_5_min': 15.0, 'avg_load_15_min': 0.11, 'idle_percent': 89.0, 'iowait': 34.34, 'avg_load_1_min': 23.0, 'usage_percent': 11.0} disk={'usage': 6000046080, 'total': 41083600896, 'free_percent': 80.25838230555476, 'usage_percent': 15.4, 'free': 32973033472} id=20180301T065700_bd9ad0bc1d1d11e8bcf1000c2925b24d network_traffic={'lo': {'received': 93836, 'sent': 93836}, 'docker0': {'received': 0, 'sent': 0}, 'ens33': {'received': 268122175, 'sent': 3999917}} ram={'avail': 724705280, 'usage_percent': 59.5, 'avail_percent': 40.49408598212978, 'usage': 883863552, 'total': 1789657088, 'free': 120479744} swap={'usage': 11022336, 'total': 1071640576, 'free_percent': 98.97145215972112, 'free': 1060618240, 'usage_percent': 1.0} type=metric
2018-03-01T06:57:05.719910Z [info ] system_metrics _={'ln': 113, 'file': '/usr/local/lib/python2.7/dist-packages/serverstats/serverstats.py', 'name': 'serverstats.serverstats', 'fn': '_log_system_metrics'} cpu={'avg_load_5_min': 15.0, 'avg_load_15_min': 0.11, 'idle_percent': 89.0, 'iowait': 34.34, 'avg_load_1_min': 21.0, 'usage_percent': 11.0} disk={'usage': 6000046080, 'total': 41083600896, 'free_percent': 80.25838230555476, 'usage_percent': 15.4, 'free': 32973033472} id=20180301T065705_c09761401d1d11e8bcf1000c2925b24d network_traffic={'lo': {'received': 93836, 'sent': 93836}, 'docker0': {'received': 0, 'sent': 0}, 'ens33': {'received': 268122175, 'sent': 3999917}} ram={'avail': 724721664, 'usage_percent': 59.5, 'avail_percent': 40.49500146477223, 'usage': 883859456, 'total': 1789657088, 'free': 120479744} swap={'usage': 11022336, 'total': 1071640576, 'free_percent': 98.97145215972112, 'free': 1060618240, 'usage_percent': 1.0} type=metric
```
- Run the `logagg collect` command:
  - Normal run:
```
$ sudo logagg collect --file file=/var/log/serverstats.log:formatter=logagg.formatters.basescript --nsqtopic logagg --nsqd-http-address localhost:4151
2018-03-01T08:59:25.768443Z [info ] Created topic _={'ln': 33, 'file': '/usr/local/lib/python2.7/dist-packages/logagg/nsqsender.py', 'name': 'logagg.nsqsender', 'fn': '_ensure_topic'} id=20180301T085925_d799dd6c-1d2e-11e8-bcf1-000c2925b24d topic=logagg type=log
2018-03-01T08:59:25.771411Z [info ] Created topic _={'ln': 33, 'file': '/usr/local/lib/python2.7/dist-packages/logagg/nsqsender.py', 'name': 'logagg.nsqsender', 'fn': '_ensure_topic'} id=20180301T085925_d799dd6d-1d2e-11e8-bcf1-000c2925b24d topic=Heartbeat#ephemeral type=log
2018-03-01T08:59:25.772415Z [info ] found_formatter_fn _={'ln': 208, 'file': '/usr/local/lib/python2.7/dist-packages/logagg/collector.py', 'name': 'logagg.collector', 'fn': '_scan_fpatterns'} fn=logagg.formatters.basescript id=20180301T085925_d79a74c0-1d2e-11e8-bcf1-000c2925b24d type=log
2018-03-01T08:59:25.772980Z [info ] found_log_file _={'ln': 216, 'file': '/usr/local/lib/python2.7/dist-packages/logagg/collector.py', 'name': 'logagg.collector', 'fn': '_scan_fpatterns'} id=20180301T085925_d79a74c1-1d2e-11e8-bcf1-000c2925b24d log_file=/var/log/serverstats.log type=log
2018-03-01T08:59:25.773873Z [info ] Started collect_log_lines thread _={'ln': 223, 'file': '/usr/local/lib/python2.7/dist-packages/logagg/collector.py', 'name': 'logagg.collector', 'fn': '_scan_fpatterns'} id=20180301T085925_d79a74c2-1d2e-11e8-bcf1-000c2925b24d log_key=('/var/log/serverstats.log', '/var/log/serverstats.log', 'logagg.formatters.basescript') type=log
```
  - Docker run:
```
$ sudo docker run --name collector --hostname $HOSTNAME --volume /var/log:/var/log deepcompute/logagg logagg collect --file file=/var/log/serverstats.log:formatter=logagg.formatters.basescript --nsqtopic logagg --nsqd-http-address <nsq-server-ip-or-DNS>:4151
```
    - Note: Replace `<nsq-server-ip-or-DNS>` with the IP of the `nsq` server, e.g. 192.168.0.211.
    - Note: The --volume argument mounts the local directory of the log file into the Docker container.
    - Note: The --hostname argument makes the container use the node's hostname instead of the Docker container hostname.
- You can check message traffic at `nsq` by visiting the nsqadmin UI at `http://<nsq-server-ip-or-DNS>:4171/` (for localhost: `http://localhost:4171/`).
- You can see the collected logs in realtime using the following command:
```
$ nsq_tail --topic=logagg --channel=test --lookupd-http-address=<nsq-server-ip-or-DNS>:4161
```
- We will use `mongoDB` as the `target` database.
  - Install `mongoDB`.
  - Start `mongoDB`:
```
$ sudo mongod --dbpath <database-path> --bind_ip_all
```
  - Create a user for `mongoDB` using the following commands:
```
$ mongo
.
.
2018-03-01T03:47:54.027-0800 I CONTROL  [initandlisten]
> use admin
> db.createUser(
...   {
...     user: "deadpool",
...     pwd: "chimichanga",
...     roles: [ "readWrite", "dbAdmin" ]
...   }
... )
Successfully added user: { "user" : "deadpool", "roles" : [ "readWrite", "dbAdmin" ] }
```
- Run the `logagg forward` command:
  - Normal run:
```
$ logagg forward --nsqtopic logagg --nsqchannel test --nsqd-tcp-address localhost:4150 --target forwarder=logagg.forwarders.MongoDBForwarder:host=localhost:port=27017:user=deadpool:password=chimichanga:db=logs:collection=cluster_logs_and_metrics
```
  - Docker run:
```
$ sudo docker run --name forwarder deepcompute/logagg logagg forward --nsqtopic logagg --nsqchannel test --nsqd-tcp-address <nsq-server-ip-or-DNS>:4150 --target forwarder=logagg.forwarders.MongoDBForwarder:host=<mongoDB-server-ip-or-DNS>:port=27017:user=deadpool:password=chimichanga:db=logs:collection=cluster_logs_and_metrics
```
    - NOTE: Replace `<nsq-server-ip-or-DNS>` with the IP of the `nsq` server.
    - NOTE: Replace `<mongoDB-server-ip-or-DNS>` with the IP of the `mongoDB` server, e.g. 192.168.0.111.
- You can check records in `mongoDB`:
```
$ mongo -u deadpool -p chimichanga
....
....
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
logs    0.003GB
> use logs
switched to db logs
> show collections
cluster_logs_and_metrics
> db.cluster_logs_and_metrics.count()
5219
> db.cluster_logs_and_metrics.findOne()
```
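- If you prefer to inspect the stored records programmatically, here is a minimal sketch using `pymongo` (an assumption: pymongo is installed separately, it is not part of logagg) with the credentials, database, and collection names used in this guide:
```
# Minimal sketch, not part of logagg itself: read back the records the
# forwarder stored in MongoDB. The user was created in the admin database
# above, hence authSource=admin.
from pymongo import MongoClient

client = MongoClient('mongodb://deadpool:chimichanga@localhost:27017/?authSource=admin')
coll = client['logs']['cluster_logs_and_metrics']

print(coll.count_documents({}))            # total stored log lines (pymongo >= 3.7)
for doc in coll.find({'type': 'metric'}).limit(3):
    print(doc['timestamp'], doc['formatter'], doc['file'])
```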
- For `logagg`:
```
$ logagg --help
```
  or with docker:
```
$ sudo docker run deepcompute/logagg logagg --help
```
  You should see something like:
```
usage: logagg [-h] [--name NAME] [--log-level LOG_LEVEL]
              [--log-format {json,pretty}] [--log-file LOG_FILE] [--quiet]
              {collect,forward,run} ...

Logagg command line tool

optional arguments:
  -h, --help            show this help message and exit
  --name NAME           Name to identify this instance
  --log-level LOG_LEVEL
                        Logging level as picked from the logging module
  --log-format {json,pretty}
                        Force the format of the logs. By default, if the
                        command is from a terminal, print colorful logs.
                        Otherwise print json.
  --log-file LOG_FILE   Writes logs to log file if specified, default: None
  --quiet               if true, does not print logs to stderr, default: False

commands:
  {collect,forward,run}
    collect             Collects the logs from different processes and sends to nsq
    forward             Collects all the messages from nsq and pushes to storage engine
```
- For `logagg collect`:
```
$ logagg collect --help
```
  or with docker:
```
$ sudo docker run deepcompute/logagg logagg collect -h
```
- For `logagg forward`:
```
$ logagg forward --help
```
  or with docker:
```
$ sudo docker run deepcompute/logagg logagg forward -h
```
- After installation of the logagg module through pip, we can perform operations in the Python shell:
```
$ python
>>> import logagg
>>> dir(logagg)
['LogCollector', 'LogForwarder', 'NSQSender', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', 'collector', 'command', 'formatters', 'forwarder', 'forwarders', 'main', 'nsqsender', 'util']
>>> dir(logagg.formatters)
['RawLog', '__builtins__', '__doc__', '__file__', '__name__', '__package__', 'basescript', 'convert_str2int', 'datetime', 'django', 'docker_log_file_driver', 'elasticsearch', 'json', 'mongodb', 'nginx_access', 're']
>>> from pprint import pprint
>>> mongo_line = '2017-08-17T07:56:33.489+0200 I REPL [signalProcessingThread] shutting down replication subsystems'
>>> pprint(logagg.formatters.mongodb(mongo_line))
{'data': {'component': 'REPL',
          'context': '[signalProcessingThread]',
          'message': 'shutting down replication subsystems',
          'severity': 'I',
          'timestamp': '2017-08-17T07:56:33.489+0200'},
 'timestamp': '2017-08-17T07:56:33.489+0200'}
```
- Connect to the mongo shell and perform queries:
```
> use database_name
> db.collection_name.find({'formatter': 'logagg.formatters.<formatter_name>'})
```
- You can see the basic format of a record like below:
```
{
    "_id" : "20180301T065838_f7e042841d1d11e8bcf1000c2925b24d",
    "level" : "info",
    "timestamp" : "isoformat_time. Ex: 2017-08-01T07:32:24.183981Z",
    "data" : {},
    "raw" : "raw_log_line",
    "host" : "x.com",
    "formatter" : "logagg.formatters.basescript",
    "event" : "default_event",
    "file" : "/path/to/log/file",
    "type" : "log | metric"
}
```
- An arbitrary example to get the records for `nginx`:
```
> use nginx
> db.cluster_logs_and_metrics.find({'formatter': 'logagg.formatters.nginx_access', 'data.request_time' : {$gt: 0}}).count()
751438
> db.cluster_logs_and_metrics.find({'formatter': 'logagg.formatters.nginx_access', 'data.request_time' : {$gt: 60}}).count()
181
```
- Install `influxDB`:
```
$ curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -
$ source /etc/lsb-release
$ echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
$ sudo apt-get update && sudo apt-get install influxdb
$ sudo service influxdb start
```
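The queries in the next step are run from the InfluxDB shell; on a default local install you can open it with the `influx` client, which connects to http://localhost:8086 by default (adjust host/port if your InfluxDB runs elsewhere):
```
$ influx
```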
- For metrics, connect to the InfluxDB shell and perform queries:
```
> use database_name
> show measurements
> select <field_key> from <measurement_name>
```
- An arbitrary example to get the metrics for a `measurement`:
```
> use nginx
> select request_time from nginx_metric limit 10
```
- You should see something like below:
```
time                 request_time
----                 ------------
1508770751000000000  0.027
1508770751000000000  0.026
1508770753000000000  0.272
1508770754000000000  0.028
```
Formatter-name | Comments |
---|---|
nginx_access | See Configuration here |
mongodb | |
basescript | |
docker_log_file_driver | See example here |
Forwarder-name | Sample command |
---|---|
MongoDBForwarder | --target forwarder=logagg.forwarders.MongoDBForwarder:host=<mongoDB-server-ip>:port=<mongod-port-number>:user=<user-name>:password=<passwd>:db=<db-name>:collection=<collection name> |
InfluxDBForwarder | --target forwarder=logagg.forwarders.InfluxDBForwarder:host=<influxDB-server-ip>:port=<influxd-port-number>:user=<user-name>:password=<passwd>:db=<db-name>:collection=nothing |
Rules to follow when using InfluxDBForwarder (a small sketch of these rules follows the list):
- Only logs with `"type":"metric"` go to InfluxDB.
- Key-value pairs where the value is a string become a `tag` in the InfluxDB database, and values that are numerical are considered `fields`. See the InfluxDB documentation for more information on tags and fields.
- Naming a key with a single '_' at the beginning forces it to be a field even if its value is a string.
- Naming a key with a double '__' at the beginning means it is not forwarded to InfluxDB at all.
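As a small illustration (this is not logagg's actual code), the rules above would classify the keys of a metric's `data` dict along these lines:
```
# Illustrative only: apply the tag/field rules listed above to a metric's data dict.
def split_tags_and_fields(data):
    tags, fields = {}, {}
    for key, value in data.items():
        if key.startswith('__'):
            continue                        # double underscore: not forwarded at all
        elif key.startswith('_'):
            fields[key] = value             # single underscore: forced to be a field
        elif isinstance(value, str):
            tags[key] = value               # string values become tags
        else:
            fields[key] = value             # numerical values become fields
    return tags, fields

tags, fields = split_tags_and_fields({
    'host': 'x.com',                        # string -> tag
    'usage_percent': 59.5,                  # number -> field
    '_build': '1.2.3',                      # forced field despite being a string
    '__debug_note': 'dropped',              # never forwarded
})
print(tags)    # {'host': 'x.com'}
print(fields)  # {'usage_percent': 59.5, '_build': '1.2.3'}
```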
Note: To use multiple forwarders, use the format `--target <forwarder1> <forwarder2>` and not `--target <forwarder1> --target <forwarder2>`.
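For example, a single forward command writing to both MongoDB and InfluxDB could look like the following sketch, assembled from the sample commands in the table above (the InfluxDB port 8086 and the `metrics` db name here are illustrative assumptions):
```
$ logagg forward --nsqtopic logagg --nsqchannel test --nsqd-tcp-address localhost:4150 \
    --target forwarder=logagg.forwarders.MongoDBForwarder:host=localhost:port=27017:user=deadpool:password=chimichanga:db=logs:collection=cluster_logs_and_metrics \
             forwarder=logagg.forwarders.InfluxDBForwarder:host=localhost:port=8086:user=deadpool:password=chimichanga:db=metrics:collection=nothing
```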
```
$ echo $PYTHONPATH
$ mkdir customformatters
$ # Now append the path to $PYTHONPATH
$ export PYTHONPATH=$PYTHONPATH:/home/path/to/customformatters/
$ echo $PYTHONPATH
:/home/path/to/customformatters
$ cd customformatters/
$ mkdir myformatters
$ cd myformatters/
$ touch formatters.py
$ touch __init__.py
$ echo 'import formatters' >> __init__.py
$ # Now write your formatter functions inside the formatters.py file
```
Important:
- Only Python standard modules can be imported in the formatters.py file.
- A formatter function should return a `dict()` datatype.
- The `dict()` should only contain keys which are mentioned in the log structure above.
- Sample formatter function:
```
import json
import re

sample_log_line = '2018-02-07T06:37:00.297610Z [Some_event] [Info] [Hello_there]'

def sample_formatter(log_line):
    log = re.sub('[\[+\]]', '', log_line).split(' ')
    timestamp = log[0]
    event = log[1]
    level = log[2]
    data = dict({'message': log[3]})

    return dict(timestamp=timestamp,
                event=event,
                level=level,
                data=data,
                )
```
To see more examples, look here
- Check that the custom formatter works in the Python interpreter, as we did above for logagg:
```
>>> import myformatters
>>> sample_log_line = '2018-02-07T06:37:00.297610Z [Some_event] [Info] [Hello_there]'
>>> output = myformatters.formatters.sample_formatter(sample_log_line)
>>> from pprint import pprint
>>> pprint(output)
{'data': {'message': 'Hello_there'},
 'event': 'Some_event',
 'level': 'Info',
 'timestamp': '2018-02-07T06:37:00.297610Z'}
```
- Pseudo `logagg collect` commands:
```
$ sudo logagg collect --file file=logfile.log:formatter=myformatters.formatters.sample_formatter --nsqtopic logagg --nsqd-http-address localhost:4151
```
  or docker run:
```
$ sudo docker run --name collector --env PYTHONPATH=$PYTHONPATH --volume /var/log:/var/log deepcompute/logagg logagg collect --file file=logfile.log:formatter=myformatters.formatters.sample_formatter --nsqtopic logagg --nsqd-http-address <nsq-server-ip-or-DNS>:4151
```
You can store logagg collector/forwarder logs into files using basescript's --log-file argument or the docker file log driver:
```
$ sudo logagg --log-file /var/log/logagg/collector.log collect --file file=/var/log/serverstats/serverstats.log:formatter=logagg.formatters.basescript --nsqtopic logagg --nsqd-http-address <nsq-server-ip-or-DNS>:4151
```
or docker run:
```
$ sudo docker run --name collector --hostname $HOSTNAME --volume /var/log/:/var/log/ --restart unless-stopped --label formatter=logagg.formatters.basescript --log-driver file-log-driver --log-opt labels=formatter --log-opt fpath=/logagg/collector.log --log-opt max-size=100 deepcompute/logagg logagg collect --file file=/var/log/serverstats.log:formatter=logagg.formatters.basescript --nsqtopic serverstats --nsqd-http-address <nsq-server-ip-or-DNS>:4151
```
If there are multiple files being tracked by multiple collectors on multiple nodes, the collector information can be seen in the "Heartbeat" topic of NSQ. Every running collector sends a heartbeat to this topic (default interval = 30 seconds). The heartbeat format is as follows:
- `timestamp`: Timestamp of the received heartbeat.
- `heartbeat_number`: The heartbeat number since the collector started running.
- `host`: Hostname of the node on which the collector is running.
- `nsq_topic`: The nsq topic which the collector is using.
- `files_tracked`: List of files that are being tracked by the collector, followed by the formatter.
You can run the following command to see the information:
```
$ nsq_tail --topic=Heartbeat --channel=test --lookupd-http-address=<nsq-server-ip-or-DNS>:4161
```
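For illustration, a heartbeat message read from that topic might look roughly like this (the values are hypothetical, shaped after the fields described above):
```
{
    "timestamp": "2018-03-01T09:00:00.000000Z",
    "heartbeat_number": 42,
    "host": "x.com",
    "nsq_topic": "logagg",
    "files_tracked": ["/var/log/serverstats.log logagg.formatters.basescript"]
}
```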
You're more than welcome to hack on this :-)
```
$ git clone https://github.com/deep-compute/logagg
$ cd logagg
$ sudo python setup.py install
$ docker build -t logagg .
```