- Currently only works on Centos/RHEL
- Some steps may fail for Zenoss Core
Pull requests welcome.
Zenoss staff interested in making changes should refer to this document describing the code change process for inspector.
Use the following to unit-test changes to the inspector framework:
$ make test
To inspect a host, simply run the following on that host:
$ curl -s https://raw.githubusercontent.com/zenoss/inspector/master/bootstrap.sh | sudo sh
Inspection results will be printed to stdout, and complete logs will be compressed and stored in
a dated file that looks like inspected-2015-10-09T12-48-47.054684.tar.gz
. If inspector detects
any problems during your inspection, it will inform you at runtime.
free.sh done
dfinode.sh done
df.sh done
ps-aux.sh done
diskspace.py done
The following filesystems are >= 90% full:
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 ext4 30G 29G 1G 97% /
docker-version.sh done
etchosts.sh done
docker-container-logs.py done
netstat-plant.sh done
docker-images.sh done
uname-a.sh done
localhost-loopback.py done
inodes.py done
docker-running.sh done
Compressing results: inspected-2015-10-09T12-48-47.054684.tar.gz
Cleaning up...
Inspector has two main functions - it collects a wide variety of information that might be useful when doing root-cause analysis on a problem, and it verifies a number of install and operational requirements.
To run just the verification steps, use the following:
$ ./inspect -w verify
Inspector assumes the target is a CC master node. It can be run on a CC delegate node, but will produce a number of complaints that are not relevant on a CC delegate. For instance, the storage configuration on a CC delegate node is much simpler.
To run only the checks that make sense for a CC delegate node use:
$ ./inspect-delegate.sh
Inspections can be performed with a handful of options. To use inspector in this fashion, just download the master branch, unzip it, and run inspect:
$ wget -q https://github.com/zenoss/inspector/archive/master.zip
$ unzip -q master.zip
$ cd inspector-master
$ ./inspect -h
usage: inspect [-h] [--no-remove] [--no-save] [-j JOBS] [-t TIMEOUT]
[-w WHITELIST [WHITELIST ...] | -b BLACKLIST [BLACKLIST ...]]
Inspect a Zenoss installation.
optional arguments:
-h, --help show this help message and exit
-v, --version Print version and exit.
--print-tags Print tags and exit.
--no-remove Don't remove working directory.
--no-save Don't compress results.
-j JOBS, --jobs JOBS Maximum number of jobs to run in parallel.
-t TIMEOUT, --timeout TIMEOUT
Timeout in seconds for entire inspection.
-l LINES, --lines LINES
Limit output of each result file to LINES lines. Set
to 0 to collect all lines. Default is 100000.
-w WHITELIST [WHITELIST ...], --whitelist WHITELIST [WHITELIST ...]
Use only these tags.
-b BLACKLIST [BLACKLIST ...], --blacklist BLACKLIST [BLACKLIST ...]
Don't use these tags.
There's two kinds of scripts inspector uses to collect and process host information - collector and info scripts. Collector scripts are executed and their stdout and stderr are stored in a newly created directory. Examples include collecting disk space usage, memory usage, and system log files. Info scripts are a little different - their output is printed to stdout in a scary red color. The purpose of info scripts is to inform the user that something may need investigating. Examples include checking for full disks, excessive memory usage, or errors in logs.
Directives can be added to inspector scripts to indicate how they should be used. Add a directive to a script by simply adding a comment with the directive in it. See examples below.
Add a comment to your script with zenoss-inspector-tags
followed by a space-separated list of
tags that should be applied to your script at runtime.
#!/bin/bash
# zenoss-inspector-tags slow filesystem
lsof
Users can whitelist or blacklist tags at runtime. For example, if a user wanted to run only scripts
tagged filesystem
, they would use the following command:
$ ./inspect -w filesystem
Alternatively, they could blacklist slow scripts thus:
$ ./inspect -b slow
Add a comment to your script with zenoss-inspector-deps
followed by a space-separated list of
dependencies. Your script will not be executed until/unless its dependencies have been met.
Suppose there are already two inspector scripts, a.sh
& b.sh
, upon which your third
script, c.sh
, should depend. Here's what c.sh
might look like:
#!/bin/bash
# zenoss-inspector-deps a.sh b.sh
diff a.sh.stdout b.sh.stdout
Add a comment to your script with zenoss-inspector-info
. This will redirect the stdout of your
script to the tty of the user running inspect, in red to indicate an there might be a problem. This
should only be used to indicate a possible issue, not for general output.
#!/bin/bash
# zenoss-inspector-info
echo "There's a problem!"
We'll need two scripts for this example: a collector script and an info script. Let's start with the collector.
Under inspector/scripts
, make a new script and name it ps-aux.sh
. Place the following
into it:
#!/bin/bash
# zenoss-inspector-tags process
ps aux
Now let's run inspector like this:
$ ./inspect --no-save --no-remove -w process
ps-aux.sh done
Here we're instructing inspector to not create a tarball (--no-save
), not remove the
working directory (--no-remove
) and to whitelist scripts tagged process
, which is what we
tagged our script with. Inspector indicates that our script was run. We should now have a new
directory that looks something like inspected-2015-10-11T11-18-11.347149
, and inside we should
see at least two files: ps-aux.sh.stdout
and inspector.log
.
Now that we've collected a list of processes running on the system, let's write a script that will warn the user if the Docker daemon is not on that list.
Create a new script under inspector/scripts
and name it docker-running.sh
. Place the
following inside it:
#!/bin/bash
# zenoss-inspector-info
# zenoss-inspector-tags process docker
# zenoss-inspector-deps ps-aux.sh
grep "docker -d" ps-aux.sh.stdout &>/dev/null
if [ $? -ne 0 ]
then
grep "docker daemon" ps-aux.sh.stdout &>/dev/null
if [ $? -ne 0 ]
then echo "Docker doesn't appear to be running."
fi
fi
Take a look at the zenoss-inspector-*
directives we've added. zenoss-inspector-info
instructs
inspector to direct the stdout of this script immediately to the user.
zenoss-inspector-tags process docker
tags our new script process
and docker
.
zenoss-inspector-deps ps-aux.sh
instructs inspector to wait for the ps-aux.sh
script
to finish executing successfully before kicking off our new script.
Note that this script is looking in ps-aux.sh.stdout
in the current working directory. Collector
script output filenames are appended with .stdout
or .stderr
, and all scripts are executed
within the directory that contains those output files.
Now let's run inspector again:
./inspect --no-save --no-remove -w process
ps-aux.sh done
docker-running.sh done
Looks good. If I stop docker and run it again:
$ sudo stop docker
docker stop/waiting
$ ./inspect --no-save --no-remove -w process
ps-aux.sh done
docker-running.sh done
Docker doesn't appear to be running.
Once you commit your new scripts and push them to github, they will be executed the next time someone kicks off an inspection.