This is not yet very sophisticated.
- clone the repository somewhere
- symlink the python modules files into the system area
# cd /usr/lib/ganglia/python_modules # ln -s /path/to/ganglia-modules/pymodules/*.py .
- symlink or copy the configuration files
# cd /etc/ganglia/conf.d/ # ln -s /path/to/ganglia-modules/conf.d/*.pyconf .
- Edit the configuration files. See the next section.
- Restart Ganglia
# /etc/init.d/gmond restart
The nv_gpu
module makes use of nvidia-smi
to inject values about the card’s memory usage, temperature and fan. It currently hard-codes the available quantities in the module code. Which ones to report are turned on in the config file. The quantities exposed are what looked interesting and implemented on a GTX 750 Ti.
The lm_sens
module makes use of the sensors
command to inject values from sensors. This module requires explicit configuration with knowledge of what LM Sensors returns on the particular system. To configure, run sensors -u
to see what is available. The output is given as a hierarchy of:
- device
- the LM sensors device
- adapter
- the adapter the device is on
- thing
- the thing for which values are sensed
- name
- the name of one value
For example:
$ sensors -u coretemp-isa-0000 Adapter: ISA adapter Physical id 0: temp1_input: 33.000 temp1_max: 80.000 temp1_crit: 100.000 temp1_crit_alarm: 0.000
To configure to use a value from this one needs to set a parameter in the module
section like:
modules { module { name = "lm_sens" language = "python" param CPUTemp { value = "coretemp-isa-0000:ISA adapter:Physical id 0:temp1_input" } } }
This is needed in order to tell the lm_sens
module how to “find” the value in the hierarchy of data from sensors
. This is done by associating a “:”-separated path into the hierarchy with a parameter name. This parameter is then used to name a metric:
collection_group { collect_every = 60 time_threshold = 3600 metric { name = "CPUTemp" title = "CPU Temperature" } }
The hddtemp
module makes use of the program of the same name to deliver disk drive temperatures. The configuration is two parts. First module parameters are used to associate a device to a name:
modules { module { name = "hddtemp" language = "python" param SSDTemp { value = "/dev/sda" } param HDDTemp { value = "/dev/sdb" } } }
Second, turn on those named values as desired.
collection_group { collect_every = 60 time_threshold = 3600 metric { name = "SSDTemp" title = "SSD temperature" } metric { name = "HDDTemp" title = "HDD temperature" } }