-
Notifications
You must be signed in to change notification settings - Fork 1
/
ipmimex.8
423 lines (378 loc) · 13.7 KB
/
ipmimex.8
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
.TH ipmimex 8 "2021-03-28"
.SH "NAME"
ipmimex \- a metrics exporter for BMC
.SH "SYNOPSIS"
.nh
.na
.HP
.B ipmimex
[\fB\-DLNPSTUVcdfh\fR]
[\fB\-b\ \fIbmc_path\fR]
[\fB\-l\ \fIfile\fR]
[\fB\-p\ \fIport\fR]
[\fB\-s\ \fIip\fR]
[\fB\-v\ DEBUG\fR|\fBINFO\fR|\fBWARN\fR|\fBERROR\fR|\fBFATAL\fR]
[\fB\-x\ \fImetric_regex\fR]
[\fB\-X\ \fIsensor_regex\fR]
[\fB\-i\ \fImetric_regex\fR]
[\fB\-I\ \fIsensor_regex\fR]
.ad
.hy
.SH "DESCRIPTION"
.B ipmimex
is a \fBm\fRetrics \fBex\fRporter for Board Management Controller (BMC)
and similar devices, which are directly accessible from the host running
this utility using the system's \fBI\fRntelligent \fBP\fRlatform
\fBM\fRanagement \fBI\fRnterface (\fBIPMI\fR).
Collected data can be exposed via HTTP in Prometheus exposition format [1]
e.g. using the default endpoint URL
\fBhttp://\fIhostname\fB:\fI9290\fB/metrics\fR (optionally via
\fBhttp://\fIhostname\fB:\fI9290\fB/overview\fR in a ipmitool sensor like
format) and thus visualized e.g. using Grafana [2], Netdata [3], or Zabbix [4].
In contrast to prometheus' ipmi_exporter and other IPMI based metrics gatherers
\fBipmimex\fR is written in plain C (having KISS in mind)
and thus it is extremely lightweight, has beside libprom, libmicrohttp and
OS standard libs libc and libm no other dependencies, is more or less a
standalone tool and handles IPMI traffic/sensors efficiently. It
\fBdoes not need any IPMI library\fR installed, just the related kernel
module loaded (Linux: ipmi_si, Solaris: bmc) and read-write access to
/dev/ipmi0 on Linux, /dev/bmc on Solaris (or whatever path the OS IPMI driver
binds the related BMC). Per default these pathes are read-write for the root
user, only. Running \fBipmimex\fR with root privileges should be safe, however,
it is not a big deal to e.g. allow a certain group access to this path and thus
running \fBipmimex\fR as an normal, unprivileged user should be preferred.
Note that for now \fBipmimex\fR supports threshold-based, non-discrete sensors,
only. This decision has been made to keep it simple and because usually no-one
needs the others, or if needed \fBipmievd\fR(8) or SNMP tools, etc. already
take care of it.
BMCs are usually slow and a priori not designed to handle IPMI queries
in parallel (IPMI specifications explicitly mentions this). So one should avoid
to query any metrics not really needed. One may use the options \fB-x\fR,
\fB-X\fR, \fB-i\fR, and \fB-I\fR with an extended regular expression (regex)
argument to exclude/include metrics (lower case options) and to exclude/include
sensors by name (upper case options). Since the list of sensors to query gets
constructed on the start of the \fBipmimex\fR (or if a change in the
\fBS\fRensor \fBD\fRata \fBR\fRecord (\fBSDR\fR)
repository requires a re-scan), the complexity of the regex arguments have no
impact when metrics get queried by a client. So there is no need to spent much
time for optimizing the regexs - instead keep it small and simple.
The include options take precedence over exclude options. So one may exclude
all metrics (e.g. -x '.*') and include only the power usage of the
box (e.g. -i '.*_dcmi_power_.*'). The experience shows, that usually the
reading for the later gets returned really fast, but the reading of a single
other sensor takes ~ 2-5 ms (the scanning of the whole SDR repository ~ 1-3 s).
\fBipmimex\fR operates in 3 modes:
.RS 2
.IP \fBdefault\fR 2
Just collects all data as it would for a /metrics HTTP request, print
it to the standard output and exit.
.IP \fBforeground\fR
Start the internal HTTP server to answer HTTP requests, but stays
attached to the console/terminal (i.e. standard input, output and error).
Use option \fB-f\fR to request this mode.
.IP \fBdaemon\fR
Start the internal HTTP server (daemon) to answer HTTP requests in the
background (fork-exec model) in a new session, detach from the
console/terminal, attach standard input, output and error to /dev/null
and finally exit with exit code \fB0\fR, if the daemon is running as
desired. Remember, if you do not specify a logfile to use, all messages
emitted by the daemon get dropped.
Use option \fB-d\fR to request this mode.
.RE
\fBipmimex\fR answers one HTTP request after another to have a
very small footprint wrt. the system and queried device. So it is
recommended to adjust your firewalls and/or HTTP proxies accordingly.
If you need SSL or authentication, use a HTTP proxy like nginx - remember:
\fBipmimex\fR should be kept as small and simple as possible.
When \fBipmimex\fR runs in \fBforeground\fR or \fBdaemon\fR mode, it also
returns by default the duration of the following data collect and format tasks:
.RS 2
.TP 2
.B default
HTTP related statistics.
.TP
.B process
\fBipmimex\fR process related data.
.TP
.B ipmi
Sensor data via normal IPMI and DCMI commands.
.TP
.B libprom
All tasks together, i.e. sum of the default, process, and ipmi task.
.RE
.SH "OPTIONS"
.P
The following options are supported:
.TP 4
.B \-D
.PD 0
.TP
.B \-\-ignore\-disabled\-flag
Some buggy BMCs mark sensors as \fBD\fRisabled in the SDR capabilities but
they provide normal reading values and thresholds. So if you miss some sensors
in your list, just try this option and see, whether you get more. Dell iDRACs
are know to have this bug.
.TP
.B \-L
.PD 0
.TP
.B \-\-no\-scrapetime
Disable the overall scrapetime metrics (libprom collector), i.e. the time
elapsed when scraping all the required data. One needs to also disable
collecting scrapetimes of all other collectors before this option
gets honored. This is very helpful when one tries to determine the stats query
interval to use. E.g. Gigabyte BMCs are slow as hell - the overall scrapetime
intervall will show you something close to 2 seconds =8-(. So especially there
it is pretty important to unselect all sensors not needed using the exclude
options -X and -x.
.TP
.B \-N
.PD 0
.TP
.B \-\-drop\-no\-read
Some BMC firmware (like Sun's ILOMs) provide SDRs for devices/sensors not yet
populated/conncted. If they get queried, a more or less generic command
completion code gets returned: "Command not supported in present state". Hmmm,
hard to guess the real cause of it and whether it will come up within the
next ...
So per default such devices gets always querried and in case of Sun ILOMs will
always return no result and thus in this case useless to query.
Ergo one may use this option to exclude such sensors. On a X4600M2 e.g. with
two CPU boards this reduces the number of sensors to query from 96 to 52 and cuts the query cycle time to almost the half!
Ergo the final recommendation: If in doubt, just use this option.
\fBipmimex\fR will report the sensors skipped, if any.
.TP
.B \-P
.PD 0
.TP
.B \-\-no\-powerstats
Minor optimization to reduce bandwidth and resource usage. The response to a
dcmi power reading command contains some statistical values like min, max and
average power usage of a certain sample period, and the width of this sample
period in seconds (some BMC use seconds since last boot, some 1s, some 10s,
etc.). So if one uses a timeseries database and Grafana or similar monitors,
those are more or less redundant, useless data - there is no need to
transfer it over the wire or to store it in a database.
.TP
.B \-S
.PD 0
.TP
.B \-\-no\-scrapetime\-all
Disable recording the scrapetime of each collector separately. There is
one collector named \fBdefault\fR, which collects HTTP request/response
statistics, the optional \fBprocess\fR collector, which records metrics
about the ipmimex process itself, the \fBipmi\fR collector, which queries
all the BMC for metrics, and finally the \fBlibprom\fR collector,
which just records the time it took to collect and prom-format the data
of all other collectors.
.TP
.B \-T
.PD 0
.TP
.B \-\-no\-thresholds
Disable all \fI*\fB_threshold_*\fR metrics.
By default a sensor usually provides up to
6 different thresholds (\fBn\fRon-\fBr\fRecoverable, \fBcr\fritical,
\fBn\fRon-\fBc\fRritical for upper and lower bounds) and when reporting it
is disabled, it saves some bandwith (~300-500 bytes/sensor uncompressed).
However, in good timeseries databases like VictoriaMetrics such constant
values usually consume very little space and it doesn't hurt much to have
them available for visualizations, too. Thresholds get queried, formatted and
cached at the SDR repository scan phase, so pretty cheap for each client
request.
.TP
.B \-U
.PD 0
.TP
.B \-\-no\-state
A minor bandwith optimization. The sensor reading command response contains
the threshold state as well, and therefore gets reported, too (4 .. >= upper
non-recoverable, 2 .. >= upper critical, 1 .. >= upper non-critical
and -1 .. <= lower non-critical, -2 .. <= lower critical, -4 .. <= lower
non-recoverable bound).
Using this option reduces bandwith (~60-80 bytes/sensor) and
database resource usage (constant like) marginal.
.TP
.B \-V
.PD 0
.TP
.B \-\-version
Print \fBipmimex\fR version info and exit.
.TP
.BI \-b " path"
.PD 0
.TP
.BI \-\-bmc= " path"
Use the given \fIpath\fR to access the desired BMC. If not given, the default
platform specific path (e.g. Linux: /dev/ipmi0, Solaris: /dev/bmc) will be used.
.TP
.B \-c
.PD 0
.TP
.B \-\-compact
Sending a HELP and TYPE comment alias description about a metric is
according to the Prometheus exposition format [1] optional. With this
option they will be ommitted in the HTTP response and thus it saves
bandwith and processing time.
.TP
.B \-d
.PD 0
.TP
.B \-\-daemon
Run \fBipmimex\fR in \fBdaemon\fR mode.
.TP
.B \-f
.PD 0
.TP
.B \-\-foreground
Run \fBipmimex\fR in \fBforeground\fR mode.
.TP
.B \-h
.PD 0
.TP
.B \-\-help
Print a short help summary to the standard output and exit.
.TP
.BI \-l " file"
.PD 0
.TP
.BI \-\-logfile= file
Log all messages to the given \fIfile\fR when the main process is running.
.TP
.BI \-n " list"
.PD 0
.TP
.BI \-\-no-metric= list
Skip all the metrics given in the comma separated \fIlist\fR of identifiers.
Currently supported are:
.RS 4
.TP 4
.B version
All \fBipmimex_version\fR metrics (default collector).
.TP 4
.B ipmi
All \fBipmimex_ipmi_*\fR metrics (ipmi collector). See option \-x, \-X, \-i
and \-I for a little bit more fine grained selection.
.TP 4
.B dcmi
All \fBipmimex_dcmi_*\fR metrics. Right now power reading is supported,
only (ipmi collector).
.TP 4
.B process
All \fBipmimex_process_*\fR metrics (process collector).
.RE
.TP
.B \-o
.PD 0
.TP
.B \-\-overview
If ipmimex runs in \fBforeground\fR or \fBdaemon\fR mode, enable an
ipmitool sensor look alike output under the URL path \fB/overview\fR.
As \fB/metrics\fR it triggers a new full cycle of sensor requests and therefore
you should take care to not overwhelm your BMC (a whole cycle usually takes
about 0.2 .. 0.3 seconds).
.TP
.BI \-p " num"
.PD 0
.TP
.BI \-\-port= num
Bind to port \fInum\fR and listen there for HTTP requests. Note that a port
below 1024 usually requires additional privileges.
.TP
.BI \-s " IP"
.PD 0
.TP
.BI \-\-source= IP
Bind the HTTP server to the given \fIIP\fR address, only. Per default
it binds to 0.0.0.0, i.e. all IPs configured on this host/zone/container.
If you want to enable IPv6, just specify an IPv6 address here (\fB::\fR
is the same for IPv6 as 0.0.0.0 for IPv4).
.TP
.BI \-v " level"
.PD 0
.TP
.BI \-\-verbosity= level
Set the message verbosity to the given \fIlevel\fR. Accepted tokens are
\fBDEBUG\fR, \fBINFO\fR, \fBWARN\fR, \fBERROR\fR, \fBFATAL\fR and for
convenience \fB1\fR..\fB5\fR respectively.
.P
The following flags are related to the ipmi task and compared against sensor
reading metrics (\fBipmimex_ipmi_*\fR), only.
To disable all \fB*_threshold\fR or \fB*_state\fR metrics one may use
the option \-T and \-U respectively. If you need a more fine grained
selection, consider to use a proxy (e.g. VictoriaMetrics vmagent or nginx,
etc.).
.TP
.BI \-x " regex"
.PD 0
.TP
.BI \-\-exclude-metrics= regex
Exclude all metrics from the ipmi task whoms name matches the given extended
regular expression \fIregex\fR (see also \fBregcomp\fR(3C)).
.TP
.BI \-X " regex"
.PD 0
.TP
.BI \-\-exclude-sensors= regex
Exclude all metrics from the ipmi task whoms sensor name matches the given
extended regular expression \fIregex\fR (see also \fBregcomp\fR(3C)).
.TP
.BI \-i " regex"
.PD 0
.TP
.BI \-\-include-metrics= regex
Include all metrics from the ipmi task whoms name matches the given extended
regular expression \fIregex\fR (see also \fBregcomp\fR(3C)). Takes precedence
over excludes (see -X ... and -x ...).
.TP
.BI \-I " regex"
.PD 0
.TP
.BI \-\-include-sensors= regex
Include all metrics from the ipmi task whoms sensor name matches the given
extended regular expression \fIregex\fR (see also \fBregcomp\fR(3C)). Takes
precedence over excludes (see -X ... and -x ...).
.SH "EXIT STATUS"
.TP 4
.B 0
on success.
.TP
.B 1
if an unexpected error occurred during the start (other problem).
.TP
.B 96
if an invalid option or option value got passed (config problem).
.TP
.B 100
if the logfile is not writable or port access is not allowed (permission problem).
.TP
.B 101
If BMC could not be found, is not accessible or provides no threshold-based,
non-discrete sensors.
.SH "ENVIRONMENT"
.TP 4
.B PROM_LOG_LEVEL
If no verbosity level got specified via option \fB-v\ \fI...\fR, this
environment variable gets checked for a verbosity value. If there is a
valid one, the verbosity level gets set accordingly, otherwise \fBINFO\fR
level will be used.
.SH "FILES"
.TP 4
.B /dev/ipmiN or /dev/bmc
The character special devices used by default to communicate with the BMC.
.SH "BUGS"
https://github.com/jelmd/ipmimex is the official source code repository
for \fBipmimex\fR. If you need some new features, or metrics, or bug fixes,
please feel free to create an issue there using
https://github.com/jelmd/ipmimex/issues .
.SH "AUTHORS"
Jens Elkner
.SH "SEE ALSO"
[1]\ https://prometheus.io/docs/instrumenting/exposition_formats/
.br
[2]\ https://grafana.com/
.br
[3]\ https://www.netdata.cloud/
.br
[4]\ https://www.zabbix.com/
.\" # vim: ts=4 sw=4 filetype=nroff