OID rewriting in generator.yml #486

Closed
bissquit opened this issue Feb 26, 2020 · 10 comments

@bissquit

Host operating system: output of uname -a

Linux monitoring-test 4.4.190-1.el7.elrepo.x86_64 #1 SMP Sun Aug 25 07:32:44 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux

snmp_exporter version: output of snmp_exporter -version

Docker container: prom/snmp-exporter:latest

What device/snmpwalk OID are you using?

1.3.6.1.4.1.318.1.1.1.9.3.3.1.2 (upsPhaseOutputPhaseIndex)

If this is a new device, please link to the MIB(s).

APC PowerNet MIB

What did you do that produced an error?

My generator.yml was:

modules:
  apc:
    walk:
      - 1.3.6.1.4.1.318.1.1.1.9.3.3.1.2
    version: 3
    max_repetitions: 25
    retries: 3
    timeout: 10s
    auth:
      <...>

Here is the snmp.yml generated via the standard procedure (with Docker) described in the documentation:

apc:
  walk:
  - 1.3.6.1.4.1.318.1.1.1.9.3.3.1.2
  metrics:
  - name: upsPhaseOutputPhaseIndex
    oid: 1.3.6.1.4.1.318.1.1.1.9.3.3.1.2
    type: gauge
    help: The output phase identifier. - 1.3.6.1.4.1.318.1.1.1.9.3.3.1.2
    indexes:
    - labelname: upsPhaseOutputPhaseTableIndex
      type: gauge
    - labelname: upsPhaseOutputPhaseIndex
      type: gauge
  version: 3
  max_repetitions: 25
  retries: 3
  timeout: 10s
  auth:
    <...>

What did you expect to see?

I expected to see the three metrics that snmpwalk normally returns:

# snmpwalk -v3  -l authPriv -u <username> -a SHA -A <...> -x DES -X <...> <device_name> 1.3.6.1.4.1.318.1.1.1.9.3.3.1.2
SNMPv2-SMI::enterprises.318.1.1.1.9.3.3.1.2.1.1.1 = INTEGER: 1
SNMPv2-SMI::enterprises.318.1.1.1.9.3.3.1.2.1.1.2 = INTEGER: 2
SNMPv2-SMI::enterprises.318.1.1.1.9.3.3.1.2.1.1.3 = INTEGER: 3

What did you see instead?

I ran the command:

curl -sw '' 'http://localhost:9116/snmp?module=apc&target=<device_name>'

And received an error:

An error has occurred while serving metrics:

2 error(s) occurred:
* collected metric "upsPhaseOutputPhaseIndex" { label:<name:"upsPhaseOutputPhaseIndex" value:"1" > label:<name:"upsPhaseOutputPhaseTableIndex" value:"1" > gauge:<value:2 > } was collected before with the same name and label values
* collected metric "upsPhaseOutputPhaseIndex" { label:<name:"upsPhaseOutputPhaseIndex" value:"1" > label:<name:"upsPhaseOutputPhaseTableIndex" value:"1" > gauge:<value:3 > } was collected before with the same name and label values

The error looks like the one in issue #246.
However, I found strange device behavior (similar to issue #273), described below.

Let's poll OID 1.3.6.1.4.1.318.1.1.1.9.3.3.1.2 (upsPhaseOutputPhaseIndex) again, this time with an additional sub-identifier (.1) appended to the OID:

# snmpwalk -v3  -l authPriv -u <username> -a SHA -A <...> -x DES -X <...> <device_name> 1.3.6.1.4.1.318.1.1.1.9.3.3.1.2.1
SNMPv2-SMI::enterprises.318.1.1.1.9.3.3.1.2.1.1.1 = INTEGER: 1
SNMPv2-SMI::enterprises.318.1.1.1.9.3.3.1.2.1.1.2 = INTEGER: 2
SNMPv2-SMI::enterprises.318.1.1.1.9.3.3.1.2.1.1.3 = INTEGER: 3

And add another one:

# snmpwalk -v3  -l authPriv -u <username> -a SHA -A <...> -x DES -X <...> <device_name> 1.3.6.1.4.1.318.1.1.1.9.3.3.1.2.1.1
SNMPv2-SMI::enterprises.318.1.1.1.9.3.3.1.2.1.1.1 = INTEGER: 1
SNMPv2-SMI::enterprises.318.1.1.1.9.3.3.1.2.1.1.2 = INTEGER: 2
SNMPv2-SMI::enterprises.318.1.1.1.9.3.3.1.2.1.1.3 = INTEGER: 3

Look at the OIDs: the device returns the same three rows each time. The last OID I walked is 1.3.6.1.4.1.318.1.1.1.9.3.3.1.2.1.1 (with an additional .1.1 at the end).

Let's change the metric OID in our snmp.yml by appending .1.1:

apc:
  walk:
  - 1.3.6.1.4.1.318.1.1.1.9.3.3.1.2
  metrics:
  - name: upsPhaseOutputPhaseIndex
    oid: 1.3.6.1.4.1.318.1.1.1.9.3.3.1.2.1.1
    type: gauge
    help: The output phase identifier. - 1.3.6.1.4.1.318.1.1.1.9.3.3.1.2(added .1.1)
    indexes:
    - labelname: upsPhaseOutputPhaseTableIndex
      type: gauge
    - labelname: upsPhaseOutputPhaseIndex
      type: gauge
  version: 3
  max_repetitions: 25
  retries: 3
  timeout: 10s
  auth:
    <...>

After restarting snmp_exporter, I receive the expected output:

# HELP upsPhaseOutputPhaseIndex The output phase identifier. - 1.3.6.1.4.1.318.1.1.1.9.3.3.1.2(added .1.1)
# TYPE upsPhaseOutputPhaseIndex gauge
upsPhaseOutputPhaseIndex{upsPhaseOutputPhaseIndex="0",upsPhaseOutputPhaseTableIndex="1"} 1
upsPhaseOutputPhaseIndex{upsPhaseOutputPhaseIndex="0",upsPhaseOutputPhaseTableIndex="2"} 2
upsPhaseOutputPhaseIndex{upsPhaseOutputPhaseIndex="0",upsPhaseOutputPhaseTableIndex="3"} 3

The problem affects not only this OID but all OIDs in the subtree 1.3.6.1.4.1.318.1.1.1.9.3.3.1.*. The solution is the same: append .1.1 to the OID.
Some OIDs in other subtrees work as expected without any changes.
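
A rough way to see why the extra .1.1 matters. This is a minimal Python sketch, assuming index labels are filled positionally from the sub-identifiers that follow the metric OID, with missing positions set to 0. That assumption is inferred from the outputs quoted in this issue, not taken from the snmp_exporter source, and the function names are hypothetical.

```python
def split_index_labels(metric_oid, returned_oid, label_names):
    """Assign the sub-identifiers after metric_oid to labels by position.

    Assumption (inferred from this issue's output, not from exporter code):
    extra sub-identifiers are ignored and missing ones become 0.
    """
    prefix = metric_oid.split(".")
    parts = returned_oid.split(".")
    if parts[:len(prefix)] != prefix:
        raise ValueError("returned OID is not under the metric OID")
    suffix = [int(p) for p in parts[len(prefix):]]
    values = suffix[:len(label_names)]
    values += [0] * (len(label_names) - len(values))
    return dict(zip(label_names, values))


# Label names come from the generated snmp.yml above.
LABELS = ["upsPhaseOutputPhaseTableIndex", "upsPhaseOutputPhaseIndex"]
BASE = "1.3.6.1.4.1.318.1.1.1.9.3.3.1.2"
# The three rows the device returns (numeric form of the snmpwalk output).
ROWS = [BASE + ".1.1.1", BASE + ".1.1.2", BASE + ".1.1.3"]


def label_sets(metric_oid):
    """Label values for each returned row, given the configured metric OID."""
    return [split_index_labels(metric_oid, row, LABELS) for row in ROWS]
```

Under that assumption, `label_sets(BASE)` gives the same pair (1, 1) for all three rows, which matches the duplicate-metric error, while `label_sets(BASE + ".1.1")` gives table index 1, 2, 3 with the second label at 0, which matches the output after the manual edit.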

We observe this behavior on multiple similar devices with different firmware versions. Is there any way to rewrite OIDs (perhaps using a regex) in generator.yml? For now we are forced to rewrite OIDs in snmp.yml with shell scripts, which is not good practice for a production environment. I found similar problems (#150, #186, #442), but as far as I understand they relate to value rewriting.

If an OID rewrite function does not exist, are you going to implement it in a future release? I think it would be very useful for buggy devices (this is SNMP, and there are a lot of devices that do not comply with the standard).
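
For illustration, the kind of post-processing step meant here could look roughly like the sketch below. It is written in Python rather than shell, it is not part of snmp_exporter or its generator, and `patch_metric_oids`, `SUBTREE`, and `EXTRA` are hypothetical names. It assumes the generated file keeps each metric OID on its own `oid:` line, as in the snmp.yml shown above.

```python
import re

# Subtree whose columns the device serves with an unexpected ".1.1" level.
SUBTREE = "1.3.6.1.4.1.318.1.1.1.9.3.3.1"
EXTRA = ".1.1"


def patch_metric_oids(snmp_yml_text, subtree=SUBTREE, extra=EXTRA):
    """Append `extra` to each metric `oid:` that is a direct child of `subtree`.

    Only lines of the form `oid: <subtree>.<n>` are changed. Entries under
    `walk:` (list items) and `help:` text are left untouched. An already
    patched line no longer matches, so running this twice is harmless.
    """
    pattern = re.compile(
        r"^([ \t]*oid:[ \t]*" + re.escape(subtree) + r"\.\d+)[ \t]*$",
        re.MULTILINE,
    )
    return pattern.sub(lambda m: m.group(1) + extra, snmp_yml_text)
```

To use it, read the generated snmp.yml as text, pass it through `patch_metric_oids`, and write the result back before starting the exporter. Because this is a plain text substitution rather than a YAML-aware edit, it is a stopgap at best and has to be rerun every time the generator is run.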

@brian-brazil
Contributor

Your device's output is not conforming to the MIB. You'll need to report this as a bug to your vendor and either get a correct MIB or updated firmware for your devices.

@brian-brazil
Contributor

And this duplicates #246 and #273, so closing.

@bissquit
Author

bissquit commented Feb 27, 2020

Brian, you are right, this is not normal behavior. But I think bugs like this are common and the generator should be able to handle them. Am I wrong?
As I mentioned above, we have more than one device (with different firmware versions) showing this strange behavior, and the people in issues #246 and #273 face the same bug.

Implementing OID rewriting in snmp_exporter would make it more flexible. Reporting to the vendor is an acceptable option, but we would have to wait a long time for a fix, with no guarantee, and in the meantime we would not be able to use snmp_exporter normally. So OID rewriting would help us.

@brian-brazil
Contributor

The generator handles it perfectly; however, this is a GIGO situation. Your vendor has not provided devices conforming to a MIB, and you need to take it up with them.

@SuperQ
Member

SuperQ commented Feb 27, 2020

This might be another overrides section MIB workaround. Not sure exactly how we would define this in a generic way.

@bissquit
Author

It would be very useful, because we could use snmp_exporter while the vendor is fixing the device bugs. Telegraf + InfluxDB + Grafana works fine, but we want to use Prometheus and currently can't.

As mentioned above, we have a lot of devices with this behavior, and "report this as a bug to your vendor" sounds like "don't use snmp_exporter until the vendor fixes the bugs".

@bissquit
Author

@brian-brazil, my question was not about device bugs; it was a feature request. Are you going to implement OID rewriting? Without it, snmp_exporter cannot be used with some devices, whereas Telegraf + InfluxDB + Grafana works perfectly.

@brian-brazil
Contributor

I don't think it's sane to add a generic tree mutator. You need a corrected MIB or device.

@bissquit
Author

bissquit commented Mar 3, 2020

@brian-brazil, here is the same problem with the same solution: patching the snmp.yml produced by the generator. I would be very grateful if you provided OID rewriting in a future release. Thank you.

@brian-brazil
Contributor

If you want a different oid tree, provide a different MIB.
