[inputs.snmp] bits conversion fails #15963

Closed
llamafilm opened this issue Oct 2, 2024 · 19 comments · Fixed by #16027
Labels
bug unexpected problem or unintended behavior

Comments

@llamafilm
Contributor

llamafilm commented Oct 2, 2024

Relevant telegraf.conf

[agent]
  snmp_translator = "gosmi"

[[inputs.snmp]]
  path = ["$MIBDIRS"]
  agents = ["10.37.155.80"]
  version = 2
  community = "public"
  agent_host_tag = "source"
  [[inputs.snmp.field]]
    oid = 'BARCO-ME-DCP-MIB::lampStatus.0'
    conversion = 'int'
    name = 'lampStatus'

Logs from Telegraf

% telegraf --config barco-debug2.conf --test --debug
2024-10-02T13:02:38Z I! Loading config: barco-debug2.conf
2024-10-02T13:02:38Z I! Starting Telegraf 1.32.0 brought to you by InfluxData the makers of InfluxDB
2024-10-02T13:02:38Z I! Available plugins: 235 inputs, 9 aggregators, 32 processors, 26 parsers, 62 outputs, 5 secret-stores
2024-10-02T13:02:38Z I! Loaded inputs: snmp
2024-10-02T13:02:38Z I! Loaded aggregators:
2024-10-02T13:02:38Z I! Loaded processors:
2024-10-02T13:02:38Z I! Loaded secretstores:
2024-10-02T13:02:38Z W! Outputs are not used in testing mode!
2024-10-02T13:02:38Z I! Tags enabled: host=Elliott-M2-4.local
2024-10-02T13:02:38Z D! [agent] Initializing plugins
2024-10-02T13:02:39Z W! DeprecationWarning: Value "agent_host" for option "agent_host_tag" of plugin "inputs.snmp" deprecated since version 1.29.0 and will be removed in : set to "source" for consistent usage across plugins or safely ignore this message and continue to use the current value
2024-10-02T13:02:39Z D! [agent] Starting service inputs
2024-10-02T13:02:39Z E! [inputs.snmp] Error in plugin: agent 10.37.155.80: converting "\x00" (OID .1.3.6.1.4.1.12612.220.11.2.2.4.3.0) for field lampStatus: strconv.ParseInt: parsing "\x00": invalid syntax
2024-10-02T13:02:39Z D! [agent] Stopping service inputs
2024-10-02T13:02:39Z D! [agent] Input channel closed
2024-10-02T13:02:39Z D! [agent] Stopped Successfully
2024-10-02T13:02:39Z E! [telegraf] Error running agent: input plugins recorded 1 errors

System info

Telegraf 1.32, MacOS 13.6.9

Steps to reproduce

Run this telegraf config

Expected behavior

The device returns a bits type value of 00. Telegraf should convert this to an integer 0.

Actual behavior

The conversion fails.

Additional info

This is a problem in 1.31 and 1.32 but it works fine in 1.30.3.

Below is the snmpget output from this same MacOS machine.

% snmpget -v2c -c public 10.37.155.80 BARCO-ME-DCP-MIB::lampStatus.0    
BARCO-ME-DCP-MIB::lampStatus.0 = BITS: 00 

Here's the relevant section from the MIB:

-- 1.3.6.1.4.1.12612.220.11.2.2.4.3
lampStatus	OBJECT-TYPE
	SYNTAX BITS
		{
			unused(0),
			on(7)
		}
	MAX-ACCESS	read-only
	STATUS	current
	DESCRIPTION
		"This is the lamp status.
		on: Lamp on"
	::= { lampProperties 3 }
@llamafilm llamafilm added the bug unexpected problem or unintended behavior label Oct 2, 2024
@llamafilm llamafilm changed the title from "SNMP bits conversion fails on MacOS" to "[inputs.snmp] bits conversion fails on MacOS" Oct 2, 2024
@llamafilm
Contributor Author

llamafilm commented Oct 2, 2024

This is similar to #14694 but this time I get different results on Mac and Linux.

If I try converting to enum instead, then there is no output at all.

@Hipska
Contributor

Hipska commented Oct 8, 2024

Again this strange OID 😛 So you are saying that the exact same config file works fine on Ubuntu but gives this error when running on MacOS? Are the Telegraf versions also exactly the same?

PS: I created a new conversion named displayhint in #15935, which converts the returned value exactly the way the net-snmp tools do. You can try it out with the current nightly builds.

@llamafilm
Contributor Author

So you are saying that the exact same config file works fine on Ubuntu but gives this error when running on MacOS? Are the Telegraf versions also exactly the same?

Yes exactly

@llamafilm
Contributor Author

My mistake — this is actually a regression in Telegraf version 1.32. Not related to the OS. It works in 1.30.3.

@llamafilm llamafilm changed the title from "[inputs.snmp] bits conversion fails on MacOS" to "[inputs.snmp] bits conversion fails" Oct 15, 2024
@llamafilm
Contributor Author

llamafilm commented Oct 15, 2024

@Hipska thanks for the tip. Unfortunately displayhint doesn't work either; it ends up like this: lampStatus="00[]".

If I convert to hex, then I get a string lampStatus="00". I suppose I could convert that to int with a separate processor. But that was not necessary before, in Telegraf 1.30.

@Hipska
Contributor

Hipska commented Oct 15, 2024

Okay, that makes sense. I see the change of behaviour came from #15390

The fix would be to use a better []byte-to-int conversion (preferably without the string step in between). @srebhan @DStrand1
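
To illustrate the point about skipping the string step: strconv.ParseInt expects digit characters, while the agent returns the raw byte 0x00. The following is a minimal sketch only. bytesToUint64 is a hypothetical helper written for this example, not Telegraf code, and reading the bytes as big-endian is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// bytesToUint64 is a hypothetical helper (not part of Telegraf).
// It interprets up to 8 raw bytes as a big-endian unsigned integer
// without converting them to a string first.
func bytesToUint64(b []byte) (uint64, error) {
	if len(b) > 8 {
		return 0, fmt.Errorf("value too long: %d bytes", len(b))
	}
	var v uint64
	for _, x := range b {
		v = v<<8 | uint64(x)
	}
	return v, nil
}

func main() {
	raw := []byte{0x00} // what the device returns for "BITS: 00"

	// Going through a string fails, because "\x00" is not a digit.
	_, err := strconv.ParseInt(string(raw), 10, 64)
	fmt.Println("ParseInt error:", err)

	// Working on the bytes directly succeeds and yields 0.
	v, err := bytesToUint64(raw)
	fmt.Println(v, err)
}
```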

@Hipska
Contributor

Hipska commented Oct 15, 2024

Actually, you would just need the hextoint conversion, which already does this correctly.

Could you confirm?

@llamafilm
Contributor Author

llamafilm commented Oct 15, 2024

That doesn't work either. With conversion = 'hextoint' I get this:

E! [inputs.snmp] Error in plugin: agent 10.37.155.80: converting "\x00" (OID .1.3.6.1.4.1.12612.220.11.2.2.4.3.0) for field lampStatus: invalid conversion type "hextoint"

I also tried conversion = 'hextoint:BigEndian:uint16' and it crashes:

% telegraf --test --config ~/source/mse-netboxdata/barco-debug.conf
2024-10-15T10:50:43Z I! Loading config: /Users/ebalsley/source/mse-netboxdata/barco-debug.conf
2024-10-15T10:50:43Z I! Starting Telegraf 1.32.0 brought to you by InfluxData the makers of InfluxDB
2024-10-15T10:50:43Z I! Available plugins: 235 inputs, 9 aggregators, 32 processors, 26 parsers, 62 outputs, 5 secret-stores
2024-10-15T10:50:43Z I! Loaded inputs: snmp
2024-10-15T10:50:43Z I! Loaded aggregators:
2024-10-15T10:50:43Z I! Loaded processors:
2024-10-15T10:50:43Z I! Loaded secretstores:
2024-10-15T10:50:43Z W! Outputs are not used in testing mode!
2024-10-15T10:50:43Z I! Tags enabled: host=Elliott-M2-4.local
panic: runtime error: index out of range [1] with length 1

goroutine 14 [running]:
encoding/binary.bigEndian.Uint16(...)
	encoding/binary/binary.go:146
github.com/influxdata/telegraf/internal/snmp.(*Field).Convert(0x14003bc4770, {{0x10b0b26a0, 0x14003bc37e8}, {0x140046df770, 0x23}, 0x4})
	github.com/influxdata/telegraf/internal/snmp/field.go:264 +0x1d7c
github.com/influxdata/telegraf/internal/snmp.Table.Build({{0x10900a23f, 0x4}, {0x0, 0x0, 0x0}, 0x0, {0x14002a9cee0, 0x1, 0x1}, {0x0, ...}, ...}, ...)
	github.com/influxdata/telegraf/internal/snmp/table.go:175 +0x4c8
github.com/influxdata/telegraf/plugins/inputs/snmp.(*Snmp).gatherTable(0x140019e2780, {0x10c756200, 0x14001153820}, {0x10c713c98, 0x14003bc66e0}, {{0x10900a23f, 0x4}, {0x0, 0x0, 0x0}, ...}, ...)
	github.com/influxdata/telegraf/plugins/inputs/snmp/snmp.go:134 +0xa0
github.com/influxdata/telegraf/plugins/inputs/snmp.(*Snmp).Gather.func1(0x0?, {0x14002ac02b9, 0xc})
	github.com/influxdata/telegraf/plugins/inputs/snmp/snmp.go:116 +0x148
created by github.com/influxdata/telegraf/plugins/inputs/snmp.(*Snmp).Gather in goroutine 13
	github.com/influxdata/telegraf/plugins/inputs/snmp/snmp.go:102 +0x68
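
For context on the trace: binary.BigEndian.Uint16 reads two bytes, so it panics when handed the one-byte value this device returns. The sketch below shows a length check that turns the panic into an error. safeUint16 is a hypothetical name used for illustration, and this is not necessarily how Telegraf handles the case.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// safeUint16 is a hypothetical wrapper: it rejects inputs shorter
// than two bytes instead of letting binary.BigEndian.Uint16 panic.
func safeUint16(b []byte) (uint16, error) {
	if len(b) < 2 {
		return 0, fmt.Errorf("need 2 bytes for uint16, got %d", len(b))
	}
	return binary.BigEndian.Uint16(b), nil
}

func main() {
	// The one-byte value from the device now yields an error, not a panic.
	_, err := safeUint16([]byte{0x00})
	fmt.Println(err)

	// A two-byte value decodes normally: 0x0102 = 258.
	v, _ := safeUint16([]byte{0x01, 0x02})
	fmt.Println(v)
}
```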

@Hipska
Contributor

Hipska commented Oct 15, 2024

Could you have a try with the build artifacts from #16027?

@llamafilm
Contributor Author

Thank you! With that build hextoint:BigEndian:uint16 works.

@Hipska
Contributor

Hipska commented Oct 18, 2024

Did you also test with a device responding with something other than 0x00? I think you will need to select LittleEndian 😉

@llamafilm
Contributor Author

Ah! Thanks, you're right.

@llamafilm
Contributor Author

Sorry, I spoke too soon. These results still don't make much sense to me.

With the lamp turned on I get this. The string is "on" and the number is 7.

snmpget -v2c -c public 10.91.77.132 BARCO-ME-DCP-MIB::lampStatus.0
BARCO-ME-DCP-MIB::lampStatus.0 = BITS: 01 on(7) 

With no conversion Telegraf returns lampStatus="".
With hextoint:LittleEndian:uint16 it returns lampStatus=1u. Why is it not number 7?

Now with the lamp turned off:

snmpget -v2c -c public 10.91.77.132 BARCO-ME-DCP-MIB::lampStatus.0
BARCO-ME-DCP-MIB::lampStatus.0 = BITS: 80 unused(0) 

With no conversion Telegraf returns lampStatus="80".
With hextoint:LittleEndian:uint16 it returns lampStatus=128u.
With hextoint:BigEndian:uint16 it returns lampStatus=32768u.
With enum it returns lampStatus="unused". This is correct but a string is harder to work with.
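
The two hextoint results are consistent with the single byte 0x80 being extended with a trailing zero byte before decoding. That padding behaviour is inferred from the numbers above and is an assumption, not something taken from the Telegraf source. padRight is a hypothetical helper.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// padRight is a hypothetical helper. It returns a copy of b extended
// with trailing zero bytes to length n (inputs longer than n are
// truncated). Whether Telegraf pads this way is an assumption.
func padRight(b []byte, n int) []byte {
	out := make([]byte, n)
	copy(out, b)
	return out
}

func main() {
	buf := padRight([]byte{0x80}, 2) // [0x80 0x00]

	fmt.Println(binary.LittleEndian.Uint16(buf)) // 0x0080 = 128
	fmt.Println(binary.BigEndian.Uint16(buf))    // 0x8000 = 32768
}
```

The same padding applied to 0x01 gives 1 in little-endian order, matching the 1u seen with the lamp on.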

@Hipska
Contributor

Hipska commented Oct 21, 2024

Yeah, your device is sending 0x8000 (could you confirm that with conversion hex, please?), which correctly gets converted to 32768. I can't explain the 1u vs 7u situation. It would help to know what the device is actually sending, since the net-snmp tools also translate the values by default.

@llamafilm
Contributor Author

I'm not sure where you got the 2 extra zeroes; it looks to me like the device is sending 0x80. With conversion hex the results are like this. (I'm testing two different projectors; the first is off and the second is on.)

> snmp,host=Elliott-M2-4.local,source=10.91.77.132 lampStatus="80" 1729551902000000000
> snmp,host=Elliott-M2-4.local,source=10.37.156.50 lampStatus="01" 1729551902000000000

Wireshark also shows the same.
[image: Wireshark capture]

@Hipska
Contributor

Hipska commented Oct 23, 2024

Oh, now I get what this OID is returning. It is counting the bits backwards:

| bits     | hex  | translation | explanation      |
|----------|------|-------------|------------------|
| 10000000 | 0x80 | unused(0)   | bit 0 is enabled |
| 00000001 | 0x01 | on(7)       | bit 7 is enabled |
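
The numbering in this table follows the SNMP BITS convention, where bit 0 is the most significant bit of the first octet. A minimal sketch of a decoder under that convention follows. setBits is a hypothetical helper for illustration, not a Telegraf function.

```go
package main

import "fmt"

// setBits returns the positions of all set bits, numbering from the
// most significant bit of the first byte (bit 0) onwards.
func setBits(b []byte) []int {
	var out []int
	for i, octet := range b {
		for j := 0; j < 8; j++ {
			if octet&(byte(0x80)>>uint(j)) != 0 {
				out = append(out, i*8+j)
			}
		}
	}
	return out
}

func main() {
	fmt.Println(setBits([]byte{0x80})) // bit 0, i.e. unused(0)
	fmt.Println(setBits([]byte{0x01})) // bit 7, i.e. on(7)
	fmt.Println(setBits([]byte{0x00})) // no bits set
}
```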

Could you try again with conversion = 'displayhint'? You tried this at the beginning, but that device responded with 0x00, so could you test what you get when the device responds with 0x80 or 0x01?

If that result is still not helpful, the best option is to use the hex conversion combined with a processors.enum mapping to map 80 and 01 to the values of your choice.

@llamafilm
Contributor Author

llamafilm commented Oct 25, 2024

Wow, what a weird system! Now it makes sense, thank you.

With lamp on, displayhint returns lampStatus="01[on(7)]".

Using hex conversion plus processors.enum works for me. But perhaps a better solution would be to add a new conversion type called bits which would convert like this.

| source   | output |
|----------|--------|
| 10000000 | 0      |
| 01000000 | 1      |
| 00100000 | 2      |
| 00010000 | 3      |
| 00001000 | 4      |
| 00000100 | 5      |
| 00000010 | 6      |
| 00000001 | 7      |
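
For a single octet with exactly one bit set, the mapping in this table amounts to counting the leading zero bits. Here is a sketch of that idea. bitIndex is a hypothetical name, and returning -1 when zero or several bits are set is an illustrative choice, not part of the proposal.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitIndex maps a byte with exactly one bit set to its SNMP BITS
// position (most significant bit = 0). It returns -1 otherwise.
func bitIndex(b byte) int {
	if bits.OnesCount8(b) != 1 {
		return -1
	}
	return bits.LeadingZeros8(b)
}

func main() {
	// Walk the same inputs as the table, from 10000000 down to 00000001.
	for _, b := range []byte{0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01} {
		fmt.Printf("%08b %d\n", b, bitIndex(b))
	}
}
```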

In the same vein as my other FR for TruthValue... I was thinking it would be nice for Telegraf to have a conversion matching all possible SNMP data types. Not necessary though.

Edit: Thinking about this more, I think it's actually a vendor bug for this device to be sending a value of 00, because in the SNMP BITS convention, bit 0 (unused) should be sent as 80. I'm using two different models of projector, both from the same vendor; one sends 00 and the other sends 80 to mean the same thing.

@llamafilm
Contributor Author

I have another example that hurt my brain for 10 minutes while I figured it out. From the MIB:

        -- 1.3.6.1.4.1.12612.220.11.2.2.4.10
        lampErrorStatus OBJECT-TYPE
			SYNTAX BITS
			    {
                unused(0),
			    errorLampOffByProjector(4),
			    errorLampNoStrike(5),
			    errorDowserNotOpen(6),
			    lampOK(7)
				}
			MAX-ACCESS	read-only
			STATUS	current
			DESCRIPTION
				"This is the lamp activity status."
			::= { lampProperties 10 }

So I set conversion to hex and the processor like this:

  [[processors.enum.mapping]]
    field = "lampErrorStatus"
    default = -1
    [processors.enum.mapping.value_mappings]
      00 = 0 # unused
      01 = 7 # lampOK
      02 = 6 # errorDowserNotOpen
      04 = 5 # errorLampNoStrike
      08 = 4 # errorLampOffByProjector

If I were writing to influx, I could just use conversion enum with no processor. But I use prometheus which doesn't like strings.

@Hipska
Contributor

Hipska commented Oct 25, 2024

I have the same problem using Graphite.

I think you need to quote your hex values for it to work? And you need to add this to be complete: "80" = 0
