Unable to connect to a new BMC #62
Even though the documentation states:
I suspect that this is a true "invalid padding" error because:
Thanks for digging into this! Failing presence pong and channel auth capabilities is fairly fundamental. Both happen before session establishment, and so before any kind of encryption. It should be possible to do a direct packet diff between The
Note the bit before - "care should be taken when displaying this": the library returns the granular error, but ideally it should be generalised before being bubbled up to anyone/anything else. Frankly, IPMI security is poor enough that this isn't worth worrying about.
I realized that ipmitool can't do DCMI ping/pong discovery on these nodes, either:
# old BMC, works
$ ipmitool -N2 -R1 -I lanplus -L user -U "${user}" -E -H oldbmc -C 17 dcmi oob_discover -v
Running Get PICMG Properties my_addr 0x20, transit 0, target 0x20
Error response 0xc1 from Get PICMG Properities
Running Get VSO Capabilities my_addr 0x20, transit 0, target 0x20
Invalid completion code received: Invalid command
Discovered IPMB address 0x0
Received IPMI/RMCP response packet: IPMI Supported
# new BMC, fails
$ ipmitool -N2 -R1 -I lanplus -L user -U "${user}" -E -H newbmc -C 17 dcmi oob_discover -v
Running Get PICMG Properties my_addr 0x20, transit 0, target 0x20
Error response 0xc1 from Get PICMG Properities
Running Get VSO Capabilities my_addr 0x20, transit 0, target 0x20
Invalid completion code received: Invalid command
Discovered IPMB address 0x0
So, that was a red herring. We don't need ping/pong to work for our exporter.
I hacked through cmd/describe/ to see which commands work and which don't.
When I comment out everything in cmd/describe/ that doesn't work on the new BMC and compare the resulting output between the old and new BMCs, I notice some differences. Here is the sanitized diff:
--- oldbmc.txt 2024-01-03 08:53:38.319268000 -0800
+++ newbmc.txt 2024-01-03 08:53:45.588590000 -0800
@@ -1 +1 @@
-2024/01/03 08:53:35 connected to 192.168.XX.YY:623 over IPMI v2.0
+2024/01/03 08:53:45 connected to 192.168.XX.ZZ:623 over IPMI v2.0
@@ -7 +7 @@
- Per-message auth: true
+ Per-message auth: false
@@ -10 +10 @@
- Null usernames: true
+ Null usernames: false
@@ -14,4 +14,4 @@
- ID: XX
- Revision: XX
- Manufacturer: XX
- Product: XX
+ ID: YY
+ Revision: YY
+ Manufacturer: YY
+ Product: YY
@@ -20 +20 @@
- Firmware (aux): XX
+ Firmware (aux): YY
@@ -25 +25 @@
- Identification: 0(Off)
+ Identification: 3(Unknown)
@@ -30 +30 @@
-2024/01/03 08:53:38 failed to get DCMI sensor info: read udp 192.168.AA.BB:44792->192.168.XX.YY:623: i/o timeout
+DCMI Sensors:
I wonder if that "per-message auth" field is relevant?
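For context on where those two differing fields come from: as I read the IPMI spec, they are bits in the "authentication status" byte of the Get Channel Authentication Capabilities response. Here is a rough sketch of the decoding. The `AuthStatus` type and `decodeAuthStatus` are hypothetical names rather than this library's API, and the two input bytes in `main` are made-up values chosen to mirror the diff, not captured from either BMC.

```go
package main

import "fmt"

// AuthStatus is a hypothetical view of the authentication status byte in a
// Get Channel Authentication Capabilities response.
type AuthStatus struct {
	PerMessageAuth   bool // bit 4 clear means per-message auth is enabled
	UserLevelAuth    bool // bit 3 clear means user-level auth is enabled
	NonNullUsernames bool // bit 2 set
	NullUsernames    bool // bit 1 set
	AnonymousLogin   bool // bit 0 set
}

// decodeAuthStatus unpacks the flags. Note the inverted sense of bits 4 and 3:
// a set bit means the feature is disabled.
func decodeAuthStatus(b byte) AuthStatus {
	return AuthStatus{
		PerMessageAuth:   b&(1<<4) == 0,
		UserLevelAuth:    b&(1<<3) == 0,
		NonNullUsernames: b&(1<<2) != 0,
		NullUsernames:    b&(1<<1) != 0,
		AnonymousLogin:   b&(1<<0) != 0,
	}
}

func main() {
	// Illustrative bytes only (assumptions, not real captures).
	fmt.Printf("old-like: %+v\n", decodeAuthStatus(0x06))
	fmt.Printf("new-like: %+v\n", decodeAuthStatus(0x14))
}
```

This assumes that "true" in the describe output means "enabled", which is my reading rather than something confirmed in this thread.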
Thank you for these pointers! The issue turned out to be threefold:
I'd be happy to work on a PR for any or all of these changes. |
Nice work! A PR would be fantastic.
Apologies for the delay. That makes sense, and it is UTF-8. Separately, I ran into an issue where the BMC returns a malformed SDR that causes |
For the |
That sounds great.
Thanks for pointing this out! I was wrong when I said that the error occurred when decoding "the rest" of the SDR. The decoding error happens here. But, if I change |
Strange - the Get SDR response should always contain the next RecordID, even if the requested SDR is too long for the BMC to send in one go. Can I confirm the BMC is returning a normal completion code (
It sounds like we don't have much choice but to do a two-part read, either always, or in response to this kind of non-compliant behaviour. As much as I dislike the idea of doing it all the time and sending twice as many commands to read an SDR Repo, I think it is the most pragmatic solution.
This should not necessitate implementing Reserve SDR Repository; however, it may make sense to add it at the same time, especially if your BMC's SDRs are long enough to require it.
Good question. I hadn't noticed this before, but it's returning
The SDRs are not long enough to require partial reads, but the reservation ID is needed if we skip the header when requesting the rest of the SDR (i.e., if we set |
Assuming you have a support contract, it would be worth asking the vendor what they are expecting, as we are sending a valid request by the v2.0 spec. Does IPMItool give the same if forced to do v1.5?
That's valid; my thinking was that it's cheaper to re-read the first 5 bytes than to do two more RTTs for the reservation commands!
That makes sense. I reached out to the vendor.
I haven't been able to get a v1.5 command to work. I tried
I suspect something similar to #56 is happening - I'm trying to connect to a new BMC and am unable to connect using this library (but I can connect with ipmitool). I suspect they started using some less common feature of IPMI which is not yet supported by this library.
The symptom is timeouts, unfortunately.
This works fine:
ipmitool -N2 -R1 -I lanplus -L user -U "${USER}" -P "${PASSWORD}" -H myhost -C 17 sdr list
My custom Prometheus exporter times out trying to download the SDR repo:
The sequence of calls is similar to cmd/describe/: bmc.DialV2(), transport.NewV2Session(), and then bmc.RetrieveSDRRepository, which times out.