APC Back-UPS BX1600MI spurious LOWBATT/REPLACEBATT events #2347
Comments
I have a new APC Back-UPS BX1200MI (basically the same model, just with a smaller battery) connected to TrueNAS that is using NUT 2.8.0, and there is some weird behavior. Sometimes, 1-2 times an hour, it triggers LOWBATT/REPLACEBATT events. In a debug log, I can see "[D2] parse_status: [OL DISCHRG CHRG LB RB]". When it works and the battery is charged to 100%, it seems to be switching between "OL" and "OL CHRG" every few seconds. I am not sure if that is a problem too, or just normal behavior, though. Here is a debug log from the driver at the time of an event:
|
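For readers who want to check their own debug logs for the same pattern, here is a minimal Python sketch that splits a status string like the one quoted above into tokens and flags pairs that look contradictory. The function names and the choice of "contradictory" pairs are my own assumptions for illustration; this is not how the NUT driver itself evaluates status.

```python
def parse_status(status: str) -> set[str]:
    """Split a space-separated status string into a set of tokens."""
    return set(status.split())


def find_contradictions(tokens: set[str]) -> list[str]:
    """Return notes for token pairs assumed here to be mutually exclusive.

    The pairs below are an assumption made for this sketch, not a NUT rule.
    """
    notes = []
    if {"CHRG", "DISCHRG"} <= tokens:
        notes.append("CHRG and DISCHRG reported together")
    if {"OL", "OB"} <= tokens:
        notes.append("OL and OB reported together")
    return notes


if __name__ == "__main__":
    # Status string taken from the debug log line quoted in the comment.
    tokens = parse_status("OL DISCHRG CHRG LB RB")
    for note in find_contradictions(tokens):
        print(note)
```

Running this over every `parse_status` line in a log should show whether the odd LB/RB reports always coincide with the CHRG/DISCHRG combination.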
CC @desertwitch : I think some of your earlier investigations were about similar behavior; any ideas here? Notably, in NUT master (after release 2.8.1) we had PR #2216 to address such notifications with new configuration options. From research and educated guesswork there, this may be just part of battery charge management which happens under the hood on all UPS devices that became visibly exposed on some firmwares. |
It seems to be driver related, something that the UPS is sending byte-wise that's being parsed wrongly into
I couldn't find an issue in the upsmon handling or the later processes. I've since had another report from a user with an APC BX1200MI-GR also experiencing intermediate |
Got another report of an affected Back-UPS BX750MI with spurious RB events. |
Same thing for me, I've the same problem with BX1200MI-GR |
Unfortunately my contact didn't come through with the UPS so we're back to square one. |
I am seeing the same behavior on a newly purchased BX1600MI-GR, manufactured October 2023, on Debian 12's nut 2.8.0-7. @desertwitch let me know if I can help by collecting logs or trying any patches |
Hello there! I can confirm this behavior is also happening with that version. The pattern seems to be random. It sometimes lasts a few seconds or a few minutes. |
Any chance you could try if the problem persists putting this flag in your UPS configuration in
So an example configuration would look like this:
Not sure if it will help, or whether it will break UPS recognition altogether, but it's worth an attempt since we don't have a UPS at hand. @jimklimov : Any ideas from the logs provided above? It's a relatively popular entry-level line of APC, so it would be lovely if we can get this addressed one way or another; I'm sure many people would be appreciative. |
The only thing that comes to mind is to add yet another throttle for such reports, so as not to expose the status if RB appears and dissipates quickly (I'd be wary of tweaking LB like that... maybe optionally-throttle tied to known OL/OB/BYPASS status?) |
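One possible reading of that idea, as a rough Python sketch: RB is always a candidate for throttling, while LB is only a candidate when the user opts in and the UPS simultaneously claims to be on line power. The function name, the `throttle_lb` switch, and the exact rule are assumptions of mine for illustration, not anything that exists in NUT; BYPASS is left out because I am not sure how it should be treated.

```python
def throttle_candidates(tokens: set[str], throttle_lb: bool = False) -> set[str]:
    """Return the status tokens a throttle might hold back before publishing.

    Hypothetical policy: RB is always eligible; LB only when explicitly
    enabled and the UPS reports OL without OB at the same time.
    """
    candidates = set()
    if "RB" in tokens:
        candidates.add("RB")
    if throttle_lb and "LB" in tokens and "OL" in tokens and "OB" not in tokens:
        candidates.add("LB")
    return candidates


if __name__ == "__main__":
    print(throttle_candidates({"OL", "LB", "RB"}))
    print(throttle_candidates({"OL", "LB", "RB"}, throttle_lb=True))
    print(throttle_candidates({"OB", "LB"}, throttle_lb=True))
```

The point of keeping LB untouched while on battery is that a real low-battery condition during an outage should never be delayed.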
@desertwitch I just tried |
Thanks for the attempt, it was a long shot. Is there any logic to the statuses, as in do you notice a rapid succession or cycling through of certain statuses (LB, RB, LB, OL or OL, LB, RB, OL) in the logs? In general, do you have any logs you can provide us from NUT itself with timestamps so we can try to investigate this some more? Particularly interesting would be if the LB (low battery state) is happening at the same time the UPS is also OL or if the UPS goes OB before/after.
I'm curious whether this is something else entirely that the driver misinterprets, or whether the UPS is actually sending these bogus statuses. It probably doesn't happen with APC's own software (if there is one for that series), or there would be more complaints on the APC forums. But since both APCUPSD and NUT are affected in a similar fashion, it does look like a firmware change compared to other functioning APC devices, or even older models of the BX series which seem to work (firmware bug in newer models?). I agree that an LB throttle on its own is less than ideal unless tied to another status or succession of states (if we can figure out any logic behind what's happening here). |
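If it helps anyone answer the questions above from their own logs, here is a small Python sketch that takes timestamped status samples and reports each stretch where LB was set, how long it lasted, and whether OL was present the whole time. The `(timestamp, status)` input format and all names are assumptions for illustration; you would need to extract such pairs from your own syslog or driver debug output first.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    start: float   # timestamp of the first sample carrying LB
    end: float     # timestamp of the first sample without LB (or last sample seen)
    on_line: bool  # True if every LB sample in the episode also carried OL


def lb_episodes(samples: list[tuple[float, str]]) -> list[Episode]:
    """Group consecutive samples containing LB into episodes."""
    episodes = []
    start = None
    on_line = True
    last_ts = None
    for ts, status in samples:
        tokens = set(status.split())
        if "LB" in tokens:
            if start is None:
                start, on_line = ts, True
            on_line = on_line and "OL" in tokens
        elif start is not None:
            episodes.append(Episode(start, ts, on_line))
            start = None
        last_ts = ts
    if start is not None:
        # Log ended while LB was still set; close the episode at the last sample.
        episodes.append(Episode(start, last_ts, on_line))
    return episodes


if __name__ == "__main__":
    # Made-up samples, only to show the shape of the input.
    demo = [(0.0, "OL"), (2.0, "OL DISCHRG CHRG LB RB"), (4.0, "OL")]
    for ep in lb_episodes(demo):
        print(f"LB for {ep.end - ep.start:.1f}s, on line throughout: {ep.on_line}")
```

Short episodes with `on_line` true would point at the spurious case; episodes that start after an OB sample would look like a genuine low battery.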
The general pattern is that it will usually be in a normal status like
Afterwards it goes back to a normal status. I've been running a patched |
I just got home and installed NUT through the unraid apps. I'd be happy to provide logs but not sure how. |
This is very valuable, thanks, so we can also see here that the UPS is doing some kind of - perhaps - calibration before these The other user's log above has shown a status of |
I don't know whether it is of any use, but I did not change anything (kept running git HEAD of nut) and the above strange behaviour stopped after a few days. It has been behaving normally for weeks now. Thanks, |
Just set NUT up normally through the GUI as you want to have it and watch the SYSLOG (Tools->System Log) for any such strange events being reported while NUT is started, they should pop up in the log by themselves if they occur on your system.
That is extremely curious, makes me wonder even more what the UPS is doing there - thanks though! |
Correct - I only get "is low" and "needs to be replaced". As a sanity check, I just tried pulling the plug for a few seconds and this is what I see in that case:
Thanks for taking the time to look into this :) |
Ah got it. The system log does show some low battery indications as seen below. I also pulled the plug and that all seems to be working fine. UPS switches to battery, I get an unraid notification, and it goes back online when plugged back in.
|
Thanks for the logs everyone - just to double-check back here, has anyone gotten a shutdown from this problem? |
My unraid was running fine until I hooked it up to the UPS yesterday evening. This morning it was off, and it started a parity check after turning it on even though nothing was scheduled. I can't find anything that logged the reason for the shutdown though. That was also using the built-in UPS utility, so I'll keep running NUT and see if I get any other unexpected shutdowns. |
So I just checked the system log again and found a whole bunch of errors. Don't know if it's related. I left all configuration options on the default values except for setting the shutdown rule to 25% battery left. I'm also running powertop --auto-tune in case that's relevant.
|
Sorry I haven't been able to get much further on this issue either, as there's no apparent logic to these conditions. |
As the original reporter of this issue, I did go through this with APC support and they are not prepared to accept any software reports unless they are from their own Powerchute software on Windows. As these spurious events do not happen there, they closed my issue as not reproducible. So in short, APC do not accept that there is any kind of bug to fix, I'm afraid. Strangely enough, as mentioned in an earlier comment, the behaviour seems to have stopped for me by itself. It's been months and it hasn't happened. I have no idea why, as all I did was change to what was at the time the HEAD of main. |
I am getting exactly the same on my Proxmox server with apcupsd and an APC Back-UPS BX950. |
Is anyone else also getting lots of self-test switch messages from apcupsd? I have a BX750MI. |
…etworkupstools#2347] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…ay_sec et al [networkupstools#2347] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
Hello all, thank you for all the reports and logs. I've posted a prospective PR to address the situation by delaying LB/RB status propagation on impacted devices (user-configurable) and so hiding it from |
…o setting the lbrb_log_delay_without_calibrating flag [networkupstools#2347] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
For anyone here on Unraid, I have rolled out a testing package with the latest plugin update today. Please read the instructions shown on the "NUT Settings" page on how to switch to the testing package and if possible do report back if the problems are resolved for you (or not) - thanks a lot for your support. 😉 |
I’m currently on 6.12.11 and have been running nut-2.8.2-x86_64-2master.ssl11 for about two days. Brand new Back-UPS BX1200MI, still getting the replacement battery error every day, but the event count seems to be reduced. |
…r spurious LOWBATT/REPLACEBATT events on APC BXnnnnMI devices [networkupstools#2347] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
As suggested in the PR discussion, try setting
flags in your |
… tweaks since 2023)" [networkupstools#2347] Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
…actly 0/1 [networkupstools#2347] When building a complex text expression, we rely on maths in some spots. Signed-off-by: Jim Klimov <jimklimov+nut@gmail.com>
If it wasn't obvious by that last comment and you've arrived here from a search engine looking for a quick fix: Set these flags in your
So the |
Is this problem finally solved? I've just bought a BX1200MI and connected it to an Unraid server, and I'm receiving the RB notifications twice a day. |
To the best of my knowledge - yes. But it is not part of a NUT release yet (2.8.3 expected soonish) so probably not packaged in official repos. I think @desertwitch made some testing builds back then. |
Yes, we've had good results with the recent master builds and the new configuration toggles introduced. For Unraid, this seemed to help the majority of users with this problem: https://forums.unraid.net/topic/60217-plugin-nut-v2-network-ups-tools/page/70/#findComment-1494093
|
I have an APC Back-UPS BX1600MI connected by its supplied USB cable. With the version of nut in Debian 10 (2.7.4-8) this was unusable as the usbhid-ups driver kept disconnecting every few seconds.
I upgraded to the git HEAD of nut and communication is now stable, but spurious events come in every so often (once or twice an hour at the moment). When I call a notify script that calls upsc at the time of those spurious events, I can see that the ups.status does bear that out, but other values do not. Example:
As you can see, the status does include LB and RB, but the charge is still 100 and the runtime is as expected. The bad status lasts less than 2 seconds before returning to simply OL.
I went to the effort of installing a Windows VM, passing USB through to it and trying APC's own Powerchute Serial Shutdown software in there. This does not report any spurious events. Both Powerchute and nut do report true events that I induce. A self-test of the device passed. I am unable to demonstrate any behaviour that APC consider incorrect.
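The mismatch described earlier between ups.status and the battery readings can be checked mechanically. Below is a minimal Python sketch that parses `upsc`-style `name: value` output and flags an LB that does not agree with the reported charge. The helper names are hypothetical, the sample text is made up (only the charge of 100 comes from my observation, and the threshold of 10 is an assumed value), and the rule "LB is implausible when battery.charge is above battery.charge.low" is my own simplification.

```python
def parse_upsc_output(text: str) -> dict[str, str]:
    """Parse lines of the form 'name: value' into a dictionary."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            values[key.strip()] = value.strip()
    return values


def lb_is_implausible(values: dict[str, str]) -> bool:
    """True if LB is set while the charge is above the low-charge threshold.

    Returns False when LB is absent or the needed readings are missing,
    since in that case there is nothing to contradict.
    """
    if "LB" not in set(values.get("ups.status", "").split()):
        return False
    try:
        charge = float(values["battery.charge"])
        charge_low = float(values["battery.charge.low"])
    except (KeyError, ValueError):
        return False
    return charge > charge_low


if __name__ == "__main__":
    # Illustrative sample only; not the output captured from my UPS.
    sample = "battery.charge: 100\nbattery.charge.low: 10\nups.status: OL LB RB\n"
    print(lb_is_implausible(parse_upsc_output(sample)))
```

A notify script could log the result of such a check alongside each event to separate genuine low-battery conditions from the spurious ones.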
I returned the UPS to the vendor as faulty and they sent a replacement. The replacement behaves the same.
I installed apcupsd just to see how it behaved. It maintained a connection but its rate of spurious events was even worse: every couple of minutes. Another apcupsd user reports the same symptoms as me: https://sourceforge.net/p/apcupsd/mailman/message/58740970/
Interestingly, they had a BX1600MI working with apcupsd, replaced it with another BX1600MI and now they see what I see, implying that newer models of Back-UPS have something different about them even though they are the same model number.
Is there any way to work around this sort of thing? It's almost like I need a way to not believe such statuses unless they persist for at least 5 seconds, or something.