Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raspberry PI 3B+ and NUT (Network UPS Tools) #2977

Open
Zaitchikk opened this issue May 18, 2019 · 35 comments
Open

Raspberry PI 3B+ and NUT (Network UPS Tools) #2977

Zaitchikk opened this issue May 18, 2019 · 35 comments
Labels
Waiting for external input Waiting for a comment from the originator of the issue, or a collaborator.

Comments

@Zaitchikk
Copy link

Zaitchikk commented May 18, 2019

Describe the bug
I have a Powercom SPD-850U UPS. I try to connect this UPS with my Raspberry PI 3B + via NUT,
I came to the conclusion - all works nice with Raspbian (2018-03-13), the first one that supports 3B +, but does not work with the latest version of Raspbian (2019-04-08). I think, the algorithm for working with the USB bus has changed in the latest versions of the kernel. And this is bad, because it breaks down the work of previously debugged programs.

To reproduce
I use the following NUT driver settings.
This works on both the Debian desktop and Raspbian(2018-03-13) with this UPS.But does not work on Raspbian (2019-04-08)

sudo nano /etc/nut/ups.conf
[spider]
driver = usbhid-ups
port = auto
desc = "Poaercom SPD-850U"

Expected behaviour
The NUT driver start command must start the NUT driver

$ sudo upsdrvctl start
Network UPS Tools - UPS driver controller 2.7.4
Network UPS Tools - Generic HID driver 0.41 (2.7.4)
USB communication driver 0.33
Using subdriver: PowerCOM HID 0.5

Actual behaviour
Attempt to launch NUT driver

sudo upsdrvctl start
Network UPS Tools - UPS driver controller 2.7.4
Network UPS Tools - Generic HID driver 0.41 (2.7.4)
USB communication driver 0.33
No matching HID UPS found
Driver failed to start (exit status=1)

System

Raspberry PI3B+

$ cat /etc/rpi-issue
Raspberry Pi reference 2019-04-08
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, e910bf55ac92e27c6e77043e8995bcb5e0a2d742, stage5

$ vcgencmd version
Mar 27 2019 15:45:53
Copyright (c) 2012 Broadcom
version 2e98b31d18547962e564bdf88e57b3df7085c29b (clean) (release) (start)

$ uname -a
Linux raspberrypi 4.14.98-v7+ #1200 SMP Tue Feb 12 20:27:48 GMT 2019 armv7l GNU/Linux

Logs

$ dmesg
[ 937.439522] usb 1-1.1.2: new low-speed USB device number 6 using dwc_otg
[ 937.600259] usb 1-1.1.2: New USB device found, idVendor=0d9f, idProduct=0004
[ 937.600275] usb 1-1.1.2: New USB device strings: Mfr=3, Product=1, SerialNumber=2
[ 937.600284] usb 1-1.1.2: Product: HID UPS Battery
[ 937.600292] usb 1-1.1.2: Manufacturer: POWERCOM Co.,LTD
[ 937.600301] usb 1-1.1.2: SerialNumber: 004-0D9F-000
[ 937.808194] hid-generic 0003:0D9F:0004.0002: hiddev96,hidraw0: USB HID v1.00 Device [POWERCOM Co.,LTD HID UPS Battery] on usb-3f980000.usb-1.1.2/input0

$ sudo /lib/nut/usbhid-ups -DD -a spider
Network UPS Tools - Generic HID driver 0.41 (2.7.4)
USB communication driver 0.33
0.000000 debug level is '2'
0.002877 upsdrv_initups...
0.004120 Checking device (0D9F/0004) (001/006)
0.006031 - VendorID: 0d9f
0.006103 - ProductID: 0004
0.006176 - Manufacturer: unknown
0.006273 - Product: unknown
0.006366 - Serial Number: unknown
0.006461 - Bus: 001
0.006557 - Device release number: 0002
0.006650 Trying to match device
0.006765 Device matches
0.007697 Unable to get HID descriptor (error sending control message: Broken pipe)
0.007879 HID descriptor length 996
0.008555 Unable to get Report descriptor: Broken pipe

@Zaitchikk
Copy link
Author

Hey anyone! Can I hope that the problem will ever be fixed?

@ghost
Copy link

ghost commented Jun 25, 2019

Please contact whoever maintains NUT. They are responsible for this working or not in the first instance. If they come back and say it is a kernel bug then feel free to file a bug here, but as things stand this is not a kernel bug so this is not the correct place to report it.

@Zaitchikk
Copy link
Author

but as things stand this is not a kernel

If in the earlier version of Linux for Raspberry, the program worked normally with USB ports, and after updating this operating system, the program stops working, most likely, there have been some changes in the kernel of the operating system. And if we also take into account that the program works fine on Linux for X86 / 64 platforms, we can conclude that the Linux kernel for Raspberry does not work with USB ports in a standard way.

@6by9
Copy link
Contributor

6by9 commented Jul 1, 2019

It's difficult to debug without the hardware.

The suspicion would be that it is something that has changed between the 4.9 and 4.14 kernels. It'd be interesting to know if it worked against something like Ubuntu 18.10 or 19.04 on an x86 PC which will be using a similar kernel (actually 4.18 and 5.0 respectively IIRC)

You can use rpi-update to download particular releases using hashes from https://github.com/Hexxeh/rpi-firmware/commits/master sudo rpi-update 5c80565 should get the last 4.9 release.
PLEASE DO NOT DO THIS ON A CRITICAL SYSTEM.
sudo apt install --reinstall raspberrypi-kernel raspberrypi-bootloader should get you back to apt's version of the kernel and firmware.

@Zaitchikk
Copy link
Author

It's difficult to debug without the hardware.

You can check on any other UPS that is supported by the driver "usbhid-ups".

It'd be interesting to know if it worked against something like Ubuntu 18.10 or 19.04 on an x86 PC

I tested this UPS on the "Raspberry Pi Desktop". The system worked under "VMware® Workstation 15 Player 15.1.0 build-13591040" Everything works.
Here is the version of that system.

$ uname -a
Linux raspberry 4.9.0-9-amd64 #1 SMP Debian 4.9.168-1+deb9u2 (2019-05-13) x86_64 GNU/Linux

@Zaitchikk
Copy link
Author

It'd be interesting to know if it worked against something like Ubuntu 18.10 or 19.04 on an x86 PC

Now I tested on Debian 10 under the same Workstation 15 Player. Everything works great.

$ uname -a
Linux v 4.19.0-5-amd64 #1 SMP Debian 4.19.37-5 (2019-06-19) x86_64 GNU/Linux

@clepple
Copy link

clepple commented Aug 7, 2019

Hi, one of the NUT developers here. @Zaitchikk can confirm, but the base version of NUT in the two versions of Raspbian and Debian 10 is the same (2.7.4). As I understand it, the other differences between the working version and the non-working version of Raspbian are the kernel version, and potentially libusb-0.1, which NUT 2.7.4 uses to talk to this particular UPS. What about the versions of libusb? (dpkg -l libusb-0.1\*) If not libusb, it is likely a kernel regression.

@pelwell
Copy link
Contributor

pelwell commented Aug 8, 2019

Hi, @clepple - good to have you here. From the usbhid-ups debug output above it seems like the tool is failing quite early. Could you (or anybody) run it under strace on working and non-working systems so we can see the system calls that lead to the error?

# On the old, working system
$ sudo strace -o strace_good.txt /lib/nut/usbhid-ups -DD -a spider
# And on the new, broken system
$ sudo strace -o strace_bad.txt /lib/nut/usbhid-ups -DD -a spider

Feel free to change any of the usbhid-ups parameters as you see fit, as long as they are the same in both cases.

@Zaitchikk
Copy link
Author

What about the versions of libusb?

libusb-0.1-4:a 2:0.1.12-30 armhf userspace USB programming library

@Zaitchikk
Copy link
Author

And on the new, broken system
$ sudo strace -o strace_bad.txt /lib/nut/usbhid-ups -DD -a spider

strace_bad.txt

I cannot quickly send debug output for the working system (Raspbian (2018-03-13)), because I will have to install it. But I will do it now, and will send a little later.

@pelwell
Copy link
Contributor

pelwell commented Aug 8, 2019

Thanks. There are a lot of EACCESS and EPERM errors in the trace - can you just confirm that you ran it with sudo?

@Zaitchikk
Copy link
Author

can you just confirm that you ran it with sudo?

Yes

By the way, I can quickly send debug output to the working desktop Debian 10 (under VMware® Workstation 15 Player). I do not know if this is necessary.

@pelwell
Copy link
Contributor

pelwell commented Aug 8, 2019

Yes please - the two systems ought to look about the same at this level.

@Zaitchikk
Copy link
Author

Here for the desktop Debian10
strace_good.txt

@pelwell
Copy link
Contributor

pelwell commented Aug 8, 2019

Thanks again - the output is very different, but I can see that the syscalls that return EPIPE on the broken system:

write(2, "Checking device (0D9F/0004) (001"..., 38) = 38
open("/dev/bus/usb/001/004", O_RDWR)    = 4
ioctl(4, USBDEVFS_CONTROL, 0x7e9807c4)  = -1 EPIPE (Broken pipe)
ioctl(4, USBDEVFS_CONTROL, 0x7e9807c4)  = -1 EPIPE (Broken pipe)
ioctl(4, USBDEVFS_CONTROL, 0x7e9807c4)  = -1 EPIPE (Broken pipe)

are succeeding on the Debian 10 system:

write(2, "Checking device (0D9F/0004) (002"..., 38) = 38
openat(AT_FDCWD, "/dev/bus/usb/002/004", O_RDWR) = 4
ioctl(4, USBDEVFS_CONTROL, 0x7ffc26aaaa50) = 6
ioctl(4, USBDEVFS_CONTROL, 0x7ffc26aaaa50) = 34
ioctl(4, USBDEVFS_CONTROL, 0x7ffc26aaaa50) = 6

Did you try downgrading to a 4.9 kernel as @6by9 suggested:

You can use rpi-update to download particular releases using hashes from https://github.com/Hexxeh/rpi-firmware/commits/master sudo rpi-update 5c80565 should get the last 4.9 release.
PLEASE DO NOT DO THIS ON A CRITICAL SYSTEM.
sudo apt install --reinstall raspberrypi-kernel raspberrypi-bootloader should get you back to apt's version of the kernel and firmware.

@pelwell
Copy link
Contributor

pelwell commented Aug 8, 2019

A safer thing to try would be to use the upstream dwc2 driver instead of the heavily optimised dwc_otg driver that is the default on Raspberry Pi kernels. Simply add dtoverlay=dwc2 to config.txt and reboot. USB performance will be reduced, so this isn't a long-term fix, but it might help to pinpoint the cause.

@Zaitchikk
Copy link
Author

Simply add dtoverlay=dwc2 to config.txt and reboot.

Yes it works !! I suspected there was a way to get 3B + to work as before. I think it should somehow better describe this method in the documentation. The problem could be solved a couple of months ago.

I bought Raspberry in order to get some SNMP card functionality for the UPS. USB speed is not important to me at all.

And by the way ... if it's still interesting. I promised. Here is the debug output for a working system (Raspbian (2018-03-13)).
strace_good.txt

@pelwell
Copy link
Contributor

pelwell commented Aug 8, 2019

That's an interesting result, as it narrows it down to a driver regression.

If I could ask for one more thing, could you do the same strace capture using the dwc2 driver? strace_dwc2.txt would be a good name for the output file.

@Zaitchikk
Copy link
Author

OK!
Here it is
strace_dwc2.txt

@pelwell
Copy link
Contributor

pelwell commented Aug 8, 2019

Thank you very much - that's a big help. It shows that dwc2 today behaves the same as the dwc_otg used to, which should guide our USB expert when he looks into the problem.

@P33M
Copy link
Contributor

P33M commented Aug 12, 2019

Can you test rpi-update firmwares in sequence to find the first failing one?

Use hashes from this repository: https://github.com/Hexxeh/rpi-firmware/commits/master

And do sudo rpi-update <hash> to install a specific version.

@Zaitchikk
Copy link
Author

Zaitchikk commented Aug 20, 2019

Can you test rpi-update firmwares in sequence to find the first failing one?

I checked only official releases downloaded from http://downloads.raspberrypi.org/raspbian/images/
Raspbian (2018-03-13) worked.
uname -a
Linux raspberrypi 4.9.80-v7+ #1098 SMP Fri Mar 9 19:11:42 GMT 2018 armv7l GNU/Linux
The next version - Raspbian (2018-04-18) no longer worked.

I tried your way. And this version is no longer working
uname -a
Linux raspberrypi 4.14.27-v7+ #1100 SMP Fri Mar 16 13:51:48 GMT 2018 armv7l GNU/Linux

It occurred to me that I jumped over the version. I set the previous one. And it doesn’t work either.
uname -a
Linux raspberrypi 4.14.26-v7+ #1099 SMP Wed Mar 14 14:59:28 GMT 2018 armv7l GNU/Linux

@Zaitchikk
Copy link
Author

Simply add dtoverlay=dwc2 to config.txt and reboot.

Not for long I was happy. It turned out that with this setting, the NUT driver, although it starts up, falls off periodically.
After that, the debug output is as follows:
strace_bad_dwc2.txt
It is treated only by rebooting Raspberry.

@P33M
Copy link
Contributor

P33M commented Aug 21, 2019

That's irritating. The breakage corresponds to a kernel major version update and not anything to do with a dwc_otg driver change.

Using the dwc_otg driver, can you try capturing the USB "traffic" in both cases using with usbmon/wireshark, then upload the traces somewhere?
https://discussions.flightaware.com/t/monitoring-usb-utilization-on-raspberry-pi/16421

Instead of filtering on address/endpoint, just dump the whole output from usbmon1 until after the transmission breaks (or starts normally on 4.9). Note: don't enter any passwords via keyboard when wireshark is capturing as these will appear in the packets. You may also want to disconnect the network cable while capture is ongoing to reduce the amount of spam created.

@Zaitchikk
Copy link
Author

https://discussions.flightaware.com/t/monitoring-usb-utilization-on-raspberry-pi/16421

I cannot repeat this instruction.
It seems to me this line of script is not working.

$ sudo mount -t debugfs none_debugs /sys/kernel/debug
mount: none_debugs is already mounted or /sys/kernel/debug busy

At least I don't see any of the instructions in the wireshark window

@P33M
Copy link
Contributor

P33M commented Aug 21, 2019

It appears that debugfs is already mounted so you can skip that step.

You may need to run wireshark as root (from a terminal on the desktop) in order to see the usbmon interfaces.

@Zaitchikk
Copy link
Author

Zaitchikk commented Aug 21, 2019

to run wireshark as root

Yes, it did.

Now explain in more detail what conditions should be.
You want me to start debugging in the current configuration (Raspbian 2019-07-10 with the native dwc_otg driver), at the time the NUT driver starts.
Did I understand correctly?

@P33M
Copy link
Contributor

P33M commented Aug 21, 2019

Yes, capture with the default driver configuration. Ideally, start the capture with the device unplugged, then plug the device in, then start NUT. You can stop the capture after the communication breaks in the newer version of kernel (4.14.26) and after communication starts successfully in the older version of kernel (4.9.80).

@Zaitchikk
Copy link
Author

Here is the first attempt. I rebooted Raspberry with the UPS turned off, then did as you wrote. The file contains the result. If you got what you need, let me know. I will repeat the process on 4.14.26
4.9.80_successful_start.pcapng.zip

@Zaitchikk
Copy link
Author

I tried to install an older system, designated as "firmware: Rebuild from correct tree" dated March 10, 2018. Its version:

$ uname -a
Linux raspberrypi 4.14.24-v7 + # 1097 SMP Mon Mar 5 16:42:05 GMT 2018 armv7l GNU / Linux

And the NUT driver did not start either. Could there be a problem in "rebuild ... ing"?
Here is the wireshark debug output:
4.14.24_Failed_start.pcapng.zip

@Zaitchikk
Copy link
Author

Zaitchikk commented Sep 30, 2019

As I said, NUT start with the dwc2 driver does not work stably.

I was have a little dance with a tambourine around the fire, and I found one new and very strange circumstance. Perhaps it will prompt developers to solve this annoying problem.

I was able to run NUT with the default driver (dwc_otg) with my Powercom SPD-850U UPS.
The sequence is as follows:

  1. Run Raspberry with a UPS connected via USB. It is necessary to start exactly “with a UPS connected via USB”, because if we connect the UPS after starting Raspberry, the magic will not work.
    After this procedure, all checks show that the NUT driver did not start, and cannot be started manually.
  2. Take out the usb connector of the UPS and connect it to another Raspberry USB port. It is necessary to connect it to another USB port, otherwise the magic will not work.
    If you want to connect it to the same USB port, you must wait more than 10 seconds, (But this does not always work.)
  3. Run the NUT driver start command.
        sudo upsdrvctl start
    After that NUT works, but I'm not sure which is stable.

The following sequence is also possible:

  1. Start Raspberry with the UPS disconnected on USB.
  2. Connect the UPS via USB. We can see that the NUT driver did not start and does not start manually.
  3. Reconnect the UPS to the OTHER USB port. The NUT driver still cannot be started manually.
  4. Execute commands
      sudo / home / pi / hub-ctrl -h 1 -P 2 -p 0
      sudo / home / pi / hub-ctrl -h 1 -P 2 -p 1

Source here
https://github.com/codazoda/hub-ctrl.c
https://github.com/codazoda/hub-ctrl.c/wiki/Raspberry-Pi
5. After that, the NUT driver is started manually by the command
    sudo upsdrvctl start
After that NUT works, but I'm not sure which is stable.

After these procedures, the NUT can be reconnected to any port. The driver picks it up automatically ... BUT only until the next restart of Ruspberry.

Kernel version

$ uname -a
Linux raspberrypi 4.19.66-v7+ #1253 SMP Thu Aug 15 11:49:46 BST 2019 armv7l GNU/Linux

@P33M
Copy link
Contributor

P33M commented Jan 14, 2020

Whoops, forgot about this one - what with Pi4 release and all.

The traces you've provided are mostly identical up until the point the host requests a string descriptor for index 0x0. In the working case, a 6-byte string descriptor is returned (two LANGIDs), in the broken case you get -EPIPE (protocol error). All subsequent transfers are broken in the same manner.

Can you try with latest rpi-update firmware? It bumps kernel to 4.19 and also has a few dwc_otg fixes for small transfers such as these.

@JamesH65
Copy link
Contributor

@Zaitchikk Did you have chance to try with the latest firmware? I suspect that apt update apt full-upgrade would be sufficient, rather than using rpi-update.

@JamesH65 JamesH65 added Waiting for internal comment Waiting for comment from a member of the Raspberry Pi engineering team Waiting for external input Waiting for a comment from the originator of the issue, or a collaborator. and removed Waiting for internal comment Waiting for comment from a member of the Raspberry Pi engineering team labels Feb 13, 2020
@Zaitchikk
Copy link
Author

I tried "apt full-upgrade" now. No effect. Only those magic dances with a tambourine, about which I wrote above, allow me to launch NUT.

@hugazo
Copy link

hugazo commented Jul 9, 2021

Sorry the late bump, but i'm having exactly this issue and since i have a raspberry pi 4, i don't know if the kernel reversal solution could work for me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Waiting for external input Waiting for a comment from the originator of the issue, or a collaborator.
Projects
None yet
Development

No branches or pull requests

7 participants