Netgear A8000 fails after approx 4 hours of uptime #381
Comments
Good report. I have 2 adapters based on the same chipset and am not seeing this. After pondering the issue for most of a day, I think what I would do is find another AP/wifi router to test with. If you don't see the problem on an alternate AP, then maybe it is time to take a hard look at the settings in your current AP. Start changing things one at a time with testing in between. Cheers, |
Thanks @morrownr I feel that even if something in the AP triggers the issue, there's definitely something getting stuck on the OS side too, given that it can't recover from the issue, the kernel module can't be removed, and even unplugging and plugging back in doesn't help. I'll probably upgrade my AP sometime this year so I'll check back in then. In the meantime I've worked around the issue by setting a cron job to run every 3 hours to restart the device as above (stop, rmmod, modprobe, start). It's not ideal to have 5-10 seconds of network outage every 3 hours, but better than needing a full reboot! Appreciate your time and suggestions. |
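The periodic restart described in the comment above (stop, rmmod, modprobe, start) could be sketched roughly as follows. This is a minimal sketch, not the poster's actual script: the module name mt7921u and the NetworkManager service come from this thread, while the `run` helper and the `DRY_RUN` switch are hypothetical additions so the sequence can be previewed without root.

```shell
#!/bin/sh
# Sketch of the periodic workaround: stop the network service, reload the
# driver, start the service again. Needs root unless DRY_RUN=1.
# MODULE and SERVICE default to the names used in this thread.
MODULE="${MODULE:-mt7921u}"
SERVICE="${SERVICE:-NetworkManager}"

# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
# (Hypothetical helper, added only so the sequence can be previewed.)
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# restart_wifi: the four steps in order; stops at the first failure.
restart_wifi() {
    run systemctl stop "$SERVICE" &&
    run rmmod "$MODULE" &&
    run modprobe "$MODULE" &&
    run systemctl start "$SERVICE"
}

# Example: preview the commands without touching the system.
# DRY_RUN=1 restart_wifi
```

To use it from cron, the script would need a final `restart_wifi` line and an entry in root's crontab such as `0 */3 * * * /usr/local/sbin/restart-wifi.sh` (the path is only an example). As reported elsewhere in this thread, the reload only works if it runs before the failure has occurred.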
I saw new firmware flow into linux-wireless last week so it should be posted for download sometime this week or next. They don't post fixes with the firmware so I don't know if it will help. Is your AP one that is supported by OpenWRT? Tell me what brand and model it is and I will check. If it is supported, OpenWRT could be a solution to this problem. |
Thanks for the heads-up. I've applied the patch from the mailing list and turned off the workaround, so I should know in a few hours. driver: mt7921u It's a Netgear Nighthawk AXE3000 WiFi 6E USB 3.0 Adapter, Item No A8000-100PAS, in Australia - I can't see it in the OpenWRT table of hardware. |
I guess I was not clear or maybe I am confused. Your adapter will work with OpenWRT because OpenWRT has a driver for it but that was not what I was trying to say. It seemed to me from what you said that your AP/router might need to be upgraded so I thought OpenWRT might help...as long as it is a supported AP/router. |
Oh sorry I read AP as adapter, my fault. The AP is an Amplifi HD with an AmpliFi MeshPoint HD extending the range. |
Sadly the updated firmware has not fixed the issue for me. |
FYI: I noticed the 6.6 kernel is available in the Debian update system now. I upgraded to 6.6 from 6.5 last night with my Debian 12 installation. Don't know if that has anything to do with this problem. I looked at OpenWRT and I don't see support for your router.
If it were me with this problem, I would be in the web interface of the router checking the settings. I can't tell you what I would be looking for but I would be looking and researching the settings that are available. I'd probably also temporarily disable the AmpliFi MeshPoint HD extending the range to see if that has anything to do with it. WiFi is a cool thing but it is complicated and there are little incompatibilities here and there and sometimes we just have to find a configuration that works around the problems. I really don't think your adapter is the problem but I could be wrong. Also, is the firmware in your router at the most current level? I've been helping people at this site for a few years. I've seen thousands of problem reports. In the last few years as technology has changed with updates including WPA3 and WiFi 6/6E/7, there have been growing pains for all operating systems. It has been a massive effort to get to where we are. I'm hoping that the bugs and problems stabilize over the next 2 years. For Linux, wireless support and especially USB adapter support has improved greatly over the last 5 years and will improve even more over the next year as more and more adapters are supported with good quality in-kernel drivers. Anyway, more than you wanted to know. |
No joy with the new kernel, same issue: I'll probably leave it at that for awhile, given that I have a reasonable (if dirty) workaround of restarting the kernel module every 3 hours. I agree that changing the AP settings or firmware level might work around the problem, but given the Windows PC next to me doesn't have the same issue with the same adapter and the same AP, I think the root cause is most likely on the Linux kernel side. If this issue pops up for other people and is worth looking at deeper, then I'm happy to collect crash dumps or run debug-enabled modules as required. Thanks for your effort. I've spent many years dealing with paid support much less responsive and helpful than you! |
Certainly may be. I'll keep an eye out. There are others that stop by here that have this adapter. Cheers and thanks for the kind words. |
I am experiencing similar behavior. I have the A8000 installed on a headless Raspberry Pi 4b running the latest Pi OS (Bookworm). This has the 6.6 kernel: My Pi stays up for variable periods of time but eventually becomes incommunicado with errors like these: My configuration has wlan0 (the internal NIC) configured as an AP. When the problem occurs, both network interfaces fail. The journalctl indicates that the Pi stays up (cron entries appear), but there is no way to access the machine. |
Hi @mousseq While I do not have an A8000, I do have 2 USB wifi adapters with the same chipset that use the same driver and firmware. I also run my Pi4B headless.
What do I do when I burn a clean SD card? The first thing I do is turn off the internal wifi of my Pi4B. I don't think twice about it, I just do it. It is a quality of driver issue. Broadcom... that is all I need to say. I don't see anything in your log that tells me the A8000 and its driver have anything to do with this problem. My Pi4B is also in AP mode. It serves up WiFi 6 on the 5 GHz band. It is ultra dependable. I can explain more about my setup if you wish. I may not be doing exactly what you are doing but you are welcome to ask. |
@morrownr , thanks for the response. My experience is a bit more complex. I have several Pi4s configured the same way, only with differing external NICs (Brostrend, Comfast, Netgear A6210). Only this one shows signs of instability. Frankly, I've had extensive success with this configuration. I recently installed another A8000 on a Pi4 running Ubuntu 23.10. It has been stable over the period that the Pi in question has restarted several times. Do you know if the Ubuntu driver is different from the Pi OS driver? - for either the internal or external NICs? |
The version of the driver, mt7921u, is based on the kernel version. If your Ubuntu installation is running kernel 6.5 and PiOS is running kernel 6.1, they have different versions of mt7921u. There is also the issue of firmware version. The firmware for the mt7921au has been updated several times over the last 2 years, which is historically high for firmware. The MediaTek devs work hard on this but we have gotten to the point that wifi is very complicated in modern WiFi 6 and 7 drivers. Things seem to have slowed down lately but knowing how to update your firmware is a good idea. To check on the driver info: $ ethtool -i wlan0 Notice how the driver version is the kernel version. Also note the version/date on the firmware. Modern in-kernel standards compliant Linux wifi drivers consist of multiple files: one or more driver files and one or more firmware files. The firmware is not part of the kernel, it is part of the distro and is easily upgraded by users or the distro's maintainers. To upgrade your firmware: See section 3: When I work on my AP guide, I try out various ways of handling the networking. I always work toward the most stable setup possible. There are a lot of ways to set up an AP. How stable any particular setup is depends a lot on how well maintained the various components are. The networking may trigger what appears to be a bug in a driver and drivers are capable of taking down a system. Like I said, I will not use the Pi internal wifi as I do not trust the driver and I have little tolerance for unstable setups. I also use systemd-networkd for networking. It is a rock... I mean 24/7/365 solid. You can see my setup in the AP guide I have on the Main Menu here. My suggestions:
|
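The `ethtool -i` check mentioned in the comment above can be narrowed down to the fields that matter here. A minimal sketch, assuming the usual `driver:`, `version:` and `firmware-version:` field names in the `ethtool -i` output; the function name `driver_summary` is made up for illustration.

```shell
# driver_summary: read `ethtool -i <iface>` output on stdin and keep only
# the driver, version and firmware-version lines.
driver_summary() {
    grep -E '^(driver|version|firmware-version):'
}

# Example (use the interface name of your own adapter):
# ethtool -i wlan0 | driver_summary
# uname -r    # for in-kernel drivers the version line should match this
```

Comparing the output on two machines is a quick way to see whether they run the same driver and firmware combination.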
This is very useful. My Pi4 running Pi OS has this stack:
ethtool -i wlan1
driver: mt7921u
The Pi4 running Ubuntu 23.10 has this stack: Neither one has the later stack you are running. I looked at the link to Section 3.2. My reading is that I should follow the instructions of the first mt7921u section. I tried this on the Ubuntu installation and A8000 was not recognized. I removed the new firmware and rebooted. The A8000 is still not recognized. |
Look in the directory where you copied the files to. If you see a .bin version and a .nz (compressed) version of the files you copied, delete the compressed files. |
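The check suggested above can be done by listing compressed files that sit next to an uncompressed .bin of the same name. A minimal sketch: the `.xz` and `.zst` extensions are an assumption (the comment only says "compressed"), and the function name is made up. It only lists candidates and deletes nothing.

```shell
# list_compressed_duplicates DIR: for every *.bin in DIR, print any
# compressed sibling (*.bin.xz or *.bin.zst) found beside it.
# The two extensions are assumptions; adjust them to what your distro uses.
list_compressed_duplicates() {
    dir="$1"
    for bin in "$dir"/*.bin; do
        [ -e "$bin" ] || continue
        for ext in xz zst; do
            if [ -e "$bin.$ext" ]; then
                echo "$bin.$ext"
            fi
        done
    done
    return 0
}

# Example: review the list first, then remove the files by hand.
# list_compressed_duplicates /lib/firmware/mediatek
```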
The Pi descended into a pretty random state. I wound up burning the SD card with a new Ubuntu image, updating it, and then adding the patched firmware. This worked. Thanks for all your help. |
COVID for computers.
You are welcome. I hope this is a stable setup that works well with the A8000. |
Sadly, the new firmware did not change the system behavior. After a day or so, the machine falls into the incommunicado state. I enabled tracing in NetworkManager and managed to capture the transition in the attached journalctl transcript. Unfortunately, the log does not indicate why the link drops (around line 3389). Curiously, there is no mention of wlan0 (the internal NIC configured as the AP/hotspot). |
That doesn't really help with what is causing the problem. Have you been able to test using a different AP/WiFi router? |
Please pardon the hiatus. I was away ... I have now tried several NICs (Netgear A8000, A6210, Comfast AX1800, and Panda AC1200) on a different router. All (repeat all) of these fail in the same way. The Pi stays up for some period of time and then the external NIC ceases to respond. I note that the internal NIC (configured as an access point) remains up. I can ssh in via the internal NIC. I can see that wlan1 exists but I cannot change its state. Neither iwlist nor nmcli succeed. If I restart the wpa_supplicant service, all networks stop and they do not restart. I'm beginning to suspect NetworkManager. |
The problem is likely due to the Pi's buggy USB implementation. I think you're right about Network Manager. |
Thanks for the input. The external NICs are directly connected (no cable, no hub). The OS is the latest Pi Bookworm. I am about to try Ubuntu just to see if the problem exists there as well. |
I found a reference to a scatter/gather problem associated with the NetGear NICs. When I disabled scatter/gather, the NIC became stable and the system remains up. https://github.com/morrownr/7612u/blob/main/mt76_usb.conf. |
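For anyone wanting to check the scatter/gather setting, here is a minimal sketch. The sysfs path is the one used in this thread; the assumption is that the linked mt76_usb.conf holds a modprobe option of the form printed by `modprobe_option` below. Both function names are made up for illustration.

```shell
# Path of the module parameter used in this thread.
SG_PARAM="${SG_PARAM:-/sys/module/mt76_usb/parameters/disable_usb_sg}"

# sg_status: report whether scatter-gather is currently disabled.
# Boolean module parameters normally read back as Y or N.
sg_status() {
    if [ ! -r "$SG_PARAM" ]; then
        echo "unknown (mt76_usb not loaded?)"
        return 0
    fi
    case "$(cat "$SG_PARAM")" in
        Y|1) echo "scatter-gather disabled" ;;
        *)   echo "scatter-gather enabled" ;;
    esac
}

# modprobe_option: print the line that would go in a file such as
# /etc/modprobe.d/mt76_usb.conf to keep the setting across reboots.
modprobe_option() {
    echo "options mt76_usb disable_usb_sg=1"
}

# Example:
# sg_status
# modprobe_option | sudo tee /etc/modprobe.d/mt76_usb.conf
```

Note that `sudo echo 1 > file` does not work because the redirection runs as the normal user; `echo 1 | sudo tee "$SG_PARAM"` avoids that. The driver probably only reads the value when the device is initialised, so replugging the adapter or reloading the module after changing it is likely needed.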
Glad you were able to fix your issue. I mentioned the scatter gather workaround in my original description. Definitely a different root cause then. |
I was going to report that the problem resurfaced after several days. However, that problem turned out to be an issue with wayvnc. The network connections remain stable. |
I just reread this thread. I have been watching for similar issues and fixes but am not having much luck. It is very possible that your problems are not with the mt7921u driver. It takes more than the wifi driver to make these things go. Have you tried the adapter in a USB2 port? If it stabilizes in a USB2 port, that could indicate a problem with the USB3 hardware driver or with the USB3 hardware. What hardware are you running? Do you know the USB3 chip? |
My previous declaration of solved was premature. The scatter/gather patch fixes the problem for the 6210 NIC but not the 8000 NIC. With the 8000 in place, the network stack ceases to work after about a day. |
The infamous VL805 USB 3.0 controller. RasPi has made many bad hardware selections over the years and this is one of their worst. Try sticking the A8000 into one of the usb2 ports. It won't be as fast but if it works and stays up, that will give us an idea of where the problem is. |
I swapped the NIC to a USB 2 port. This made no difference (the communications dropped after a day). That suggests (but is not conclusive) that the problem is not with the USB controller. |
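When repeating the USB2 port test, it may be worth confirming that the adapter really negotiated USB 2.0 speed. A minimal sketch that reads the standard sysfs attributes `idVendor`, `idProduct` and `speed`; the function name and the optional third argument are hypothetical, and 0846:9060 is the A8000 ID reported elsewhere in this thread.

```shell
# usb_speed VENDOR PRODUCT [SYSFS_ROOT]: print the negotiated speed in
# Mbit/s of the first USB device with the given IDs
# (480 = USB 2.0 high speed, 5000 or more = USB 3.x).
usb_speed() {
    vendor="$1"
    product="$2"
    root="${3:-/sys/bus/usb/devices}"
    for dev in "$root"/*; do
        [ -r "$dev/idVendor" ] || continue
        if [ "$(cat "$dev/idVendor")" = "$vendor" ] &&
           [ "$(cat "$dev/idProduct")" = "$product" ]; then
            cat "$dev/speed"
            return 0
        fi
    done
    return 1
}

# Example:
# usb_speed 0846 9060
```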
What you say is spot on. This type of problem can be hard to solve and there are a lot of things that are involved. It could be a setting or bug in your AP/router. It could be a bug in the driver or in the distro. One of the key things that I address in the README of the out-of-kernel driver repos here is to set the AP/router so that each ssid has a different name so you can control which band you are connected to. If you are using a dfs channel on 5 GHz, that could be an issue if a conflicting radio signal happens and the AP/router does not handle it well and many do not. Actually there are a lot of AP/router settings that could contribute to a problem like this. Could it be problems with power saving settings in the bios of your computer? Yes. Does it happen when you are using the system or do you only see this after the system has gone down into a power saving mode? |
I bought an A8000 (0846:9060), upgraded my kernel to 6.5 (from Debian bookworm-backports x86-64), and was happy that it worked immediately and transferred data faster than my old USB Wi-Fi.
However, the wireless connection consistently fails after about 4 hours, sometimes closer to 5 hours. It has never failed at less than 4 hours and it has never stayed up for longer than 5 hours. Connectivity drops completely and the device can no longer even scan for networks. The failure is associated with this message in dmesg (note: these are separate events on separate boots of the OS)
In all cases it seems my only option is to reboot at that point.
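To collect these failures across boots, the kernel log can be filtered for the deauthentication message. A minimal sketch, assuming the message format quoted in this report (`deauthenticated from ... (Reason: ...)`); the function name `deauth_events` is made up.

```shell
# deauth_events: read kernel log text on stdin and print, for each
# "deauthenticated from" line, the timestamp in whole seconds followed by
# the reason code and name.
deauth_events() {
    sed -n 's/^.*\[ *\([0-9]*\)\.[0-9]*\].*deauthenticated from .*(Reason: \(.*\))$/\1 \2/p'
}

# Example:
# sudo dmesg | deauth_events
```

The first column is seconds since boot, so it can be compared directly with the time the interface last came up.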
What I've tried
Upgrading the device firmware.
It was originally:
After adding the new firmware to /lib/firmware/mediatek and rebooting:
Still no change in behaviour. It fails after 4 to 5 hours with the "deauthenticated" message in dmesg.
Restarting the device
If I do this BEFORE the issue has occurred, it is successful and seems to reset the time until it fails:
(initial messages from 11886 to 11909 triggered by the above commands)
Then 14738 seconds later, the network device fails as usual with the same message!
[26647.892679] wlx9418655ec8xx: deauthenticated from fa:9f:c2:d4:52:xx (Reason: 2=PREV_AUTH_NOT_VALID)
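The interval between the restart (last message at 11909) and the failure (26647) is easier to read as hours and minutes. A minimal sketch of the conversion; `secs_to_hms` is a made-up helper name.

```shell
# secs_to_hms SECONDS: format a whole number of seconds as HH:MM:SS.
secs_to_hms() {
    s="$1"
    printf '%02d:%02d:%02d\n' $((s / 3600)) $((s % 3600 / 60)) $((s % 60))
}

# Example: time between the restart and the failure.
# secs_to_hms $((26647 - 11909))    # 14738 seconds
```

For 14738 seconds this gives 04:05:38, which matches the "about 4 hours" pattern described in this report.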
If I try to run
sudo systemctl stop NetworkManager
or
sudo rmmod mt7921u
AFTER the issue has occurred, it doesn't work and the console keeps repeating:
kernel:[76150.619079] unregister_netdevice: waiting for wlx9418655ec8xx to become free. Usage count = 2
No difference with rmmod -f. No difference with different orders of this and of unplugging and plugging in the device.
Scatter-gather
I have tried
echo 1 > /sys/module/mt76_usb/parameters/disable_usb_sg
with no effect.
5GHz vs 2GHz
My AP (Unifi) supports 2GHz and 5GHz. I get the same behaviour whether I use the "floating" SSID (which could be 2GHz or 5GHz) or if I choose the specific 2GHz or 5GHz-specific SSID.
Any tips on what to try next?