MT7981 5GHz occasionally cannot disconnect clients that have left and causes bad performance. #922
Comments
Is the second device also connected to the network? |
The devices on the list are on 2.4GHz. |
Can you list your Wi-Fi clients (device models)? |
I can't, because this device is running as an AP in a restaurant, serving both the administrative and the customers' Wi-Fi. |
Looks like Qualcomm QCA9377 + Windows 10 driver + 5GHz can cause this. No problems on the 2.4GHz band. |
Do you have driver 10.0.0.1272 for Windows installed? |
I don't understand. Is a 5GHz Wi-Fi adapter with QCA9377 causing the bad performance on the 5GHz network? I don't have a QCA9377 on the network, and the router is MediaTek. |
How can you be sure?
|
The clients only use smartphones. |
If a QCA9377 can affect a 5GHz AP on mt76+mt7915 (mt7981), then maybe some other clients can do the same. I don't own a QCA9377; I just helped a user isolate the problem on an OpenWrt 23.05.5 mt7981 device. @nbd168 what do you think about this? |
One thing you could try is copy the latest MT7981 firmware from https://github.com/openwrt/mt76/tree/master/firmware to your device. If that doesn't help, trying a recent snapshot might also be a good idea. |
Already done this, it didn't help.
That user didn't want to experiment with a snapshot. Connecting the QCA9377 to the 2.4GHz AP solved the issue with the 5GHz AP for him. |
I'd say there are too few details for us to help you. |
OpenWrt 23.05.5, H3C Magic NX30 Pro. Same issue here; I have encountered it several times. Speed over 5GHz Wi-Fi drops to almost zero (1kb/s): enough for DHCP, but anything else breaks, even ping. I noticed that when this happens, there are 2 dead clients (which may have left Wi-Fi range at the same time) on the LuCI Wi-Fi page, with RX Rate / TX Rate 6.0 Mbit/s, 20 MHz. If I manually click the "Disconnect" button, the Wi-Fi works again immediately. |
More info: when I check the log, it keeps showing AP-STA-POLL-OK for the two offline clients. This started when they went out of Wi-Fi range and continued until I clicked the LuCI "Disconnect" button. P.S. OFFLINE:MAC:1 and OFFLINE:MAC:2 are the clients that went away.
Wed Nov 20 19:33:33 2024 daemon.notice hostapd: phy1-ap0: AP-STA-POLL-OK **OFFLINE:MAC:1**
Wed Nov 20 19:35:31 2024 daemon.notice hostapd: phy1-ap0: AP-STA-POLL-OK **OFFLINE:MAC:2**
Wed Nov 20 19:38:44 2024 daemon.notice hostapd: phy1-ap0: AP-STA-POLL-OK **OFFLINE:MAC:1**
Wed Nov 20 19:40:51 2024 daemon.notice hostapd: phy1-ap0: AP-STA-POLL-OK **OFFLINE:MAC:2**
Wed Nov 20 19:44:03 2024 daemon.notice hostapd: phy1-ap0: AP-STA-POLL-OK **OFFLINE:MAC:1**
...
Wed Nov 20 20:06:42 2024 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED **OFFLINE:MAC:1**
Wed Nov 20 20:06:44 2024 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED **OFFLINE:MAC:2**
Wed Nov 20 20:06:47 2024 daemon.info hostapd: phy1-ap0: STA **OFFLINE:MAC:1** IEEE 802.11: deauthenticated due to local deauth request
Wed Nov 20 20:06:49 2024 daemon.info hostapd: phy1-ap0: STA **OFFLINE:MAC:2** IEEE 802.11: deauthenticated due to local deauth request
When I restarted the 5GHz Wi-Fi a few minutes later, another suspicious log entry appeared:
Wed Nov 20 20:13:06 2024 kern.warn kernel: [2135649.716364] Ignoring NSS change in VHT Operating Mode Notification from **OFFLINE:MAC:1** with invalid nss 2
Wed Nov 20 20:13:06 2024 kern.info kernel: [2143605.339316] device phy1-ap0 left promiscuous mode
Wed Nov 20 20:13:06 2024 kern.info kernel: [2143605.354371] br-lan: port 5(phy1-ap0) entered disabled state
Wed Nov 20 20:13:07 2024 daemon.notice wpa_supplicant[1538]: Set new config for phy phy1
Wed Nov 20 20:13:07 2024 daemon.notice hostapd: Set new config for phy phy1: /var/run/hostapd-phy1.conf
Wed Nov 20 20:13:07 2024 daemon.notice hostapd: Reload config for bss 'phy1-ap0' on phy 'phy1'
Wed Nov 20 20:13:07 2024 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED **AN:ONLINE:CLIENT:MAC:1**
Wed Nov 20 20:13:08 2024 daemon.notice hostapd: Reloaded settings for phy phy1
Wed Nov 20 20:13:08 2024 daemon.notice netifd: Wireless device 'radio1' is now up
Wed Nov 20 20:13:08 2024 daemon.notice netifd: Network device 'phy1-ap0' link is up
Wed Nov 20 20:13:08 2024 kern.info kernel: [2143607.148600] br-lan: port 5(phy1-ap0) entered blocking state
Wed Nov 20 20:13:08 2024 kern.info kernel: [2143607.154384] br-lan: port 5(phy1-ap0) entered disabled state
Wed Nov 20 20:13:08 2024 kern.info kernel: [2143607.160337] device phy1-ap0 entered promiscuous mode
Wed Nov 20 20:13:08 2024 kern.info kernel: [2143607.165646] br-lan: port 5(phy1-ap0) entered blocking state
Wed Nov 20 20:13:08 2024 kern.info kernel: [2143607.171424] br-lan: port 5(phy1-ap0) entered forwarding state
Wed Nov 20 20:13:09 2024 daemon.info dnsmasq[1]: read /etc/hosts - 12 names
Wed Nov 20 20:13:09 2024 daemon.info dnsmasq[1]: read /tmp/hosts/dhcp.cfg01411c - 4 names
Wed Nov 20 20:13:09 2024 daemon.info dnsmasq-dhcp[1]: read /etc/ethers - 0 addresses
...
Wireless config:
cat /etc/config/wireless
config wifi-device 'radio0'
	option type 'mac80211'
	option path 'platform/18000000.wifi'
	option channel '1'
	option band '2g'
	option htmode 'HT20'
	option country 'CN'
	option cell_density '0'

config wifi-iface 'default_radio0'
	option device 'radio0'
	option network 'lan'
	option mode 'ap'
	option ssid 'ssid1'
	option encryption 'psk2+ccmp'
	option key 'WIFIPASSWD'

config wifi-device 'radio1'
	option type 'mac80211'
	option path 'platform/18000000.wifi+1'
	option channel '149'
	option band '5g'
	option htmode 'HE80'
	option country 'CN'
	option cell_density '0'
	option txpower '27'

config wifi-iface 'default_radio1'
	option device 'radio1'
	option network 'lan'
	option mode 'ap'
	option ssid 'ssid2'
	option encryption 'sae-mixed'
	option key 'WIFIPASSWD'
May be related: |
I reproduced this bug. If a client leaves the Wi-Fi coverage area, there is some probability (about 10%, I guess) that the bug above occurs. It is almost the same as openwrt/openwrt#14415, except that it also causes bad Wi-Fi performance. (In my case it is extremely bad, < 1kb/s. Other clients can still connect, but there is only enough throughput for DHCP to complete; anything else breaks, even ping.) The log keeps showing AP-STA-POLL-OK after the client has left. (p.s. I added ...
Thu Nov 21 09:25:38 2024 daemon.notice hostapd: phy1-ap0: AP-STA-POLL-OK **WENT:AWAY:CLIENT:MAC**
Thu Nov 21 09:26:46 2024 daemon.notice hostapd: phy1-ap0: AP-STA-POLL-OK **WENT:AWAY:CLIENT:MAC**
Thu Nov 21 09:27:56 2024 daemon.notice hostapd: phy1-ap0: AP-STA-POLL-OK **WENT:AWAY:CLIENT:MAC**
Thu Nov 21 09:29:04 2024 daemon.notice hostapd: phy1-ap0: AP-STA-POLL-OK **WENT:AWAY:CLIENT:MAC**
Thu Nov 21 09:30:24 2024 daemon.notice hostapd: phy1-ap0: AP-STA-POLL-OK **WENT:AWAY:CLIENT:MAC**
Thu Nov 21 09:31:33 2024 daemon.notice hostapd: phy1-ap0: AP-STA-POLL-OK **WENT:AWAY:CLIENT:MAC**
Thu Nov 21 09:32:39 2024 daemon.notice hostapd: phy1-ap0: AP-STA-POLL-OK **WENT:AWAY:CLIENT:MAC**
Thu Nov 21 09:33:44 2024 daemon.notice hostapd: phy1-ap0: AP-STA-POLL-OK **WENT:AWAY:CLIENT:MAC**
Thu Nov 21 09:34:51 2024 daemon.notice hostapd: phy1-ap0: AP-STA-POLL-OK **WENT:AWAY:CLIENT:MAC**
...
iw shows the client as still "associated":
iw dev phy1-ap0 station dump
Station **WENT:AWAY:CLIENT:MAC** (on phy1-ap0)
inactive time: 46190 ms
rx bytes: 7315589
rx packets: 52352
tx bytes: 66444699
tx packets: 69473
tx retries: 6987
tx failed: 7033
rx drop misc: 2
signal: -95 [-97, -99] dBm
signal avg: -91 [-93, -95] dBm
tx bitrate: 6.0 MBit/s
tx duration: 83677141 us
rx bitrate: 6.0 MBit/s
rx duration: 4720659 us
last ack signal:-96 dBm
avg ack signal: -95 dBm
airtime weight: 256
authorized: yes
authenticated: yes
associated: yes
preamble: short
WMM/WME: yes
MFP: no
TDLS peer: no
DTIM period: 2
beacon interval:100
short preamble: yes
short slot time:yes
connected time: 8708 seconds
associated at [boottime]: 2183028.795s
associated at: 1732143676976 ms
current time: 1732152384528 ms
p.s. The device above is a smartphone with Snapdragon FastConnect 6800 (however, I do believe other clients can trigger this too). It left Wi-Fi range an hour ago and is kilometers away from the AP. If I manually click the "Disconnect" button in LuCI, the Wi-Fi works again immediately (no restart needed). I'm using the official, unmodified OpenWrt 23.05.5 image; openwrt/openwrt#14415 seems to use an OpenWrt fork. I did not set wed_enable:
cat /sys/module/mt7915e/parameters/wed_enable
N |
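To make stale entries easier to spot in a long station dump like the one above, the output can be condensed to one line per station. This is only a sketch and was not posted in this thread: the function name `station_summary` is made up for illustration, and it assumes the `iw` field labels shown in the dump above (`signal:`, `inactive time:`, `tx failed:`).

```shell
#!/bin/sh
# Sketch only: station_summary is a hypothetical helper, not part of iw or OpenWrt.
# Reads `iw dev <iface> station dump` text on stdin and prints one line per
# station: "<mac> <signal dBm> <inactive ms> <tx failed>".
station_summary() {
    awk '
        # Print the station collected so far, if any
        function emit() { if (mac != "") print mac, sig, idle, failed }
        $1 == "Station"                   { emit(); mac = $2; sig = idle = failed = "?" }
        $1 == "signal:"                   { sig = $2 }      # skips "signal avg:"
        $1 == "inactive" && $2 == "time:" { idle = $3 }
        $1 == "tx" && $2 == "failed:"     { failed = $3 }
        END                               { emit() }
    '
}
```

On the router this would be used as `iw dev phy1-ap0 station dump | station_summary`. A station that has physically left but still shows up with a signal around -95 dBm, as in the dump above, stands out immediately in this form.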
That makes sense, because the router serves public Wi-Fi, with clients entering and leaving the network all the time. |
You can try this patch from mtk |
I don't know how to use this |
Sorry, my router is my main device, so it is hard for me to experiment with it. But I can provide logs if needed. @victor186 I feel this is a common bug affecting all MT7981 devices, but it happens only occasionally and is hard to reproduce and notice. Maybe we could change the title to make it easier for more users to find? "MT7981 5GHz occasionally cannot disconnect clients that have left and causes bad performance." |
Done |
A dirty temporary fix. Tested, works for me; I do not know whether there are any side effects. Run this script every minute via cron. It will "disconnect" all clients that have a very low signal strength (these should be the clients that have already left Wi-Fi coverage but are still buggily listed as "associated").
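A minimal sketch of the approach described above, not the author's original script. Assumptions: the AP interface is `phy1-ap0` as in the logs in this thread, the -90 dBm threshold is an arbitrary illustrative value, and the function names are made up. The deauthentication goes through the `del_client` method of OpenWrt's hostapd ubus object, which is assumed here to be what the LuCI "Disconnect" button does.

```shell
#!/bin/sh
# Sketch only: names and defaults below are illustrative assumptions.
IFACE="${IFACE:-phy1-ap0}"       # AP interface, as seen in the logs above
THRESHOLD="${THRESHOLD:--90}"    # dBm; defaults to -90 when unset

# Read `iw dev <iface> station dump` text on stdin and print the MAC of
# every station whose signal is at or below the threshold given as $1.
weak_stations() {
    awk -v thr="$1" '
        $1 == "Station" { mac = $2 }
        $1 == "signal:" && ($2 + 0) <= (thr + 0) { print mac }
    '
}

# Deauthenticate every weak station on $IFACE through hostapd's ubus API.
kick_weak_stations() {
    iw dev "$IFACE" station dump | weak_stations "$THRESHOLD" |
    while read -r mac; do
        logger -t kick-stale "deauth $mac on $IFACE (signal <= $THRESHOLD dBm)"
        ubus call "hostapd.$IFACE" del_client \
            "{\"addr\":\"$mac\",\"reason\":5,\"deauth\":true,\"ban_time\":0}"
    done
}
```

To use it, append a call to `kick_weak_stations` at the end of the file and schedule it, for example with a crontab line such as `* * * * * /root/kick-stale.sh` (the path is hypothetical). A fixed threshold can also kick a real client sitting at the edge of coverage; with `ban_time` set to 0 such a client can simply reconnect.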
|
Maybe adding a patch similar to https://github.com/freifunk-gluon/gluon/blob/main/patches/openwrt/0009-mt76-include-fixes-for-MT7603-MT7612.patch would help? |
This patch definitely does some good. Before, I saw intermittent packet loss indications every minute or less in games; now that is completely fixed with this patch. |
I tried this patch, and the speed dropped by half with WED inactive. |
I don't notice a speed difference with WED enabled. |
The client below has left the house, but the MT6000 still sees/tracks it with a -92/-92 RSSI, ugh. Using a pretty recent OpenWrt SNAPSHOT, r28242, with:
Stressing roaming with DAWN and/or causing disconnects by walking out of bounds seems to trigger that odd condition. I might try the cron job workaround, since this is affecting my mesh network: batctl ends up with nodes at crawling link speeds of 0.3. |
Observed similar AP-STA-POLL-OK logs with my Flint 2 on 2.4G WiFi. |
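To check whether a device shows the same pattern, the AP-STA-POLL-OK events can be counted per station from the system log. This is a rough sketch, assuming the hostapd log format quoted earlier in this thread (station MAC as the last field); `poll_ok_counts` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: poll_ok_counts is a hypothetical helper.
# Reads syslog text (e.g. from `logread`) on stdin and prints "<count> <mac>"
# for every station with at least $1 AP-STA-POLL-OK events (default 1),
# highest count first.
poll_ok_counts() {
    awk -v min="${1:-1}" '
        /hostapd/ && /AP-STA-POLL-OK/ { n[$NF]++ }
        END { for (m in n) if (n[m] >= min + 0) print n[m], m }
    ' | sort -rn
}
```

On the router: `logread | poll_ok_counts 5`. A high count alone does not prove that a station has left, because hostapd also polls idle clients that are still present, so the result is best cross-checked against the signal reported by `iw dev <iface> station dump`.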
I have adapted your solution and started using it to work around this in my case too. gist:openwrt-mt76-disconnect-workaround This version can be added to init / rc scripts, since it spawns a subshell on boot that keeps checking for the condition every N seconds. Another slight change is that there is no need to set a threshold; instead, it checks whether the signal is lower than the noise floor. We understand this is just a temporary workaround while we wait for the real solution, and we also wonder whether the MTK reference about losing the ACK on AX chips is related. |
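The noise-floor comparison described above could be sketched roughly as follows. This is not the code from the gist: the function names are hypothetical, and it assumes that `iw dev <iface> survey dump` marks the active channel with `[in use]` and reports a `noise:` line for it.

```shell
#!/bin/sh
# Sketch only: noise_floor and below_noise are hypothetical helpers.

# Read `iw dev <iface> survey dump` text on stdin and print the noise level
# (dBm) of the channel marked "[in use]".
noise_floor() {
    awk '
        $1 == "frequency:"       { in_use = ($0 ~ /\[in use\]/) }
        $1 == "noise:" && in_use { print $2; exit }
    '
}

# Read `iw dev <iface> station dump` text on stdin and print the MAC of every
# station whose signal is below the noise floor passed as $1.
below_noise() {
    awk -v nf="$1" '
        $1 == "Station" { mac = $2 }
        $1 == "signal:" && ($2 + 0) < (nf + 0) { print mac }
    '
}
```

A periodic loop would first obtain the value with `nf=$(iw dev phy1-ap0 survey dump | noise_floor)` and then pipe the station dump through `below_noise "$nf"`. Compared with a fixed threshold, this adapts to the channel, but it depends on the driver reporting a sensible noise value.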
I'm testing an AX3000T in a restaurant for a future network upgrade, but I've noticed poor speeds on 5GHz at random. A radio restart solves it, but when it occurs, the network goes down due to low speed/high latency.
The AP is running in 80MHz/AX mode.
OpenWrt 23.05.5.