Home Assistant loses internet connection every day #2720

Closed
Nezz opened this issue Aug 31, 2023 · 25 comments
Labels
board/generic-x86-64 Generic x86-64 Boards (like Intel NUC) bug stale

Comments

@Nezz

Nezz commented Aug 31, 2023

Describe the issue you are experiencing

Every day at some point during the day my home assistant loses internet connection. It cannot be reached from the network (including http://homeassistant.local:4357), nor can it reach any devices on the network (wifi devices become unavailable). It has happened every day since I updated to 2023.8.4, but it seems unlikely that this would be caused by updating core.

I run Home Assistant OS on VMware Workstation (bridged networking mode). The host machine is connected to the internet. Reconnecting the network adapter to the VM does not resolve the issue, and neither does network reload. Restarting the VM resolves the issue.

What operating system image do you use?

generic-x86-64 (Generic UEFI capable x86-64 systems)

What version of Home Assistant Operating System is installed?

10.5

Did you upgrade the Operating System.

No

Steps to reproduce the issue

Unclear what causes this.

Anything in the Supervisor logs that might be useful for us?

Nothing relevant in supervisor logs

Anything in the Host logs that might be useful for us?

Nothing relevant in host logs

System information

No response

Additional information

[Screenshots attached]

@Nezz
Author

Nezz commented Aug 31, 2023

Let me know what kind of information I could grab to troubleshoot this. At a high level, it seems to me that the IPv4 network interface does not recover in HAOS.

Note that when the issue happens I can only access Home Assistant via the CLI.

@agners
Member

agners commented Aug 31, 2023

It seems that you are running without IPv4. Is that always the case?

@agners agners added the board/generic-x86-64 Generic x86-64 Boards (like Intel NUC) label Aug 31, 2023
@Nezz
Author

Nezz commented Aug 31, 2023

No, the VM normally gets 192.168.0.43 (that's when things work).

@Nezz
Author

Nezz commented Sep 1, 2023

Let me know if I should grab any further info. I'm about to restart the VM, so the issue will occur again in 12-24 hours.

@Nezz
Author

Nezz commented Sep 1, 2023

Running network update enp2s1 --ipv4-method auto brought my HA instance back online without restarting the VM.

@Nezz
Author

Nezz commented Sep 2, 2023

It disconnected again and running the command from the previous comment helped again.

@uphillbattle

My instance (OS 10.5, core 2023.8.4) has dropped off the network two days in a row. I'm running the OS directly (not in a VM) on an old laptop (i7 processor from 2016 or so), connected over wifi, with IPv4 and IPv6 enabled. It had been rock solid until two days ago. I will see if I can get something sensible from the logs when I get home (I obviously cannot reach it remotely).

@agners
Member

agners commented Sep 5, 2023

> Running network update enp2s1 --ipv4-method auto brought my HA instance back online without restarting the VM.

Hm, that is weird. Sounds like NetworkManager has issues acquiring an IPv4 address then.

Can you share the output of ha host logs --identifier NetworkManager -n 10000? You can redirect it to /config e.g. using:

ha host logs --identifier NetworkManager -n 10000 > /config/networkmanager.log

@Nezz
Author

Nezz commented Sep 6, 2023

Here is the log:
networkmanager.log
A disconnect happened on Sep 2 22:56-23:29 (UTC+3). Another one happened on Sep 4 19:18-23:55.
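
For cross-referencing these local times with the log, a small sketch may help. NetworkManager log lines carry a Unix epoch in square brackets (for example `[1693684585.1553]`), and the journal prefix in the excerpts posted in this thread appears to be in UTC. The helper name `epoch_to_local` and the fixed UTC+3 offset are assumptions made for illustration only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the reporter's local zone is a fixed UTC+3 offset.
LOCAL_TZ = timezone(timedelta(hours=3))


def epoch_to_local(epoch, tz=LOCAL_TZ):
    """Format a NetworkManager epoch value as wall-clock time in the given zone."""
    return datetime.fromtimestamp(epoch, tz=tz).strftime("%Y-%m-%d %H:%M:%S")


# Epoch values copied from the Sep 2 log excerpt in this thread.
print(epoch_to_local(1693684585.1553))                   # "no lease" entry
print(epoch_to_local(1693686555.5143))                   # manual network update
print(epoch_to_local(1693684585.1553, tz=timezone.utc))  # same instant in UTC
```

Passing `tz=timezone.utc` gives the time as it appears in the journal prefix, which makes it easier to line up a reported outage window with the log.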

@Nezz
Author

Nezz commented Sep 6, 2023

Here is what my logs look like when things work as expected:

Sep 05 18:30:29 homeassistant NetworkManager[304]: <info>  [1693938629.9525] dhcp4 (enp2s1): activation: beginning transaction (timeout in 45 seconds)
Sep 05 18:30:29 homeassistant NetworkManager[304]: <info>  [1693938629.9565] dhcp4 (enp2s1): state changed no lease
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.0124] manager: NetworkManager state is now DISCONNECTED
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.0134] device (enp2s1): Activation: starting connection 'Supervisor enp2s1' (fd28114f-6784-4a46-913f-2277b1bbbf74)
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.0425] device (enp2s1): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.0436] manager: NetworkManager state is now CONNECTING
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.0444] device (enp2s1): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.0653] device (enp2s1): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.1086] dhcp4 (enp2s1): activation: beginning transaction (timeout in 45 seconds)
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.3323] dhcp4 (enp2s1): state changed new lease, address=192.168.0.58
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.3392] policy: set 'Supervisor enp2s1' (enp2s1) as default for IPv4 routing and DNS
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.4572] device (enp2s1): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.4857] device (enp2s1): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.4994] device (enp2s1): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.5221] manager: NetworkManager state is now CONNECTED_SITE
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.5276] device (enp2s1): Activation: successful, device activated.
Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.7554] manager: NetworkManager state is now CONNECTED_GLOBAL

state changed no lease -> NetworkManager state is now DISCONNECTED -> recovery begins

Here is the first disconnect.

Sep 02 19:56:25 homeassistant NetworkManager[313]: <info>  [1693684585.1553] dhcp4 (enp2s1): activation: beginning transaction (timeout in 45 seconds)
Sep 02 19:56:25 homeassistant NetworkManager[313]: <info>  [1693684585.1555] dhcp4 (enp2s1): state changed no lease
Sep 02 20:06:25 homeassistant NetworkManager[313]: <info>  [1693685185.4444] manager: NetworkManager state is now CONNECTED_SITE
[ network update enp2s1 --ipv4-method auto is called ]
Sep 02 20:29:15 homeassistant NetworkManager[313]: <info>  [1693686555.5143] audit: op="connection-update" uuid="60e10c4d-470a-3fae-8f6f-e4ccd041ae60" name="Supervisor enp2s1" args="connection.timestamp" pid=969 uid=0 result="success"
Sep 02 20:29:15 homeassistant NetworkManager[313]: <info>  [1693686555.5202] device (enp2s1): state change: activated -> deactivating (reason 'new-activation', sys-iface-state: 'managed')
...
Sep 02 20:29:16 homeassistant NetworkManager[313]: <info>  [1693686556.3933] policy: set 'Supervisor enp2s1' (enp2s1) as default for IPv4 routing and DNS
Sep 02 20:29:16 homeassistant NetworkManager[313]: <info>  [1693686556.4126] device (enp2s1): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
Sep 02 20:29:16 homeassistant NetworkManager[313]: <info>  [1693686556.4157] device (enp2s1): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
Sep 02 20:29:16 homeassistant NetworkManager[313]: <info>  [1693686556.4163] device (enp2s1): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Sep 02 20:29:16 homeassistant NetworkManager[313]: <info>  [1693686556.4173] manager: NetworkManager state is now CONNECTED_SITE
Sep 02 20:29:16 homeassistant NetworkManager[313]: <info>  [1693686556.4195] device (enp2s1): Activation: successful, device activated.
Sep 02 20:29:16 homeassistant NetworkManager[313]: <info>  [1693686556.4657] manager: NetworkManager state is now CONNECTED_GLOBAL

It seems that sometimes NetworkManager does not pick up that the lease has expired. Instead of changing the state to DISCONNECTED, it gets stuck in CONNECTED_SITE.

The disconnect I had on the 4th of September is another example of that:

Sep 03 16:17:57 homeassistant NetworkManager[313]: <info>  [1693757877.2504] device (enp2s1): Activation: successful, device activated.
Sep 03 16:17:57 homeassistant NetworkManager[313]: <info>  [1693757877.3082] manager: NetworkManager state is now CONNECTED_GLOBAL
Sep 04 16:17:56 homeassistant NetworkManager[313]: <info>  [1693844276.9655] dhcp4 (enp2s1): activation: beginning transaction (timeout in 45 seconds)
Sep 04 16:17:56 homeassistant NetworkManager[313]: <info>  [1693844276.9657] dhcp4 (enp2s1): state changed no lease
Sep 04 16:17:57 homeassistant NetworkManager[313]: <info>  [1693844277.3220] manager: NetworkManager state is now CONNECTED_SITE

state changed no lease -> NetworkManager state is now CONNECTED_SITE -> no recovery happens
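
The two patterns can be told apart mechanically. The following is a minimal sketch, not part of NetworkManager or the `ha` CLI: the function name `find_stuck_leases` and the regular expression are illustrative assumptions, and it only understands the log format shown in the excerpts above. It reports every `no lease` event whose next manager state change is something other than DISCONNECTED.

```python
import re

# Matches the bracketed epoch and the message that follows it.
# "NetworkManager[304]" is skipped because it contains no decimal point.
LINE_RE = re.compile(r"\[(?P<epoch>\d+\.\d+)\]\s+(?P<msg>.*)$")


def find_stuck_leases(lines):
    """Return (epoch, state) for each 'no lease' event not followed by DISCONNECTED."""
    stuck = []
    pending = None  # epoch of the most recent unresolved 'no lease' event
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue
        msg = match.group("msg").strip()
        if "state changed no lease" in msg:
            pending = float(match.group("epoch"))
        elif pending is not None and "NetworkManager state is now" in msg:
            state = msg.rsplit(" ", 1)[-1]
            if state != "DISCONNECTED":
                stuck.append((pending, state))
            pending = None
    return stuck


# Lines copied from the excerpts in this comment.
RECOVERED = [
    "Sep 05 18:30:29 homeassistant NetworkManager[304]: <info>  [1693938629.9565] dhcp4 (enp2s1): state changed no lease",
    "Sep 05 18:30:30 homeassistant NetworkManager[304]: <info>  [1693938630.0124] manager: NetworkManager state is now DISCONNECTED",
]
STUCK = [
    "Sep 02 19:56:25 homeassistant NetworkManager[313]: <info>  [1693684585.1555] dhcp4 (enp2s1): state changed no lease",
    "Sep 02 20:06:25 homeassistant NetworkManager[313]: <info>  [1693685185.4444] manager: NetworkManager state is now CONNECTED_SITE",
]

print(find_stuck_leases(RECOVERED))
print(find_stuck_leases(STUCK))
```

To scan a whole log file, the same function could be fed the file object directly, for example `find_stuck_leases(open("networkmanager.log"))`.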

@Nezz
Author

Nezz commented Sep 6, 2023

The same network drop happened again:

Sep 06 18:30:30 homeassistant NetworkManager[304]: <info>  [1694025030.1153] dhcp4 (enp2s1): activation: beginning transaction (timeout in 45 seconds)
Sep 06 18:30:30 homeassistant NetworkManager[304]: <info>  [1694025030.1157] dhcp4 (enp2s1): state changed no lease
Sep 06 18:40:30 homeassistant NetworkManager[304]: <info>  [1694025630.5754] manager: NetworkManager state is now CONNECTED_SITE

@Nezz
Author

Nezz commented Sep 8, 2023

Any tips on how to fix this? I need to run network update enp2s1 every day to bring HA back online. Can I schedule that somehow to work around this?
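
One possible shape for such a scheduled workaround is a small watchdog loop. This is only a sketch under several assumptions: that the command from the earlier comment can be invoked from a shell as `ha network update enp2s1 --ipv4-method auto`, that the router answers TCP connections at the hypothetical address `192.168.0.1:80`, and that there is somewhere on the system where a Python script like this can run with access to the `ha` CLI. The names `can_reach`, `check_once`, and `watch` are made up for illustration.

```python
import socket
import subprocess
import time

# Assumption: the recovery command from this thread, prefixed with "ha" for shell use.
RECOVERY_CMD = ["ha", "network", "update", "enp2s1", "--ipv4-method", "auto"]


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_once(is_online, recover):
    """Call recover() only when is_online() reports False; return True if it ran."""
    if is_online():
        return False
    recover()
    return True


def watch(host="192.168.0.1", port=80, interval=60.0):
    """Poll connectivity forever and trigger the recovery command on failure."""
    while True:
        check_once(
            lambda: can_reach(host, port),
            lambda: subprocess.run(RECOVERY_CMD, check=False),
        )
        time.sleep(interval)
```

Keeping the decision in `check_once` separate from the probe and the command makes it possible to try the logic with stand-in functions before pointing it at a real system.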

@uphillbattle

Not sure if we have the exact same problem, but it seems we have at least had the same symptoms. I tried downgrading to 10.4 but that didn’t help. I then downgraded to 10.3, but at the same time made the following changes:

  • I disabled ipv6 in HA
  • Instead of setting up HA with a static IP, I changed the setting to Auto and instead reserved the IP address at the router

The network connection has been stable since (about 3 days and counting). I’ll give it another day or two before upgrading again to see if it was the network settings or the downgrading that did the trick.

Below is a picture of the last lines of the NetworkManager logs from when the network connection had dropped with OS v10.4 (how can I get the logs out in file format when the network is down?).
[Image: IMG_2408]

@Nezz
Author

Nezz commented Sep 13, 2023

What did not work:

  • Having a reserved IP for Home Assistant in my router and using automatic IPv4 in HA
  • Not having a reserved IP and using automatic IPv4 in HA

What worked:

  • Having a reserved IP configured in the router and setting the static IPv4 in HA

@uphillbattle

Having a reserved IP for Home Assistant in my router and using automatic IPv4 in HA has been stable with OS 10.3 for several days. Yesterday afternoon, I upgraded the OS to 10.5 (but did not change the IP settings). It has not yet fallen off the network (after 17 hours) but it's too soon to draw any conclusion.

@Nezz
Author

Nezz commented Sep 13, 2023

According to the DHCP documentation, the client should try to renew the lease at its halfway point (after 12 hours, assuming the standard 24-hour lease). However, this does not seem to happen, or at least NetworkManager does not log it.
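
As a worked check of this reasoning: RFC 2131 sets the default renewal time T1 to half the lease and the rebinding time T2 to 0.875 of the lease. The sketch below assumes the 24-hour lease mentioned above (the actual lease length is not shown in the logs) and uses two epoch values from the Sep 3-4 log excerpt earlier in this thread. The function name `dhcp_timers` is illustrative.

```python
# Assumption: a 24-hour lease, as guessed in the comment above.
LEASE_SECONDS = 24 * 60 * 60


def dhcp_timers(lease_seconds):
    """Default T1 (renew) and T2 (rebind) offsets in seconds, per RFC 2131."""
    return 0.5 * lease_seconds, 0.875 * lease_seconds


# Epoch values copied from the log excerpt in this thread.
ACTIVATED = 1693757877.2504  # "Activation: successful" on Sep 03
NO_LEASE = 1693844276.9655   # "state changed no lease" on Sep 04

t1, t2 = dhcp_timers(LEASE_SECONDS)
elapsed = NO_LEASE - ACTIVATED

print(f"T1 after {t1 / 3600:.1f} h, T2 after {t2 / 3600:.1f} h")
print(f"'no lease' logged {elapsed / 3600:.2f} h after activation")
```

With these inputs the `no lease` entry lands roughly 24 hours after activation, which would be consistent with the lease simply running out, with no renewal logged at T1 or T2.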

@uphillbattle

For information: I have had no more problems since upgrading the OS to 10.5, so in my case it seems the IP-settings did the trick. Can’t explain why that should matter, so it’s just an observation.

@dingausmwald

dingausmwald commented Oct 4, 2023

Same here. DHCP drops the connection every so often. Thanks for the workaround: a static IP in HA (and in the router, which was set anyway) did the trick.
Another thing: changing the wifi drops the connection until DuckDNS renews the IP (5 min, maybe 10 min interval?). I'm talking about the onboard wifi of a Raspberry Pi 4B running HA OS. In addition, I have another wifi adapter plugged in via USB. I banged my head against this for 3 days. Crazy behaviour where I couldn't connect to the 2.4 GHz wifi in any way. I used nmcli about 100 times. The wifi couldn't be found, the password wasn't delivered, and so on. The onboard wifi (wlan0) or the SSID got blocked somehow.

Maybe this helps too: I had a bunch of lease files in var/lib/NetworkManager. Deleting them brought my wifi up on boot in no time. Maybe this isn't related. It's late...

@uphillbattle

FWIW, my instance started dropping the network connection again after a couple of weeks. The instance was on WiFi. I gave up and got a USB Ethernet adapter. The instance has been running without problems ever since (more than a month now). So it seems a wired network connection did the trick in my case.

@Nezz
Author

Nezz commented Nov 24, 2023

My reserved IP solution over wifi is still working without problems. However, it'd be nice to fix this. The DHCP protocol turned 30 years old last month and it'd be great if it worked reliably in HA.

@dprslt

dprslt commented Jan 30, 2024

I have the same problem running HAOS on an x86 Gigabyte NUC GB-BXBT-207.
Setting a static address plus a reserved IP in the DHCP server does not solve the problem, and the only way to get it online again is a reboot...

I am struggling to get more info to identify the problem...

According to nmcli it is still connected to the wifi network, and ip still shows an IP address, but it cannot ping the router or 8.8.8.8 and it is not detected by the router. In the router's admin panel I can see the MAC address of the device without an IP (it has one when freshly booted).

The problem occurs every day. What can I provide to help identify the cause of the disconnection?

@uphillbattle

As mentioned above, my instance has not dropped out of the network since I ditched wifi and got a usb-ethernet adapter for wired network connection. More than 3 months now, without a network glitch.

Wifi is discouraged for stability reasons - the network drops may be the symptom that backs up the claim that HA OS on wifi is not sufficiently reliable.

I have no idea why the machine drops off the network when on wifi, so I can only contribute with the observation that in my case, going to wired network has solved the issue.

@Datel01

Datel01 commented Mar 24, 2024

Hello, I'm out of ideas. I'm having a similar issue, losing the HA network connection occasionally. The only thing that helps is to switch the RPi4 off and leave it for about 10 min; when started again, HA comes up. Sometimes it stays up for 1 h, sometimes 5 min and sometimes 1 day. It is connected to the router over Ethernet. I changed from DHCP (auto) to a static IP, disabled IPv6, and changed the SD card. I'm running the latest version of HA. I also tried another power supply.

I ran this box for almost a year without these issues. I had this a few times a month ago, then it stopped without any changes, and now it has been back for about 2-3 weeks.

Here is the NetworkManager log (strange thing: why does the log on 24th March also show 4th April?):
[Image: WhatsApp image, 2024-03-24 at 19 35 38_64fd9a58]

@l-marchesi

> Hello, I'm out of ideas. I'm having a similar issue, losing the HA network connection occasionally. The only thing that helps is to switch the RPi4 off and leave it for about 10 min; when started again, HA comes up. Sometimes it stays up for 1 h, sometimes 5 min and sometimes 1 day. It is connected to the router over Ethernet. I changed from DHCP (auto) to a static IP, disabled IPv6, and changed the SD card. I'm running the latest version of HA. I also tried another power supply.
>
> I ran this box for almost a year without these issues. I had this a few times a month ago, then it stopped without any changes, and now it has been back for about 2-3 weeks.

Hi @Datel01
I've been having the same issue for 2-3 weeks on a Raspberry Pi 3B. HAOS seems to forget the WiFi network; however, when logging in over the LAN IP, it still knows the connection.
On my phone I get an NSURL error (obviously, it can't find it); it also often says "wrong credentials".
If you've found something, I'd appreciate a response. I will do so as well.


github-actions bot commented Jul 5, 2024

There hasn't been any activity on this issue recently. To keep our backlog manageable we have to clean old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant OS version and check if that solves the issue. Let us know if that works for you by adding a comment 👍
This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Jul 5, 2024
@github-actions github-actions bot closed this as not planned Jul 12, 2024

7 participants