Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

699 established TCP connections from HA to Denon AVR telnet port #119133

Closed
dannutu opened this issue Jun 8, 2024 · 11 comments · Fixed by #126703
Closed

699 established TCP connections from HA to Denon AVR telnet port #119133

dannutu opened this issue Jun 8, 2024 · 11 comments · Fixed by #126703
Assignees

Comments

@dannutu
Copy link

dannutu commented Jun 8, 2024

The problem

Hi there

I recently noticed my HA log file getting pretty large (in the GB range) pretty quickly (in just a couple of days). I'm running HA on a RPi4B so almost a GB of logs written each day is not insignificant for my setup (and this gets amplified because I also run full system backups to a separate btrfs partition every morning at 5am). Right now, out of the 9,244,339 lines the log file accumulated in the last 3.5 days, exactly 9,231,778 of them are from "[denonavr.api]".

The 2 messages from "[denonavr.api]" that I see in the HA log file are:
"Keep alive failed, disconnecting and reconnecting" (130,508 lines)
"Telnet reconnected" (9,101,270 lines)
Each of these lines is repeated many times each second (yes, second).

I happily used the Denon integration for a long time (~4-5 years) with next to no issues however recently my Ethernet switch started glitching (I have an open ticket with the manufacturer for this - it works fine for multiple hours, up to a few days, then, for no clear reason it suddenly drops the Link Aggregation groups and then around 10 minutes later as clockwork everything comes back online without any intervention).

I noticed that the time of the "Keep alive failed, disconnecting and reconnecting" HA log entries align with the time when the Ethernet switch decides to take a 10 minute break, with the "Telnet reconnected" log entries going on for quite a while after the Ethernet switch decides to start working again.

With the switch currently working netstat -napt tells me that one python process has exactly 699 established TCP connections to port 23 of the Denon AVR. I'd say that's a few too many! :)

I appreciate that the initial cause are the Ethernet switch flips, but I wonder if there is anything I could try to reduce the frequency and number of retries and log entries. For my purposes trying to reestablish the TCP connection to the Denon AVR every 5-10 seconds would be perfectly acceptable, I don't need it to retry every few milliseconds. Where can I go in an change the retry interval?

Many thanks!

What version of Home Assistant Core has the issue?

core-2024.1.6

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant Supervised

Integration causing the issue

denonavr

Link to integration documentation on our website

https://www.home-assistant.io/integrations/denonavr

Diagnostics information

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

The 2 messages from "[denonavr.api]" that I see in the HA log file are:
"Keep alive failed, disconnecting and reconnecting" (130,508 lines)
"Telnet reconnected" (9,101,270 lines)
Each of these lines is repeated many times each second (yes, second).

Additional information

No response

@home-assistant
Copy link

home-assistant bot commented Jun 8, 2024

Hey there @ol-iver, @starkillerOG, mind taking a look at this issue as it has been labeled with an integration (denonavr) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of denonavr can trigger bot actions by commenting:

  • @home-assistant close Closes the issue.
  • @home-assistant rename Awesome new title Renames the issue.
  • @home-assistant reopen Reopen the issue.
  • @home-assistant unassign denonavr Removes the current integration label and assignees on the issue, add the integration domain after the command.
  • @home-assistant add-label needs-more-information Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue.
  • @home-assistant remove-label needs-more-information Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.

(message by CodeOwnersMention)


denonavr documentation
denonavr source
(message by IssueLinks)

@dannutu
Copy link
Author

dannutu commented Jun 8, 2024

Just to add a little bit of info:

  • reloading just the denonavr integration persists those 699 TCP connections
  • restarting HA gets rid of all but one (as expected)

@ADHDSquir
Copy link

I am experiencing this as well, intermittently. Strangely, even when I disable the denonavr integration, the log flood continues until I restart Home Assistant.

@dannutu
Copy link
Author

dannutu commented Jun 13, 2024

So I'm not alone! :)
I've restarted the whole home assistant box 5 days ago, with just a handful of switch glitches since then I now have 12 concurrent connections from HA to the telnet port on the receiver.
I'd be very willing to help troubleshoot this, I feel it should be an easy thing to address for the code owners

@dannutu
Copy link
Author

dannutu commented Jun 21, 2024

Hi there, I'm now up to 31 concurrent telnet connections to the AVR.
Is there anything I can do to help troubleshoot this?

@ADHDSquir
Copy link

The upstream library denonavr specifies that "Receivers support only one active telnet connection." So I would expect either that library or this integration to ensure that all connections are closed before attempting to establish a new one.

Are you able to manually create the issue by disconnecting the ethernet cable for a minute and then reconnecting it? That would be helpful for debugging the issue.

@dannutu
Copy link
Author

dannutu commented Jun 27, 2024

I just checked and I had 40 established connections.

I tried to manually connect with telnet just to see if the connection is accepted and while I did not get a prompt from the AVR, netstat showed 41 established connections.
I disconnected and this resulted in the above connection going to TIME_WAIT (the other 40 from python were still showing ESTABLISHED).
I reconnected --> 41 ESTABLISHED + 1 TIME_WAIT
I disconnected & reconnected --> 41 ESTABLISHED + 2 TIME_WAIT
I disconnected --> 40 ESTABLISHED + 3 TIME_WAIT

I stopped HA --> 40 in FIN_WAIT + 3 TIME_WAIT, then all gone after a short while
I started HA --> only one established connection to the AVR - as it should be

Then, as I was monitoring every second all connections from HA to the AVR and I was preparing to disable the switch port, guess what! A 2nd ESTABLISHED connection appeared from HA to the AVR telnet port! So there were 2, both ESTABLISHED. I waited for about 10-15 minutes to see what happens and they both stayed ESTABLISHED and no new connections appeared.

I then disabled the port on the switch and another surprise followed. I expected both established connections to drop but that is not what happened. One of them went to FIN_WAIT1 and then disappeared, while the 2nd one stayed ESTABLISHED!

I left the port disabled for a couple of minutes, during which time I saw HA trying to initiate new connections (SYN_SENT) but this one connection remained ESTABLISHED throughout. I just cannot figure out HOW.

Then, once I reenabled the switch port HA established a new connection so I was left with 2, both ESTABLISHED.

@dannutu
Copy link
Author

dannutu commented Jun 27, 2024

Here's the time correlation:

Switch port:

2024-06-27 21:38:52 port 2 link down
2024-06-27 21:42:42 Port 2 link up, 1Gbps FULL duplex 

HA log:

2024-06-27 21:23:27.534 INFO (MainThread) [denonavr.api] 192.168.40.4: Keep alive failed, disconnecting and reconnecting
2024-06-27 21:23:27.544 INFO (MainThread) [denonavr.api] 192.168.40.4: Telnet reconnected
2024-06-27 21:23:27.548 INFO (MainThread) [denonavr.api] 192.168.40.4: Telnet reconnected
2024-06-27 21:39:57.590 INFO (MainThread) [denonavr.api] 192.168.40.4: Keep alive failed, disconnecting and reconnecting
2024-06-27 21:42:58.961 INFO (MainThread) [denonavr.api] 192.168.40.4: Telnet reconnected

@dannutu
Copy link
Author

dannutu commented Jul 13, 2024

Hmmm, I reckon nobody's interested in looking into this

@vincegre
Copy link

vincegre commented Sep 3, 2024

Hey

Did you try to deactivate the telnet option in the integration settings ? i'm investigating an other problem with Denon AVR and integration related to mute and in looking to find the issue I noticed on 3 AVR I have on one system 2 of them don't have the telnet option activated in the integration and the control works as well as the one that has the telnet option ;) In case it might help to avoid the log issue ;)

Vincèn

@ol-iver
Copy link
Contributor

ol-iver commented Sep 21, 2024

There have been leaked TCP transports when telnet connections have been disconnected. I might have fixed it with ol-iver/denonavr#302.
However, the fixed version is not added to HA yet.

@github-actions github-actions bot locked and limited conversation to collaborators Oct 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants