Don't close/reopen tcp connection on single modbus message timeout #2346

ahcm-dev · 2024-09-30T08:43:04Z

I am connecting to a TCP-to-RTU Modbus gateway which transparently passes Modbus requests on to serial slaves on the downstream side. This works fine most of the time.

The application (Home Assistant) uses AsyncModbusTcpClient to send periodic requests from a number of different threads, fetching data from different sensors on the different configured slaves.

If one slave malfunctions or is powered off, a Modbus request to it times out (though the gateway acks the request at the TCP layer). This causes the connection to be closed, and all the queued pending requests to the other slaves fail, even though they would have succeeded if allowed to proceed.

The connection is quickly re-established, but the same thing happens again when the polling function repolls the same unresponsive slave, so the one dead slave keeps blocking successful traffic to the healthy slaves.
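The failure mode described above can be sketched as a small state machine. This is a toy model, not pymodbus code: `ToyConnection` and its `send`, `reply`, and `timeout` methods are hypothetical names made up for illustration, and the scenario assumes three slaves with slave 2 powered off.

```python
from dataclasses import dataclass, field


@dataclass
class ToyConnection:
    """Toy model of one TCP connection shared by several slave IDs.

    Hypothetical illustration only; this is not pymodbus code.
    """

    is_open: bool = True
    pending: set = field(default_factory=set)    # slave IDs awaiting a reply
    results: dict = field(default_factory=dict)  # slave ID -> outcome string

    def send(self, slave: int) -> None:
        """Queue a request for `slave` on the shared connection."""
        if not self.is_open:
            self.results[slave] = "failed: connection closed"
            return
        self.pending.add(slave)

    def reply(self, slave: int) -> None:
        """Deliver a reply from `slave`, if it is still being awaited."""
        if self.is_open and slave in self.pending:
            self.pending.discard(slave)
            self.results[slave] = "ok"

    def timeout(self, slave: int) -> None:
        """Model the reported behaviour: one timeout closes the connection."""
        self.pending.discard(slave)
        self.results[slave] = "failed: timeout"
        # Closing fails every other request still waiting on this connection.
        for other in sorted(self.pending):
            self.results[other] = "failed: connection closed"
        self.pending.clear()
        self.is_open = False


conn = ToyConnection()
for slave in (1, 2, 3):  # assumption: slave 2 is powered off
    conn.send(slave)
conn.timeout(2)          # slave 2 never answers
conn.reply(1)            # replies from healthy slaves arrive too late
conn.reply(3)
print(conn.results)
```

In this model the requests to slaves 1 and 3 end up failed even though both slaves were able to answer, which is the collateral damage the report is about.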

I get good behaviour if the TCP client does not actually close the connection when close() is called with reconnect==True. This relies on the TCP socket itself to close unexpectedly if the TCP timers expire.

janiversen · 2024-09-30T08:56:09Z

pymodbus/client/tcp.py

@@ -86,8 +86,16 @@ def __init__(  # pylint: disable=too-many-arguments
        )

    def close(self, reconnect: bool = False) -> None:
-        """Close connection."""


This is wrong; this method is called by the app when it wants to close the connection.

Why would you prevent the app from closing the connection? Your problem is down in the transport layer.

ahcm-dev · 2024-09-30

Not sure how this can be resolved in the transport layer. The issue is the call to close() in async_execute() in client/base.py after too many retries of a particular request. The remote end will never respond to this request because it addresses an inactive slave. It will, however, respond to other requests, and closing the connection causes these to fail too.

Maybe a better fix is in async_execute(), but it feels wrong to have TCP-specific code in there. Any suggestions welcome!

janiversen · 2024-09-30

If the issue is the call to close() in async_execute(), then that would be a better place to fix the problem than prohibiting the app from closing a connection.

Be aware that your PR highlighted some old code that should have been updated a long time ago; it will be updated very shortly.
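As a rough illustration of fixing this at the retry level rather than in close(), the sketch below fails only the request that timed out and leaves the shared connection alone. It is a toy under stated assumptions, not the actual async_execute() from client/base.py: `ToyGateway`, `execute`, `poll`, and `RequestTimeout` are hypothetical names, and the retry count and timeout values are arbitrary.

```python
import asyncio


class RequestTimeout(Exception):
    """Raised for one request that got no reply; the connection stays open."""


class ToyGateway:
    """Hypothetical stand-in for a TCP-to-RTU gateway (not pymodbus code)."""

    def __init__(self, dead_slaves):
        self.dead_slaves = set(dead_slaves)
        self.is_open = True

    async def transact(self, slave: int) -> str:
        if slave in self.dead_slaves:
            await asyncio.sleep(3600)  # a dead slave never answers
        return f"reply from slave {slave}"


async def execute(gateway, slave, retries=2, timeout=0.01):
    """Retry one request, then fail only that request."""
    for _ in range(retries + 1):
        try:
            return await asyncio.wait_for(gateway.transact(slave), timeout)
        except asyncio.TimeoutError:
            continue  # try again until the retries are used up
    # Deliberately no close() here: other slaves may still answer.
    raise RequestTimeout(f"no reply from slave {slave}")


async def poll(gateway, slaves):
    """Poll each slave in turn over the same (toy) connection."""
    results = {}
    for slave in slaves:
        try:
            results[slave] = await execute(gateway, slave)
        except RequestTimeout as exc:
            results[slave] = f"error: {exc}"
    return results


gateway = ToyGateway(dead_slaves={2})  # assumption: slave 2 is powered off
results = asyncio.run(poll(gateway, [1, 2, 3]))
print(results)
```

With this policy the dead slave costs only its own retries: slaves 1 and 3 are still served on the same connection. A real fix would still need some way to detect a genuinely dead TCP link, which this sketch does not attempt.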

janiversen · 2024-09-30T11:47:25Z

dev is updated.

janiversen · 2024-09-30T20:59:33Z

??? I thought you had a problem that you were trying to solve?

Your solution, as I pointed out, had serious side effects and thus was not viable, but a working solution would be reviewed positively.

Please be aware that the changes I made to the reconnect= parameter do NOT affect the close in case of timeout.

ahcm-dev · 2024-09-30T21:38:07Z

Apologies, I was trying to redo some changes.

ahcm-dev · 2024-09-30T21:55:19Z

Closed in error - trying to reopen

ahcm-dev · 2024-09-30T21:55:35Z

Pull request now updated

ahcm-dev added 2 commits September 29, 2024 22:07

66c307e  new branch to allow timeout to slaves on tcp clients
f1d0e7f  allow timeout to slaves on tcp clients

janiversen requested changes Sep 30, 2024

ahcm-dev closed this Sep 30, 2024

ahcm-dev deleted the tcprtu-multi branch September 30, 2024 18:45

ahcm-dev restored the tcprtu-multi branch September 30, 2024 18:45

ahcm-dev mentioned this pull request Sep 30, 2024

Resubmit: Don't close/reopen tcp connection on single modbus message timeout #2350

Merged
