Customizable receive timeout for SMB Connection #285

leonnis12 · 2024-08-14T12:34:54Z

This PR makes the transport.recv timeout configurable through the SMB Connection class and the register_session methods. That timeout is used both for the keep-alive checks that keep Windows from closing a connection at ~16 minutes and for checking the health of the connection.

In my application's use case, the SMB connection runs over an unstable network that sometimes loses the connection between the machines. This triggers Connection timed out errors from the health check implemented in _process_message_thread, but only after 10+ minutes (the hardcoded timeout), during which the application hangs. I would like to be able to set this timeout to a smaller value for use cases like this, so that the health check runs more often and the timeout error can be handled in a reasonable amount of time.

Current behaviour:

The connection is checked every 10 minutes using a SMB2_ECHO command.
If the connection is broken due to a network problem, the connection will be stuck until the timeout passes, the ECHO command is sent and the server does not respond to it.
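As a rough illustration of the behaviour described above (a sketch only, not the library's actual _process_message_thread code), the loop below waits for a message with a timeout, sends a probe when the wait expires, and raises only once the probe also goes unanswered. The names wait_for_message and send_echo are hypothetical, and a queue.Queue stands in for the transport.

```python
import queue


def wait_for_message(recv_queue, send_echo, timeout):
    """Return the next message, probing the peer once if the wait times out.

    Hypothetical helper: recv_queue stands in for the transport and
    send_echo for whatever sends the SMB2_ECHO request.
    """
    echo_pending = False
    while True:
        try:
            return recv_queue.get(timeout=timeout)
        except queue.Empty:
            if echo_pending:
                # The probe also went unanswered for a full timeout interval.
                raise TimeoutError("Connection timed out")
            # First timeout: probe the peer and wait one more interval.
            send_echo()
            echo_pending = True
```

In this sketch a dead peer is only reported after the wait has expired twice, which is why a large timeout value translates into a long hang.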

Expected behaviour:

The timeout should be configurable for cases where network errors are common and handling each one would otherwise mean waiting 10+ minutes.

This might also be a fix for #117 (mentions a similar problem), by allowing configuring a smaller timeout value, which does not keep the application stuck for long and allows the error to be handled in my application in a timely manner.

The PR extends the health checks added in #135 with the configurable health check interval value.

DragosFlorea · 2024-08-14T14:07:04Z

This would be very helpful for my use case as well

jborean93 · 2024-08-15T02:08:48Z

Thanks for the PR.

I am somewhat wary about exposing the retry mechanism timeout as a public API as I feel like the implementation is a bit of a bandaid over the real problem rather than a good solution. I know I've been dragging my heels on trying to rework the TCP transport side, where my aim was to improve this situation, but I'm on the fence about accepting this type of change.

The proper solution in my mind is to figure out how to detect when the socket has been "dropped", but potentially in your situation the client still thinks it's connected and thus will wait for the full timeout.
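The situation described here, where the client still believes the socket is connected, can be approximated locally with a peer that accepts the connection and then never replies. The sketch below is an assumption-laden stand-in for a real network drop, not a reproducer taken from this thread: recv_from_silent_peer is a hypothetical helper, and it uses plain sockets rather than smbprotocol.

```python
import socket
import threading


def recv_from_silent_peer(timeout):
    """Connect to a local server that accepts and then never sends data."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    release = threading.Event()

    def accept_and_stay_silent():
        conn, _ = server.accept()
        release.wait()  # hold the connection open without replying
        conn.close()

    worker = threading.Thread(target=accept_and_stay_silent, daemon=True)
    worker.start()

    client = socket.create_connection(server.getsockname(), timeout=timeout)
    try:
        client.recv(1024)
        return "received data or EOF"
    except socket.timeout:
        # Nothing signals the client that the peer is gone; only the
        # receive timeout ends the wait.
        return "timed out"
    finally:
        release.set()
        client.close()
        worker.join()
        server.close()
```

From the client's side the socket looks healthy for the whole wait, so the length of the hang is determined entirely by the timeout passed in.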

DragosFlorea · 2024-08-15T21:28:08Z

I understand your concern... And yes, the best solution is to know exactly when the connection has been dropped, but logic on the client side to cut off the connection after a timeout is not bad either. I mean you can have this as a backup if you cannot detect the dropped connection.
If you do not want to expose the timeout as a public API, which is a low-level configuration that can mess things up, what do you say about an environment variable marked as experimental until you have a better solution?

jborean93 · 2024-08-19T04:25:41Z

I think I can justify an environment variable here, especially if it's helping in some situations. I would have to stress it wouldn't be covered under the public API and could change/be removed in the future.

The socket stuff can be complicated. If you know the reason why the socket was closed but never reported back to smbprotocol then I'd love to see a reproducer for it. That'll help me test out new scenarios in the future and hopefully avoid the need for the timeout altogether.

leonnis12 · 2024-08-19T11:29:39Z

I agree that the proper solution would be to detect when the socket has been "dropped", but also that the environment variable "fix" should be enough to help in the situations where this happens, until a proper solution can be found at least.

I've updated the pull request to not expose the timeout through the public API, and instead be available to be overridden through an environment variable marked as experimental from the Connection class.
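For readers unfamiliar with the pattern, an environment variable override of this kind might look roughly like the sketch below. This is only an illustration: the variable name EXAMPLE_SMB_RECV_TIMEOUT and the helper get_recv_timeout are made up for the example and are not the names used in the PR; the 600 second default reflects the 10 minute value mentioned earlier in the thread.

```python
import os

# Default taken from the 10 minute timeout discussed in this thread.
DEFAULT_RECV_TIMEOUT = 600


def get_recv_timeout(environ=None, var_name="EXAMPLE_SMB_RECV_TIMEOUT"):
    """Return the receive timeout in seconds, honouring an optional override.

    var_name is a placeholder chosen for this sketch, not the variable
    added by the PR.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(var_name)
    if raw is None:
        return DEFAULT_RECV_TIMEOUT
    try:
        value = int(raw)
    except ValueError:
        # Ignore values that are not integers and keep the default.
        return DEFAULT_RECV_TIMEOUT
    # Reject zero or negative values, which would make no sense as a wait.
    return value if value > 0 else DEFAULT_RECV_TIMEOUT
```

Falling back to the default on missing or invalid input keeps the existing behaviour unchanged for anyone who does not set the variable.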

Regarding reproducing the issue, simulating a network that drops packets from time to time or has random response time spikes should be enough to trigger it for further testing in the future. The following is an example from the network where I encountered the issue constantly.

codecov · 2024-08-19T11:36:39Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.04%. Comparing base (2ce49ef) to head (8e16e2b).
Report is 1 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #285   +/-   ##
=======================================
  Coverage   99.04%   99.04%           
=======================================
  Files          24       24           
  Lines        5115     5116    +1     
=======================================
+ Hits         5066     5067    +1     
  Misses         49       49

Flag	Coverage Δ
macOS	`68.15% <100.00%> (+<0.01%)`	⬆️
py3.10	`99.00% <100.00%> (+<0.01%)`	⬆️
py3.11	`99.00% <100.00%> (+<0.01%)`	⬆️
py3.12	`99.00% <100.00%> (+<0.01%)`	⬆️
py3.8	`99.00% <100.00%> (+<0.01%)`	⬆️
py3.9	`99.04% <100.00%> (+<0.01%)`	⬆️
ubuntu	`96.81% <100.00%> (+<0.01%)`	⬆️
windows	`98.96% <100.00%> (+<0.01%)`	⬆️
x64	`99.04% <100.00%> (+<0.01%)`	⬆️
x86	`98.96% <100.00%> (+<0.01%)`	⬆️

jborean93 · 2024-08-19T19:40:41Z

I'll need to sort out the CI problems on macOS separately, I believe the image version now uses the arm64 builds so the x86_64 ones are failing without further work needed. That's not your problem to deal with so thanks for the changes and agreeing to the env var solution for now!

leonnis12 · 2024-08-21T07:36:13Z

Would it be possible to make a minor release which includes this fix, until the changes for 1.14 are ready? It would be really helpful.

jborean93 · 2024-08-25T21:26:57Z

v1.14.0 has been published on PyPI with this change https://pypi.org/project/smbprotocol/1.14.0/.

Added customizable receive timeout for SMB Connection

6dde537

alin-opswat approved these changes Aug 14, 2024

Refactored customizable receive timeout as experimental environment variable

8e16e2b

jborean93 approved these changes Aug 19, 2024

jborean93 merged commit 1572393 into jborean93:master Aug 19, 2024
22 of 25 checks passed
