Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(iot-dev): fix thread-leak issue with IotHubReconnectTask #1567

Merged
merged 5 commits into from
Jul 6, 2022

Conversation

brycewang-microsoft
Copy link
Collaborator

For #1563:

Before this fix, the reconnect thread was not cleared appropriately. As a result, an instance of IotHubReconnectTask would be created per connection-status change from "DISCONNECTED_RETRYING" to "CONNECTED", while the previous thread was still waiting until the connection turned into "DISCONNECTED" eventually.

Thread dump before this fix:
1

Thread dump after this fix:
2

@brycewang-microsoft
Copy link
Collaborator Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 4 pipeline(s).

@timtay-microsoft
Copy link
Member

I double checked that we didn't have any threads leaking here, and it seems your fix works in that regard. That said, when I go through a basic connection loss event, I see that the SDK behaves like this:

  • Client opens initial connection
    • Send/Receive/Reconnect threads start
  • Client loses connection
    • Send/Receive threads stop
  • Client regains connection
    • Send/Receive threads stop (already stopped, so this is a no-op)
    • Reconnect thread stops*
    • Reconnect thread starts**

I'd rather we didn't stop the reconnect thread here* just to create a new thread pool here**. If the reconnect task is already running, then we don't need to do anything to it after we regain connectivity.

Something like:

private void startWorkerThreads()
{
	stopSendAndReceiveThreads();

	this.sendTaskScheduler = Executors.newScheduledThreadPool(1);
	this.receiveTaskScheduler = Executors.newScheduledThreadPool(1);

	this.sendTaskScheduler.scheduleWithFixedDelay(this.sendTask, 0,
			sendPeriodInMilliseconds, TimeUnit.MILLISECONDS);
	this.receiveTaskScheduler.scheduleWithFixedDelay(this.receiveTask, 0,
			receivePeriodInMilliseconds, TimeUnit.MILLISECONDS);


	// This is only set to null if the client as a whole has been closed. This thread pool stays active through disconnected_retrying
	if (this.reconnectTaskScheduler == null)
	{
		this.reconnectTaskScheduler = Executors.newScheduledThreadPool(1);

		this.reconnectTaskScheduler.scheduleWithFixedDelay(this.reconnectTask, 0,
				receivePeriodInMilliseconds, TimeUnit.MILLISECONDS);	
	}
}

@brycewang-microsoft
Copy link
Collaborator Author

I'd rather we didn't stop the reconnect thread here* just to create a new thread pool here**. If the reconnect task is already running, then we don't need to do anything to it after we regain connectivity.

Yes, this seems more reasonable!

@brycewang-microsoft
Copy link
Collaborator Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 4 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants