You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Java runtime used: openjdk version "17.0.8" 2023-07-18
SDK version used: 2.1.1 and 2.1.0
Description of the issue
To contextualize a bit, I have a service running under my device that implements Azure DPS and allows clients to provision their devices using symmetric keys. I have a task that periodically retries the provisioning in case of failure, however, the user can start/cancel the provisioning process and the retry task (before the device is assigned in the cloud) at any time. Provisioning is done using the async ProvisioningDeviceClient.registerDevice() method.
However, during tests with some failure cases, after killing the provisioning process at some times I would have a azure-iot-sdk-RegisterTask named thread still lingering around in a WAITING state and it never finishes.
After some debugging, I believe that this happens when the provisioning process is asked to be stopped (thus ProvisioningDeviceClient.close() is called) too quickly after calling ProvisioningDeviceClient.registerDevice().
It appears to me that, if close() is called before the task from RegisterTask is done, the thread will never finish.
I was able to create a scenario to reproduce this problem using a sample code from the SDK with some modifications to use the async method instead (attached below).
If you take a thread dump after the Press any key to exit... message appears in the console, there should be a azure-iot-sdk-RegisterTask thread running.
If you switch the commented lines and wait a longer time, all DPS threads should be cleared. (During some tests I saw a ProvisioningTask thread still running, but that one was finalized after some seconds).
Code sample exhibiting the issue
packagesamples.com.microsoft.azure.sdk.iot;
importcom.microsoft.azure.sdk.iot.device.*;
importcom.microsoft.azure.sdk.iot.device.exceptions.IotHubClientException;
importcom.microsoft.azure.sdk.iot.provisioning.device.*;
importcom.microsoft.azure.sdk.iot.provisioning.device.internal.exceptions.ProvisioningDeviceClientException;
importcom.microsoft.azure.sdk.iot.provisioning.security.SecurityProviderSymmetricKey;
importjava.io.IOException;
importjava.nio.charset.StandardCharsets;
importjava.util.Scanner;
/** * Symmetric Key authenticated individual enrollment sample */@SuppressWarnings("CommentedOutCode") // Ignored in samples as we use these comments to show other options.publicclassProvisioningSymmetricKeyIndividualEnrollmentSample
{
// The scope Id of your DPS instance. This value can be retrieved from the Azure PortalprivatestaticfinalStringID_SCOPE = "removed";
// Typically "global.azure-devices-provisioning.net"privatestaticfinalStringGLOBAL_ENDPOINT = "removed";
// The symmetric key of the individual enrollment. Unlike with enrollment groups, this key can be used directly.// For the sake of security, you shouldn't save keys into String variables as that places them in heap memory. For the sake// of simplicity within this sample, though, we will save it as a string. Typically this key would be loaded as byte[] so that// it can be removed from stack memory.privatestaticfinalStringSYMMETRIC_KEY = "removed";
// The registration Id to provision the device to. When creating an individual enrollment prior to running this sample, you choose this value.privatestaticfinalStringREGISTRATION_ID = "removed";
// Uncomment one line to choose which protocol you'd like to useprivatestaticfinalProvisioningDeviceClientTransportProtocolPROVISIONING_DEVICE_CLIENT_TRANSPORT_PROTOCOL = ProvisioningDeviceClientTransportProtocol.HTTPS;
//private static final ProvisioningDeviceClientTransportProtocol PROVISIONING_DEVICE_CLIENT_TRANSPORT_PROTOCOL = ProvisioningDeviceClientTransportProtocol.MQTT;//private static final ProvisioningDeviceClientTransportProtocol PROVISIONING_DEVICE_CLIENT_TRANSPORT_PROTOCOL = ProvisioningDeviceClientTransportProtocol.MQTT_WS;//private static final ProvisioningDeviceClientTransportProtocol PROVISIONING_DEVICE_CLIENT_TRANSPORT_PROTOCOL = ProvisioningDeviceClientTransportProtocol.AMQPS;//private static final ProvisioningDeviceClientTransportProtocol PROVISIONING_DEVICE_CLIENT_TRANSPORT_PROTOCOL = ProvisioningDeviceClientTransportProtocol.AMQPS_WS;publicstaticvoidmain(String[] args) throwsException
{
System.out.println("Starting...");
System.out.println("Beginning setup.");
SecurityProviderSymmetricKeysecurityClientSymmetricKey;
securityClientSymmetricKey = newSecurityProviderSymmetricKey(SYMMETRIC_KEY.getBytes(StandardCharsets.UTF_8), REGISTRATION_ID);
ProvisioningDeviceClientprovisioningDeviceClient = ProvisioningDeviceClient.create(
GLOBAL_ENDPOINT,
ID_SCOPE,
PROVISIONING_DEVICE_CLIENT_TRANSPORT_PROTOCOL,
securityClientSymmetricKey);
provisioningDeviceClient.registerDevice(ProvisioningSymmetricKeyIndividualEnrollmentSample::registrationCallback, null);
// Thread.sleep(5000); // Wait enough time before closing, no lingering thread.Thread.sleep(500); // RegisterTask should still be running when Scanner line is reached and will never finish.provisioningDeviceClient.close();
System.out.println("Press any key to exit...");
newScanner(System.in, StandardCharsets.UTF_8.name()).nextLine();
}
privatestaticvoidregistrationCallback(
ProvisioningDeviceClientRegistrationResultprovisioningDeviceClientRegistrationResult,
Exceptionexception, Objectcontext)
{
System.out.println("Inside registration callback function");
if (provisioningDeviceClientRegistrationResult.getProvisioningDeviceClientStatus() == ProvisioningDeviceClientStatus.PROVISIONING_DEVICE_STATUS_ASSIGNED)
{
System.out.println("IotHub Uri : " + provisioningDeviceClientRegistrationResult.getIothubUri());
System.out.println("Device ID : " + provisioningDeviceClientRegistrationResult.getDeviceId());
}
}
}
Console log of the issue
Log of the code variant that presents the issue (Sleeping for 500ms)
Connected to the target VM, address: '127.0.0.1:36605', transport: 'socket'
Starting...
Beginning setup.
2023-08-30 02:15:19,518 DEBUG (main) [com.microsoft.azure.sdk.iot.provisioning.device.ProvisioningDeviceClient] - Initialized a ProvisioningDeviceClient instance using SDK version 2.1.1
2023-08-30 02:15:19,522 DEBUG (main) [com.microsoft.azure.sdk.iot.provisioning.device.ProvisioningDeviceClient] - Starting provisioning thread...
2023-08-30 02:15:19,529 INFO (global.azure-devices-provisioning.net-5c1594e3-CxnPendingConnectionId-azure-iot-sdk-ProvisioningTask) [com.microsoft.azure.sdk.iot.provisioning.device.internal.task.ProvisioningTask] - Opening the connection to device provisioning service...
2023-08-30 02:15:19,672 INFO (global.azure-devices-provisioning.net-5c1594e3-Cxn5c1594e3-azure-iot-sdk-ProvisioningTask) [com.microsoft.azure.sdk.iot.provisioning.device.internal.task.ProvisioningTask] - Connection to device provisioning service opened successfully, sending initial device registration message
2023-08-30 02:15:19,683 INFO (global.azure-devices-provisioning.net-5c1594e3-Cxn5c1594e3-azure-iot-sdk-RegisterTask) [com.microsoft.azure.sdk.iot.provisioning.device.internal.task.RegisterTask] - Authenticating with device provisioning service using symmetric key
Press any key to exit...
Thread dump of this run
Log of the code variant issue-free (Sleeping for 5 secs)
Connected to the target VM, address: '127.0.0.1:37143', transport: 'socket'
Starting...
Beginning setup.
2023-08-30 02:23:02,615 DEBUG (main) [com.microsoft.azure.sdk.iot.provisioning.device.ProvisioningDeviceClient] - Initialized a ProvisioningDeviceClient instance using SDK version 2.1.1
2023-08-30 02:23:02,619 DEBUG (main) [com.microsoft.azure.sdk.iot.provisioning.device.ProvisioningDeviceClient] - Starting provisioning thread...
2023-08-30 02:23:02,628 INFO (global.azure-devices-provisioning.net-9301ed5f-CxnPendingConnectionId-azure-iot-sdk-ProvisioningTask) [com.microsoft.azure.sdk.iot.provisioning.device.internal.task.ProvisioningTask] - Opening the connection to device provisioning service...
2023-08-30 02:23:02,739 INFO (global.azure-devices-provisioning.net-9301ed5f-Cxn9301ed5f-azure-iot-sdk-ProvisioningTask) [com.microsoft.azure.sdk.iot.provisioning.device.internal.task.ProvisioningTask] - Connection to device provisioning service opened successfully, sending initial device registration message
2023-08-30 02:23:02,741 INFO (global.azure-devices-provisioning.net-9301ed5f-Cxn9301ed5f-azure-iot-sdk-RegisterTask) [com.microsoft.azure.sdk.iot.provisioning.device.internal.task.RegisterTask] - Authenticating with device provisioning service using symmetric key
Inside registration callback function
Press any key to exit...
Thread dump of this run
The text was updated successfully, but these errors were encountered:
…t cleanup all threads
If the provisioning task was interrupted by the user closing the client, it needs to shutdown its executor service that runs the RegisterTask
This fix addresses #1736
Thanks for catching this. It looks like the thread running the ProvisioningTask (which owns the thread that runs the RegisterTask thread that is leaking) isn't catching interrupted exceptions and cleaning up appropriately.
…t cleanup all threads (#1741)
If the provisioning task was interrupted by the user closing the client, it needs to shutdown its executor service that runs the RegisterTask
This fix addresses #1736
Context
Description of the issue
To contextualize a bit, I have a service running under my device that implements Azure DPS and allows clients to provision their devices using symmetric keys. I have a task that periodically retries the provisioning in case of failure, however, the user can start/cancel the provisioning process and the retry task (before the device is assigned in the cloud) at any time. Provisioning is done using the async
ProvisioningDeviceClient.registerDevice()
method.However, during tests with some failure cases, after killing the provisioning process at some times I would have a
azure-iot-sdk-RegisterTask
named thread still lingering around in aWAITING
state and it never finishes.After some debugging, I believe that this happens when the provisioning process is asked to be stopped (thus
ProvisioningDeviceClient.close()
is called) too quickly after callingProvisioningDeviceClient.registerDevice()
.It appears to me that, if
close()
is called before the task fromRegisterTask
is done, the thread will never finish.I was able to create a scenario to reproduce this problem using a sample code from the SDK with some modifications to use the async method instead (attached below).
If you take a thread dump after the
Press any key to exit...
message appears in the console, there should be aazure-iot-sdk-RegisterTask
thread running.If you switch the commented lines and wait a longer time, all DPS threads should be cleared. (During some tests I saw a
ProvisioningTask
thread still running, but that one was finalized after some seconds).Code sample exhibiting the issue
Console log of the issue
Log of the code variant that presents the issue (Sleeping for 500ms)
Thread dump of this run
![Screenshot from 2023-08-30 02-15-41](https://private-user-images.githubusercontent.com/1234896/264102050-735b0931-2dfc-4bbd-8130-bb7c69de2f6d.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzODAwNzIsIm5iZiI6MTczOTM3OTc3MiwicGF0aCI6Ii8xMjM0ODk2LzI2NDEwMjA1MC03MzViMDkzMS0yZGZjLTRiYmQtODEzMC1iYjdjNjlkZTJmNmQucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxMiUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTJUMTcwMjUyWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9MzhiNWVlM2Y4MmFhNTg3Y2YyYTk0Y2MwZDk2NDNhYzU2ZTIwMjg2MjZiMjg4N2FmZGIwYjZhY2FlZDNkMGFjMyZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.MIZfdHyhwhwlU0zb9KJb0OafsAWcGI_4WMBlCytwI_0)
Log of the code variant issue-free (Sleeping for 5 secs)
Thread dump of this run
![Screenshot from 2023-08-30 02-15-09](https://private-user-images.githubusercontent.com/1234896/264102190-6e3981dd-2bdd-4634-b620-0d0c4887582e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzODAwNzIsIm5iZiI6MTczOTM3OTc3MiwicGF0aCI6Ii8xMjM0ODk2LzI2NDEwMjE5MC02ZTM5ODFkZC0yYmRkLTQ2MzQtYjYyMC0wZDBjNDg4NzU4MmUucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxMiUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTJUMTcwMjUyWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9YzMzN2M4N2E2NTNhOWRiNDhmZjVhYjVjYjhjOTc3MTM1MGZjNDk3OTM1MWEzMzdiYmIzNjFjNjQxMWE4YTg1ZCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.1lKzDH_f38hH07Rc0vJfqlOajv-ZwpN-OPsqPqoMVG8)
The text was updated successfully, but these errors were encountered: