IoTHubDeviceClient_UploadMultipleBlocksToBlob_Async can miss some errors #2569
Comments
Hi @pulkomandy, could you please take a look at these new, more granular upload-to-blob API functions? They would give you more control over detecting intermediate errors and attempting retries.
But it's well noted: your assessment of the issue is correct. We will look into fixing it. Update coming soon.
Hello, We have been using this patch to fix the problem:
Sorry, we forgot to update the issue with the patch. We have confirmed that it works for us and solves the problem. We know of the new granular API, but if we can avoid rewriting our code with a simple fix like this, it saves us a lot of time.
Thank you a lot, @pulkomandy.
Hi @pulkomandy, we merged a fix for this issue. Thank you very much for reporting it!
Hello, Thanks for the fix. I don't know if we can fully verify it, since we noticed the error after a problem on the Azure IoT cloud side, which we can't reproduce predictably. We will try to update the version of the SDK we use whenever possible (possibly waiting for the next release).
Perfect. Thank you @pulkomandy, we will close this issue for now then.
Hello @ewertons, we have started using this patch and it improves things. Unfortunately, we have hit another problematic case. The following scenario happens:
At this point, our code decides that the upload is successful, and we delete the corresponding local file and the upload context. However, later on there is an error in IotHubClient_LL_UploadToBlob_NotifyCompletion (this happens 1 minute later in our case, since it is a timeout error):
Since there were no errors during the upload phase, isCallbackInvokedWithError is not set, and so the error callback is called. With a synchronous upload function such as IoTHubClientCore_LL_UploadMultipleBlobsToBlockEx, this would be equivalent to the callback reporting success but the function eventually returning an error code. As a result, we can't know for sure when an upload is really fully completed. Even after we have received a notification with FILE_UPLOAD_OK and data or size == NULL, which should mean the upload completed successfully, we can still receive an error callback later on. So we don't know when it is safe to consider an upload complete and successful. I'm not sure what the best option is here; some ideas are:
Hi @pulkomandy , I was investigating this latest issue and I think I got it wrong the first time.
That way, for the convenience layer the final callback will always be called after the whole upload-to-blob logic is completed. I'll post a PR with those changes; once I do that, could you please validate it as well before we merge it?
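To make the ordering described above concrete, here is a minimal sketch in plain C. It is not the SDK's code: `run_upload`, `upload_steps`, `upload_result`, and the `fake_*` helpers are all hypothetical names invented for illustration. The sketch assumes two phases (sending the blocks, then notifying completion) and shows the final callback being invoked exactly once, only after both phases have finished, with their combined result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not SDK types. */
typedef enum { UPLOAD_OK, UPLOAD_ERROR } upload_result;

typedef bool (*upload_fn)(void *state);
typedef bool (*notify_fn)(void *state, bool upload_succeeded);
typedef void (*final_cb)(upload_result result, void *context);

typedef struct {
    upload_fn upload_blocks;      /* sends every block to the blob */
    notify_fn notify_completion;  /* reports the outcome to the hub */
    void *state;
} upload_steps;

/* Runs both phases, then reports one combined result exactly once. */
upload_result run_upload(const upload_steps *steps, final_cb on_done, void *context)
{
    assert(steps != NULL && steps->upload_blocks != NULL);
    assert(steps->notify_completion != NULL && on_done != NULL);

    bool uploaded = steps->upload_blocks(steps->state);
    /* In this sketch the notification is attempted even after a failed upload. */
    bool notified = steps->notify_completion(steps->state, uploaded);

    upload_result result = (uploaded && notified) ? UPLOAD_OK : UPLOAD_ERROR;
    on_done(result, context);  /* final callback: no later step can still fail */
    return result;
}

/* Fakes used to exercise the ordering. */
typedef struct { int calls; upload_result last; } final_record;

bool fake_upload_ok(void *state) { (void)state; return true; }
bool fake_notify_ok(void *state, bool ok) { (void)state; (void)ok; return true; }
bool fake_notify_timeout(void *state, bool ok) { (void)state; (void)ok; return false; }

void record_final(upload_result result, void *context)
{
    final_record *rec = context;
    rec->calls++;
    rec->last = result;
}
```

With `fake_upload_ok` and `fake_notify_timeout` plugged in, the recorded final result is `UPLOAD_ERROR` and the callback count is 1, so the application never sees a success followed by a late error.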
Hello,
Thanks for the update.
The solution you describe seems fine to me, and may be a bit simpler than what I suggested in my merge request. I am out of office this week, but that isn't a problem, let me know when you have a patch ready for testing and I'll forward it to my colleagues, or look into it when I'm back.
It will take some time before we ship a release to our customer and they are able to test it. Unfortunately, we could reproduce these problems only with devices in the field and not with our testing platform.
This is the initial PR. I could not run our pipeline against it yet, but it worked locally in my sample run.
PR #2615 posted and pipeline tests run. Please review if possible, we will be merging it soon.
Hi @pulkomandy,
** Environment **
** Problem description **
We use IoTHubDeviceClient_UploadMultipleBlocksToBlob_Async to start an upload. However, there is an error during the upload setup. Logs extract:
These errors come from:
azure-iot-sdk-c/iothub_client/src/iothub_client_ll_uploadtoblob.c (line 908 in 6b77538)
azure-iot-sdk-c/iothub_client/src/iothub_client_core_ll.c (line 2881 in 6b77538)
In our case we are using asynchronous upload so this is called from here inside uploadMultipleBlock_thread:
https://github.com/Azure/azure-iot-sdk-c/blob/main/iothub_client/src/iothub_client_core.c#L2437
The thread function returns an error, but this is not propagated to the caller function that started the thread: https://github.com/Azure/azure-iot-sdk-c/blob/main/iothub_client/src/iothub_client_core.c#L2475 (and it can't be, because it's happening asynchronously in another thread).
There is no way for the calling code to know that the upload has failed: IoTHubClientCore_UploadMultipleBlocksToBlobAsync returns without reporting an error, but the upload callback is never called. I think it would make sense in this case to call the callback (IOTHUB_CLIENT_FILE_UPLOAD_GET_DATA_CALLBACK_EX getDataCallbackEx) with an IOTHUB_CLIENT_FILE_UPLOAD_RESULT error code indicating a failed upload. Otherwise, our software has no way to know that the upload has failed and should be retried. In our case, we make sure to only have one upload at a time, so if this happens we stay blocked and never upload anything again.
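As an illustration of what the application side could do if the callback were invoked with an error result as proposed, here is a minimal sketch. It assumes the callback shape of IOTHUB_CLIENT_FILE_UPLOAD_GET_DATA_CALLBACK_EX (result, data pointer, size pointer, context). The two enums are local stand-ins that mirror the SDK names so the example compiles without the SDK headers, and `upload_ctx` and `get_data_cb` are hypothetical application names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins mirroring the SDK enum names so the sketch compiles alone;
   a real build would take them from the SDK headers instead. */
typedef enum { FILE_UPLOAD_OK, FILE_UPLOAD_ERROR } IOTHUB_CLIENT_FILE_UPLOAD_RESULT;
typedef enum {
    IOTHUB_CLIENT_FILE_UPLOAD_GET_DATA_OK,
    IOTHUB_CLIENT_FILE_UPLOAD_GET_DATA_ABORT
} IOTHUB_CLIENT_FILE_UPLOAD_GET_DATA_RESULT;

/* Hypothetical application state for the single in-flight upload. */
typedef struct {
    const unsigned char *buf;
    size_t len, sent, block_size;
    bool in_progress, needs_retry;
} upload_ctx;

IOTHUB_CLIENT_FILE_UPLOAD_GET_DATA_RESULT get_data_cb(
    IOTHUB_CLIENT_FILE_UPLOAD_RESULT result,
    unsigned char const **data, size_t *size, void *context)
{
    upload_ctx *ctx = context;
    assert(ctx != NULL);

    if (result != FILE_UPLOAD_OK) {
        /* Failure reported: free the upload slot and schedule a retry. */
        ctx->in_progress = false;
        ctx->needs_retry = true;
        return IOTHUB_CLIENT_FILE_UPLOAD_GET_DATA_ABORT;
    }
    if (data == NULL || size == NULL) {
        /* Final call with a success result: the upload is finished. */
        ctx->in_progress = false;
        return IOTHUB_CLIENT_FILE_UPLOAD_GET_DATA_OK;
    }

    /* Hand out the next block; a size of 0 signals that no data is left. */
    size_t remaining = ctx->len - ctx->sent;
    size_t n = remaining < ctx->block_size ? remaining : ctx->block_size;
    *data = (n > 0) ? ctx->buf + ctx->sent : NULL;
    *size = n;
    ctx->sent += n;
    return IOTHUB_CLIENT_FILE_UPLOAD_GET_DATA_OK;
}
```

The point of the design is that the error branch clears `in_progress`, so an application that allows only one upload at a time is unblocked and can retry instead of waiting forever for a callback that never arrives.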