-
Notifications
You must be signed in to change notification settings - Fork 633
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HandleKeepAlive frequently calls MQTT_Ping #1915
Comments
It is updated here - https://github.com/FreeRTOS/coreMQTT/blob/main/source/core_mqtt.c#L1794.
It is needed for both idle Tx or Rx and the rationale is explained in this PR - FreeRTOS/coreMQTT#191.
This means the transport interface told us that it has sent 2 bytes.
How did you determine that no data was received on the server? |
So far, according to the client log, the sdk has successfully sent a 2bytes packet, but received no reply, and then returned to iot_core MQTTKeepAliveTimeout. We do not know if aws iot server received the data. We want to know what might cause this to happen and whether there is log support for debug on the server side. |
You can try this - https://docs.aws.amazon.com/iot/latest/developerguide/cloud-watch-logs.html. I'd also suggest to capture network traffic from the device. |
Would you please describe the problem that you discovered? I do not understand the problem from the image you shared. |
Thank you for explaining. No, we are not aware of any such issue. Would you please run the process in the debugger and see why is crashes? |
We opened cloudwatch logs on the server side according to your suggestion. After the camera sends data at 14:49:58, the server logs from 14:45:50 to 14:50:50 as follows (in the json file, the time difference is 8 hours). It seems that there is no more detailed log. Does it mean that the server has not received the package? And because MQTT uses TLS protocol encryption, it is difficult to repeat the problem and capture the packet. Is there any suggestion to help us debug? |
Assuming you are running on Linux, you should be able to use GDB. There is plenty of tutorials available on how to use GDB.
Most likely.
Your underlying transport is failing to send the data. Even though we wont be able to decrypt network traffic, it will tell us if the PINGREQ is put on wire or not. |
I would like to ask some specific questions. Before, you said that MQTT_Ping has a return value, indicating that the client has sent a message out. Under what circumstances will this return value be obtained? Did it actually send the data? Because according to your later statement, it seems that you suspect that the client is not sending the data at the bottom. And how to tell if the server did not receive the packet rather than received the packet but did not reply?Is there any way to confirm that the server really did not receive the packet? |
MQTT_Ping does have a return value as you can see here - https://github.com/FreeRTOS/coreMQTT/blob/main/source/core_mqtt.c#L2914.
A function's return value can be obtained under any circumstance. What do you mean by this question?
coreMQTT uses transport interface to send the data, the implementation of which is supplied by the application. In this case, you are likely using one of the following implementation of transport interface - https://github.com/aws/aws-iot-device-sdk-embedded-C/tree/main/platform/posix/transport/src. If the transport interface returns success indicating that it sent the data but does not send it, the MQTT library has no way to find if the data was not sent. Assuming you are using Linux, this scenario is possible because Linux copies the data into the kernel buffers and returns success to the application - it may later fail to send the data if the connection is broken.
That is why I have been asking you to capture network traffic and share? Are you able to do that? |
Related to aws#1915 Update `handleKeepAlive` in `source/core_mqtt.c` to correctly update `pContext->lastPacketTime` when data is received. * Ensure `MQTT_Ping` is not called frequently due to `pContext->lastPacketTime` not updating. * Retain the logic for sending ping packets based on both received and sent message intervals. * Address the issue related to the underlying transport failing to send the data, ensuring reliable data transmission.
@xiaomizhouya Did you get a chance to debug it further? |
Hey @xiaomizhouya, Looking at the following in the CloudWatch logs, the client was disconnected by the broker because of MQTT_KEEP_ALIVE_TIMEOUT :
According to this it means that there was no communication between the server and the client for 1.5x the keep alive time. Hence AWS IoT has not received the PINGREQ. Also the fact that |
Closing this issue. Feel free to re-open if you need anything else. |
When testing MQTT_ProcessLoop, we noticed a problem that MQTT_Ping was called frequently in the handleKeepAlive function, as shown in the red box in the figure, and every loop would come into this judgment logic. We found that in the SDK, when receiving a message from the server, pContext ->lastPacketTime does not change its value, which means that pContext ->lastPacketTime is always 0; When the current time exceeds PACKET-RX_TIMEOUT_S, it will continue to enter this logic and frequently send packets. Excuse me, is this a bug?
Also, I don't quite understand the logic of this logic. Why do I need to send ping packets based on the received message interval? My understanding is that sending ping packets based on the interval between the last message sent is sufficient. I hope to receive an answer.
Due to the above issue, we encountered a problem. We added log in the sdk and found that the device called sendBuffer function in MQTT_PING function to send data. When the return value is 2, can it indicate that the device has successfully sent data? However, according to the log of the server, no data packet was received from the device. I mean, what's the reason for this? Did the packet get lost in the network? May I ask if there is any log on the server side to assist us in debugging?
The text was updated successfully, but these errors were encountered: