Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix: Fix interrupted network read operation #120

Merged
merged 34 commits into from
Dec 5, 2020
Merged
Show file tree
Hide file tree
Changes from 15 commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
5ec1c2f
Fix issue of data recv being interrupted
aggarw13 Dec 3, 2020
4205109
Rename elapsedTimeMs variable to its express its new meaning
aggarw13 Dec 3, 2020
0704c72
Use configuration const for recvExact timeout
aggarw13 Dec 3, 2020
adce766
Remove timeout check from discardPacket and address CI check failures
aggarw13 Dec 3, 2020
f5aa1b3
Fix more CI check failures
aggarw13 Dec 3, 2020
76166d0
Remove another unused local variable
aggarw13 Dec 3, 2020
1d73bb7
Re-instate timeout in discard to reduce scope of changes
aggarw13 Dec 3, 2020
0c5a9e3
Remove unused variable again
aggarw13 Dec 3, 2020
34adc86
Fix failing unit test
aggarw13 Dec 3, 2020
2a3cc9a
Rename new config macro, and attempt to fix CBMC failures
aggarw13 Dec 3, 2020
70b181a
Doc: Improvement suggestions from code review
aggarw13 Dec 3, 2020
68fcb16
Fix quality check failures
aggarw13 Dec 3, 2020
4ba8c53
Add test case to check partial network reads with zero timeout durati…
aggarw13 Dec 3, 2020
b710790
style: Improving naming
aggarw13 Dec 3, 2020
f8586dc
Address complexity failure
aggarw13 Dec 3, 2020
c59d39b
Address comments
aggarw13 Dec 4, 2020
af1ca02
Doc: Add blocking time equation of Receive/ProcessLoop functions in t…
aggarw13 Dec 4, 2020
86798bf
Improvement in API doc
aggarw13 Dec 4, 2020
e1de841
Set MQTT_RECV_POLLING_TIMEOUT_MS so that recvExact runs in one iterat…
sarenameas Dec 4, 2020
9a792bc
Merge branch 'main' into fix/recv-timeout-issue
aggarw13 Dec 4, 2020
ebcfb92
doc: Add information about zero return value for Transport_Recv_t
aggarw13 Dec 4, 2020
fe0a071
Merge branch 'fix/recv-timeout-issue' of https://github.com/aggarw13/…
aggarw13 Dec 4, 2020
32f0291
fix: prevent possibility of infinite loop in timeout logic of Process…
aggarw13 Dec 4, 2020
dd91637
Merge branch 'main' into fix/recv-timeout-issue
aggarw13 Dec 4, 2020
a4bf796
Merge branch 'main' into fix/recv-timeout-issue
aggarw13 Dec 4, 2020
bf6f94d
style: Minor changes
aggarw13 Dec 4, 2020
092e7ff
Merge branch 'fix/recv-timeout-issue' of https://github.com/aggarw13/…
aggarw13 Dec 4, 2020
f0a1808
hygiene: minor name fix
aggarw13 Dec 4, 2020
fcaf4bb
fix: Add additional check in timeout logic of recvExact
aggarw13 Dec 4, 2020
b923ea5
fix: Add additional check on recvExact
aggarw13 Dec 4, 2020
b1eba61
hygiene: move timeout logic within else block for better readability
aggarw13 Dec 4, 2020
25e74d6
hygiene: remove always true condition from if block
aggarw13 Dec 4, 2020
b4b0c78
Update CBMC proofs that use recvExact to specify a transport recv stu…
sarenameas Dec 5, 2020
95abb93
Uncrustify.:
sarenameas Dec 5, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
52 changes: 31 additions & 21 deletions source/core_mqtt.c
Original file line number Diff line number Diff line change
Expand Up @@ -71,17 +71,25 @@ static uint32_t calculateElapsedTime( uint32_t later,
static MQTTPubAckType_t getAckFromPacketType( uint8_t packetType );

/**
* @brief Receive bytes into the network buffer, with a timeout.
* @brief Receive bytes into the network buffer.
*
* @param[in] pContext Initialized MQTT Context.
* @param[in] bytesToRecv Number of bytes to receive.
* @param[in] timeoutMs Time remaining to receive the packet.
*
* @note This operation calls the transport receive function
* repeatedly to read bytes from the network until either:
* 1. The requested number of bytes @a bytesToRecv are read.
* OR
* 2. No data is received from the network for MQTT_RECV_POLLING_TIMEOUT_MS duration.
*
* OR
* 3. There is an error in reading from the network.
*
*
* @return Number of bytes received, or negative number on network error.
*/
static int32_t recvExact( const MQTTContext_t * pContext,
size_t bytesToRecv,
uint32_t timeoutMs );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the removal of this parameter, some more timeouts could probably be removed due to similar reasoning. I don't think it's critical though and definitely shouldn't be done in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree that the timeout passed to discardPacket and the timeout logic in the function can also be removed (as hygiene). However, I decided to keep the scope of this PR limited to just fixing the recvExact issue.

size_t bytesToRecv );

/**
* @brief Discard a packet from the transport interface.
Expand Down Expand Up @@ -683,13 +691,12 @@ static MQTTPubAckType_t getAckFromPacketType( uint8_t packetType )
/*-----------------------------------------------------------*/

static int32_t recvExact( const MQTTContext_t * pContext,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static int32_t recvExact( const MQTTContext_t * pContext,
                          size_t bytesToRecv )
{
    uint8_t * pIndex = NULL;
    size_t bytesRemaining = bytesToRecv;
    int32_t totalBytesRecvd = 0, bytesRecvd;
    uint32_t timestampBeforeRecvMs = 0U, timeSpentInRecvMs = 0U;
    TransportRecv_t recvFunc = NULL;
    MQTTGetCurrentTimeFunc_t getTimeStampMs = NULL;
    bool receiveError = false;

    assert( pContext != NULL );
    assert( bytesToRecv <= pContext->networkBuffer.size );
    assert( pContext->getTime != NULL );
    assert( pContext->transportInterface.recv != NULL );
    assert( pContext->networkBuffer.pBuffer != NULL );

    pIndex = pContext->networkBuffer.pBuffer;
    recvFunc = pContext->transportInterface.recv;
    getTimeStampMs = pContext->getTime;

    while( ( bytesRemaining > 0U ) && ( receiveError == false ) )
    {
        timestampBeforeRecvMs = getTimeStampMs();

        bytesRecvd = recvFunc( pContext->transportInterface.pNetworkContext,
                               pIndex,
                               bytesRemaining );

        timeSpentInRecvMs = calculateElapsedTime( getTimeStampMs(), timestampBeforeRecvMs );

        if( bytesRecvd < 0 )
        {
            LogError( ( "Network error while receiving packet: ReturnCode=%ld.",
                        ( long int ) bytesRecvd ) );
            totalBytesRecvd = bytesRecvd;
            receiveError = true;
        }
        else if( bytesRecvd > 0 )
        {
            /* It is a bug in the application's transport receive implementation
             * if more bytes than expected are received. To avoid a possible
             * overflow in converting bytesRemaining from unsigned to signed,
             * this assert must exist after the check for bytesRecvd being
             * negative. */
            assert( ( size_t ) bytesRecvd <= bytesRemaining );

            bytesRemaining -= ( size_t ) bytesRecvd;
            totalBytesRecvd += ( int32_t ) bytesRecvd;
            pIndex += bytesRecvd;
            LogDebug( ( "BytesReceived=%ld, BytesRemaining=%lu, "
                        "TotalBytesReceived=%ld.",
                        ( long int ) bytesRecvd,
                        ( unsigned long ) bytesRemaining,
                        ( long int ) totalBytesRecvd ) );
        }
        else
        {
            /* No bytes were read from the network. */
        }

        /* If there is more data to be received and nothing was received
         * for MQTT_RECV_POLLING_TIMEOUT_MS, treat this as error. */
        if( ( bytesRemaining > 0U ) &&
            ( timeSpentInRecvMs >= MQTT_RECV_POLLING_TIMEOUT_MS ) &&
            ( bytesRecvd == 0 ) )
        {
            LogError( ( "Time expired while receiving packet." ) );
            receiveError = true;
        }
    }

    return totalBytesRecvd;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only calculate the time spent for each call of recvFunc. Wouldn't there be an infinite loop if every recvFunc call returned 0 bytes but spent slightly less than MQTT_RECV_POLLING_TIMEOUT_MS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your code suggestion, timeSpentInRecvMs represents the time spent only in a single transport recv call, where as my changes look at the time spent across multiple calls to transport recv function that return zero bytes to determine if there has been a polling timeout.
With your code suggestion, there can be an issue of getting into an infinite loop of the transport recv function returning zero bytes, but the underlying SOCKET_TIMEOUT of transport recv being < MQTT_RECV_POLLING_TIMEOUT_MS, which will prevent the receiveError from being set to true and thus, the loop will not terminate

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. In that case, changing the variable names would make it a bit more readable:

entryTimeMs  --> lastDataRecvdTimestampMs
noDataRecvdTimeMs  --> timeSinceLastDataWasRecvd

size_t bytesToRecv,
uint32_t timeoutMs )
size_t bytesToRecv )
{
uint8_t * pIndex = NULL;
size_t bytesRemaining = bytesToRecv;
int32_t totalBytesRecvd = 0, bytesRecvd;
uint32_t entryTimeMs = 0U, elapsedTimeMs = 0U;
uint32_t lastDataRecvTimeMs = 0U, timeSinceLastDataWasRecvdMs = 0U;
TransportRecv_t recvFunc = NULL;
MQTTGetCurrentTimeFunc_t getTimeStampMs = NULL;
bool receiveError = false;
Expand All @@ -704,7 +711,8 @@ static int32_t recvExact( const MQTTContext_t * pContext,
recvFunc = pContext->transportInterface.recv;
getTimeStampMs = pContext->getTime;

entryTimeMs = getTimeStampMs();
/* Part of the MQTT packet has been read before calling this function. */
lastDataRecvTimeMs = getTimeStampMs();

while( ( bytesRemaining > 0U ) && ( receiveError == false ) )
{
Expand All @@ -719,8 +727,11 @@ static int32_t recvExact( const MQTTContext_t * pContext,
totalBytesRecvd = bytesRecvd;
receiveError = true;
}
else
else if( bytesRecvd > 0 )
{
/* Reset the starting time as we have received some data from the network. */
lastDataRecvTimeMs = getTimeStampMs();

/* It is a bug in the application's transport receive implementation
* if more bytes than expected are received. To avoid a possible
* overflow in converting bytesRemaining from unsigned to signed,
Expand All @@ -737,10 +748,14 @@ static int32_t recvExact( const MQTTContext_t * pContext,
( unsigned long ) bytesRemaining,
( long int ) totalBytesRecvd ) );
}
else
{
/* No bytes were read from the network. */
timeSinceLastDataWasRecvdMs = calculateElapsedTime( getTimeStampMs(), lastDataRecvTimeMs );
Copy link
Contributor

@yourslab yourslab Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think mapping a return value of 0 to mean no data to receive is an assumption that needs to be documented more clearly.

In POSIX sockets and OpenSSL, for example, a return value of 0 means the server has closed the connection or that a close-notify has been sent. No data to receive will actually return a negative error code. In our transport implementations, that error code is mapped appropriately to 0, but this is not going to be immediately obvious to other engineers who try to make their own transport implementations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I have updated that in the transport interface API doc.

}

elapsedTimeMs = calculateElapsedTime( getTimeStampMs(), entryTimeMs );

if( ( bytesRemaining > 0U ) && ( elapsedTimeMs >= timeoutMs ) )
if( ( bytesRemaining > 0U ) &&
RichardBarry marked this conversation as resolved.
Show resolved Hide resolved
( timeSinceLastDataWasRecvdMs >= MQTT_RECV_POLLING_TIMEOUT_MS ) )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I don't really like this name. Not saying it's not accurate, it's just long and I don't think the extra words are very meaningful. For example, something like timeSinceRecvMs still conveys the relevant information, but in a more concise manner.

I don't think the variable needs to be this explicit if it's clear from reading the rest of the code. Feel free to disagree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I like the recommendation. Have made the name concise.

{
LogError( ( "Time expired while receiving packet." ) );
receiveError = true;
Expand All @@ -760,7 +775,6 @@ static MQTTStatus_t discardPacket( const MQTTContext_t * pContext,
int32_t bytesReceived = 0;
size_t bytesToReceive = 0U;
uint32_t totalBytesReceived = 0U, entryTimeMs = 0U, elapsedTimeMs = 0U;
uint32_t remainingTimeMs = timeoutMs;
MQTTGetCurrentTimeFunc_t getTimeStampMs = NULL;
bool receiveError = false;

Expand All @@ -779,7 +793,7 @@ static MQTTStatus_t discardPacket( const MQTTContext_t * pContext,
bytesToReceive = remainingLength - totalBytesReceived;
}

bytesReceived = recvExact( pContext, bytesToReceive, remainingTimeMs );
bytesReceived = recvExact( pContext, bytesToReceive );

if( bytesReceived != ( int32_t ) bytesToReceive )
{
Expand All @@ -795,12 +809,8 @@ static MQTTStatus_t discardPacket( const MQTTContext_t * pContext,

elapsedTimeMs = calculateElapsedTime( getTimeStampMs(), entryTimeMs );

/* Update remaining time and check for timeout. */
if( elapsedTimeMs < timeoutMs )
{
remainingTimeMs = timeoutMs - elapsedTimeMs;
}
else
/* Check for timeout. */
if( elapsedTimeMs >= timeoutMs )
{
LogError( ( "Time expired while discarding packet." ) );
receiveError = true;
Expand Down Expand Up @@ -846,7 +856,7 @@ static MQTTStatus_t receivePacket( const MQTTContext_t * pContext,
else
{
bytesToReceive = incomingPacket.remainingLength;
bytesReceived = recvExact( pContext, bytesToReceive, remainingTimeMs );
bytesReceived = recvExact( pContext, bytesToReceive );

if( bytesReceived == ( int32_t ) bytesToReceive )
{
Expand Down
21 changes: 21 additions & 0 deletions source/include/core_mqtt_config_defaults.h
Original file line number Diff line number Diff line change
Expand Up @@ -108,6 +108,27 @@
#define MQTT_PINGRESP_TIMEOUT_MS ( 500U )
#endif

/**
* @brief The maximum duration between non-empty network reads while
* receiving an MQTT packet via the #MQTT_ProcessLoop or #MQTT_ReceiveLoop
* API functions.
*
* When an incoming MQTT packet is detected, the transport receive function
* may be called multiple times until all of the expected number of bytes of the
* packet are received. This timeout represents the maximum polling duration that
* is allowed without any data reception from the network for the incoming packet.
* If the timeout expires, the #MQTT_ProcessLoop and #MQTT_ReceiveLoop functions
* return #MQTTRecvFailed.
*
* <b>Possible values:</b> Any positive integer up to SIZE_MAX. Recommended to
* use a small timeout value. <br>
* <b>Default value:</b> `10`
*
*/
#ifndef MQTT_RECV_POLLING_TIMEOUT_MS
aggarw13 marked this conversation as resolved.
Show resolved Hide resolved
#define MQTT_RECV_POLLING_TIMEOUT_MS ( 10U )
#endif

/**
* @brief Macro that is called in the MQTT library for logging "Error" level
* messages.
Expand Down
14 changes: 14 additions & 0 deletions test/cbmc/include/core_mqtt_config.h
Original file line number Diff line number Diff line change
Expand Up @@ -70,4 +70,18 @@ struct NetworkContext
*/
#define MQTT_PINGRESP_TIMEOUT_MS ( 500U )

/**
* @brief The maximum duration of receiving no data over network when
* attempting to read an incoming MQTT packet by the #MQTT_ProcessLoop or
* #MQTT_ReceiveLoop API functions.
*
* When an incoming MQTT packet is detected, the transport receive function
* may be called multiple times until all the expected number of bytes for the
* packet are received. This timeout represents the maximum duration of polling
* for any data to be received over the network for the incoming.
* If the timeout expires, the #MQTT_ProcessLoop or #MQTT_ReceiveLoop functions
* return #MQTTRecvFailed.
*/
#define MQTT_RECV_POLLING_TIMEOUT_MS ( 3U )
aggarw13 marked this conversation as resolved.
Show resolved Hide resolved

#endif /* ifndef CORE_MQTT_CONFIG_H_ */
75 changes: 70 additions & 5 deletions test/unit-test/core_mqtt_utest.c
Original file line number Diff line number Diff line change
Expand Up @@ -132,6 +132,7 @@ typedef struct ProcessLoopReturns
MQTTStatus_t processLoopStatus; /**< @brief Return value of the process loop. */
bool incomingPublish; /**< @brief Whether the incoming packet is a publish. */
MQTTPublishInfo_t * pPubInfo; /**< @brief Publish information to be returned by the deserializer. */
uint32_t timeoutMs; /**< @brief The timeout value to call MQTT_ProcessLoop API with. */
} ProcessLoopReturns_t;

/**
Expand Down Expand Up @@ -365,6 +366,20 @@ static int32_t transportRecvOneByte( NetworkContext_t * pNetworkContext,
return 1;
}

/**
* @brief Mocked transport returning zero bytes to simulate reception
* of no data over network.
*/
static int32_t transportRecvNoData( NetworkContext_t * pNetworkContext,
void * pBuffer,
size_t bytesToRead )
{
( void ) pNetworkContext;
( void ) pBuffer;
( void ) bytesToRead;
return 0;
}

/**
* @brief Initialize the transport interface with the mocked functions for
* send and receive.
Expand Down Expand Up @@ -405,6 +420,7 @@ static void resetProcessLoopParams( ProcessLoopReturns_t * pExpectParams )
pExpectParams->processLoopStatus = MQTTSuccess;
pExpectParams->incomingPublish = false;
pExpectParams->pPubInfo = NULL;
pExpectParams->timeoutMs = MQTT_NO_TIMEOUT_MS;
}

/**
Expand Down Expand Up @@ -548,7 +564,7 @@ static void expectProcessLoopCalls( MQTTContext_t * const pContext,
}

/* Expect the above calls when running MQTT_ProcessLoop. */
mqttStatus = MQTT_ProcessLoop( pContext, MQTT_NO_TIMEOUT_MS );
mqttStatus = MQTT_ProcessLoop( pContext, pExpectParams->timeoutMs );
TEST_ASSERT_EQUAL( processLoopStatus, mqttStatus );

/* Any final assertions to end the test. */
Expand Down Expand Up @@ -853,7 +869,7 @@ void test_MQTT_Connect_partial_receive()

setupTransportInterface( &transport );
setupNetworkBuffer( &networkBuffer );
transport.recv = transportRecvOneByte;
transport.recv = transportRecvNoData;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to write a test to replicate the scenario which caught this issue:

  • revc returns less than requested data (possibly return n/2 bytes so that second call with return complete data).
  • processLoop is called with 0 timeout.

This test should fail without this change and pass with this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can achieve that with the existing mock recv function that returns only byte at a time.
As the bug being addressed is primarily with "less than expected data received" when the "remaining timeout" value is nearing zero, I can add a test to mimic that by passing 0 as the timeout to MQTT_ProcessLoop and using the transportRecvOneByte mock function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the test


memset( &mqttContext, 0x0, sizeof( mqttContext ) );
MQTT_Init( &mqttContext, &transport, getTime, eventCallback, &networkBuffer );
Expand All @@ -864,14 +880,16 @@ void test_MQTT_Connect_partial_receive()
incomingPacket.type = MQTT_PACKET_TYPE_CONNACK;
incomingPacket.remainingLength = 2;

/* Not enough time to receive entire packet, for branch coverage. This is due
* to the fact the mocked receive function reads only one byte at a time. */
timeout = 1;
/* Timeout in receiving entire packet, for branch coverage. This is due to the fact that the mocked
* receive function always returns 0 bytes read. */
MQTT_GetIncomingPacketTypeAndLength_ExpectAnyArgsAndReturn( MQTTSuccess );
MQTT_GetIncomingPacketTypeAndLength_ReturnThruPtr_pIncomingPacket( &incomingPacket );
status = MQTT_Connect( &mqttContext, &connectInfo, NULL, timeout, &sessionPresent );
TEST_ASSERT_EQUAL_INT( MQTTRecvFailed, status );

/* Update to use mock receive function that receives one byte at a time for the
* rest of the test. */
mqttContext.transportInterface.recv = transportRecvOneByte;
timeout = 10;

/* Not enough space for packet, discard it. */
Expand Down Expand Up @@ -1573,6 +1591,53 @@ void test_MQTT_ProcessLoop_handleIncomingPublish_Error_Paths( void )
TEST_ASSERT_FALSE( isEventCallbackInvoked );
}

/**
* @brief This test checks that the ProcessLoop API function is able to
* support receiving an entire incoming MQTT packet over the network when
* the transport recv function only reads less than requested bytes at a
* time, and the timeout passed to the API is "0ms".
*/
void test_MQTT_ProcessLoop_Zero_Duration_And_Partial_Network_Read( void )
{
MQTTStatus_t mqttStatus;
MQTTContext_t context;
TransportInterface_t transport;
MQTTFixedBuffer_t networkBuffer;
ProcessLoopReturns_t expectParams = { 0 };

setupNetworkBuffer( &networkBuffer );

transport.send = transportSendSuccess;

/* Set the transport recv function for the test to the mock function that represents
* partial read of data from network (i.e. less than requested number of bytes)
* at a time. */
transport.recv = transportRecvOneByte;

/* Initialize the context. */
mqttStatus = MQTT_Init( &context, &transport, getTime, eventCallback, &networkBuffer );
TEST_ASSERT_EQUAL( MQTTSuccess, mqttStatus );

/* Set flag required for configuring behavior of expectProcessLoopCalls()
* helper function. */
modifyIncomingPacketStatus = MQTTSuccess;

/* Test the ProcessLoop() call with zero duration timeout to verify that it
* will be able to support reading the packet over network over multiple calls to
* the transport receive function. */
expectParams.timeoutMs = MQTT_NO_TIMEOUT_MS;

/* Test with an incoming PUBLISH packet whose payload is read only one byte
* per call to the transport recv function. */
currentPacketType = MQTT_PACKET_TYPE_PUBLISH;
/* Set expected return values during the loop. */
resetProcessLoopParams( &expectParams );
expectParams.stateAfterDeserialize = MQTTPubAckSend;
expectParams.stateAfterSerialize = MQTTPublishDone;
expectParams.incomingPublish = true;
expectProcessLoopCalls( &context, &expectParams );
}

/**
* @brief This test case covers all calls to the private method,
* handleIncomingAck(...),
Expand Down