-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix warnings in tcp/udp UT #2744
Fix warnings in tcp/udp UT #2744
Conversation
Drv/Ip/SocketReadTask.cpp
Outdated
@@ -72,13 +72,15 @@ void SocketReadTask::readTask(void* pointer) { | |||
SocketIpStatus status = SOCK_SUCCESS; | |||
SocketReadTask* self = reinterpret_cast<SocketReadTask*>(pointer); | |||
do { | |||
self->m_task_lock.lock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense!
if (recv_thread) { | ||
//this->component.close(); | ||
if ((1 + i) == iterations) { | ||
this->component.m_task_lock.lock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These locks should be done in the shutdown and close methods, not just the UT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the mutex in the close/shutdown functions. I had to update the TcpServerTester.cpp
a bit for it to work: the client needs to be closed after the server is closed. I do not have the explanation for this behavior, but otherwise I was getting the following warning:
Connected to 127.0.0.1:57755 as a tcp client
Accepted client at 127.0.0.1:57755
[WARNING] Failed to recv from port with status -8 and errno 0
There was a comment stating: // Client must be closed first or the server risks binding to an existing address
. Let me know if it is a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also added the muted on the stop function. Can you confirm that it is something that is expected?
@LeStarch is there something we can do in the UT to check that no warnings are logged using During my tests, I used that warning messages to realize that something was wrong. I also used a timeout to make sure that the tests were not taking too long. It was especially useful for checking that the TcpServer tests were working properly. The test could take a few seconds on failure, where it's supposed to be less than 500ms in nominal conditions on my hardware. However, I don't think that's something that can/should be implemented, since it's hardware dependent. |
// Properly stop the client on the last iteration | ||
if ((1 + i) == iterations && recv_thread) { | ||
this->component.shutdown(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LeStarch Can you confirm that we do not need this shutdown
, since it's already part of the stop
function?
Keeping it is causing a timeout on my setup (when the UT is tasking more than 1s)
@LeStarch @thomas-bc Let me know if there is something else to change in this PR |
@LeStarch @thomas-bc I don't really understand where the problem is coming from on the Mac OS side and I can not reproduce the problem on my setup. Is it possible for you to have a look at this? |
@JohanBertrand yes we can! |
@LeStarch I fixed the merge conflict and merged devel, it should be good to be merged once the issue on Mac OS is resolved |
We have been working to get #2683 merged and are about to do so. In that PR we substantially rearchitect the mutex handling. I will note which changes need to be kept and which are likely superseded. |
Drv/Ip/IpSocket.cpp
Outdated
@@ -219,8 +219,13 @@ SocketIpStatus IpSocket::recv(U8* data, U32& req_read) { | |||
for (U32 i = 0; (i < SOCKET_MAX_ITERATIONS) && (size <= 0); i++) { | |||
// Attempt to recv out data | |||
size = this->recvProtocol(data, req_read); | |||
|
|||
if ((size == -1) && (errno == EAGAIN)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JohanBertrand @csmith608 this file's changes should be kept, as they improve error handling (
EAGAIN
is differentiated from EINTR
).
Drv/Ip/SocketReadTask.cpp
Outdated
@@ -51,34 +51,42 @@ SocketIpStatus SocketReadTask::open() { | |||
} | |||
|
|||
void SocketReadTask::shutdown() { | |||
this->m_task_lock.lock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JohanBertrand @csmith608 These likely need to be discarded as they affect locking.
Drv/Ip/SocketReadTask.hpp
Outdated
@@ -171,7 +171,7 @@ class SocketReadTask { | |||
Os::Task m_task; | |||
bool m_reconnect; //!< Force reconnection | |||
bool m_stop; //!< Stops the task when set to true | |||
|
|||
Os::Mutex m_task_lock; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JohanBertrand @csmith608 These likely need to be discarded as they affect locking.
@@ -106,7 +106,7 @@ void UdpTester::test_with_loop(U32 iterations, bool recv_thread) { | |||
} | |||
} | |||
// Properly stop the client on the last iteration | |||
if ((1 + i) == iterations && recv_thread) { | |||
if (((1 + i) == iterations) && recv_thread) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JohanBertrand @csmith608 keep, code clean-up
// Properly stop the client on the last iteration | ||
if ((1 + i) == iterations && recv_thread) { | ||
this->component.shutdown(); | ||
if (((1 + i) == iterations) && recv_thread) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JohanBertrand @csmith608 keep, code clean-up
@@ -88,15 +88,16 @@ void TcpServerTester ::test_with_loop(U32 iterations, bool recv_thread) { | |||
while (not m_spinner) {} | |||
} | |||
} | |||
client.close(); // Client must be closed first or the server risks binding to an existing address | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JohanBertrand @csmith608 Review with upcoming change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears I already changed where this occurs in my latest changes, although I can't remember exactly why. https://github.com/nasa/fprime/blob/devel/Drv/TcpServer/test/ut/TcpServerTester.cpp#L91-L103
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the position of client.close()
in the last commit, after updating the branch with the changes from devel
.
From what I have seen in my different tests, if client.close()
is called before this->component.close();
, there are some cases where ::accept(serverFd, nullptr, nullptr);
in Drv/Ip/TcpServerSocket.cpp
(See: https://github.com/nasa/fprime/blob/devel/Drv/Ip/TcpServerSocket.cpp#L110) will hang forever.
It usually happened about once every 1600 CI runs.
I just ran the UT for 100.000 times on my machine with the changes in this PR, without failures on the CI, timeouts or hang.
this->component.stop(); | ||
this->component.join(); | ||
} else { | ||
this->component.close(); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JohanBertrand @csmith608 Review with upcoming change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about this placement already being changed, so I'm going to leave as is unless someone has thoughts
@LeStarch The branch has been updated with devel. I can see that the messages logged through I used it in this PR for checking that no Edit: After manually enabling print in the UT, I am still getting a lot of Out of curiosity, did you @csmith608 investigate this in your previous PR? |
Hey @JohanBertrand, these unit tests fail on Mac due to the first recv after the send sometimes getting an EAGAIN rather than the actual data. This issue does not appear on Linux. |
I am going to merge this, as it seems to correct the UT issues, is what I expect. Any other issues should become a new issue, or PR. |
Change Description
I am not sure we want to have those changes in the codebase. That could be used as a baseline for a better update.
Most of the warnings in the UT with TCP/UDP module should be fixed with those changes. A mutex is making sure that the receive/open/startup functions are called only if the stop/close function are not going to be called in an other thread.
I added the changes in the unit tests instead of the class itself (in open/close), because I did not find a way to make it work properly.
I also needed to make
recv
to stop iterating iferrno == EAGAIN
to avoid a timeout. I'm not sure if that's the proper way to fix this issue.The only warning still happening for me, after 10.000 tests, is the following (happened only once):
I feel like the warnings in the test should make the UT to fail. However, I am not sure how to catch the warning streamed through
Fw::Logger::logMsg
.Also fix an error in
StringUtils.cpp
when compiled on ubuntu20.04 with a FW_ASSERT: