Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wifi: eswifi: fix random crash due to work->handler set to NULL #50153

Merged

Conversation

mzajac-avs
Copy link
Contributor

@mzajac-avs mzajac-avs commented Sep 12, 2022

This fixes the random crash on L475 disco iot1 board described here: #49963

@zephyrbot zephyrbot added the area: Wi-Fi Wi-Fi label Sep 12, 2022
@mzajac-avs mzajac-avs changed the title WIFI: eswifi: fix random crash due to work->handler set to NULL wifi: eswifi: fix random crash due to work->handler set to NULL Sep 12, 2022
@mzajac-avs mzajac-avs force-pushed the l475_fix_random_crash branch from 51360aa to 4e14e90 Compare September 12, 2022 11:50
@@ -397,7 +397,7 @@ static int eswifi_off_put(struct net_context *context)
}

if (--socket->usage <= 0) {
memset(socket, 0, sizeof(*socket));
socket->context = NULL;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That certainly works, but wouldn't it be possible to cancel any ongoing/scheduled work instead? I mean once socket is 'released' it's no more assigned to a context, and can be reused at any time, so we don't want anything to access it anymore after this line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I'm not sure did I try to do this correctly but I added a static struct k_work_sync work_sync to the file and called k_work_cancel_delayable_sync(&socket->read_work, &work_sync); before the memset(). I tired it a few times but it seems that the execution of the application frozen each time (but there was no crash). The stack showed that I'm in arch_cpu_idle () at /home/mzajac/Repo/Internal/zephyr-root/zephyr/arch/arm/core/aarch32/cpu_idle.S:105

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@loicpoulain Is it fine for you then ?

Copy link
Collaborator

@loicpoulain loicpoulain Oct 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erwango yes, though it's more a workaround than a fix. but better than nothing.

loicpoulain
loicpoulain previously approved these changes Oct 14, 2022
nandojve
nandojve previously approved these changes Oct 14, 2022
@erwango
Copy link
Member

erwango commented Oct 14, 2022

@mzajac-avs Please check your commit message. It should conform to the following template:

<dir>: <dir>: <Title summarizing the change>

<A description useful to the reader>

Signed-off-by: foo <foo@bar.baz>

For references, I'd advise you to check:

This fixes a random crash caused by race condition in the eswifi
driver used by the disco L475 iot1 board.

Signed-off-by: Michał Zając <m.zajac@avsystem.com>
@mzajac-avs mzajac-avs dismissed stale reviews from nandojve and loicpoulain via 172b852 October 18, 2022 15:32
@mzajac-avs mzajac-avs force-pushed the l475_fix_random_crash branch from 4e14e90 to 172b852 Compare October 18, 2022 15:32
@mzajac-avs
Copy link
Contributor Author

@erwango Fixed the commit message

@erwango
Copy link
Member

erwango commented Oct 19, 2022

@loicpoulain, @nandojve can you approve again? Your approval got lost.

@loicpoulain loicpoulain self-requested a review October 19, 2022 09:06
@carlescufi carlescufi merged commit f93c682 into zephyrproject-rtos:main Oct 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants