lv_task_handler can fail to return in high load scenarios, causing watchdog resets #2124

Open
mark9064 opened this issue Sep 18, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@mark9064
Member

The LVGL task handler re-evaluates higher priority tasks after executing a lower priority one.
The animation task has a higher priority than the refresh tasks.
If either the animation task or the refresh task takes a long time, the animation task may be ready again by the time the refresh task finishes.
This causes lv_task_handler to execute animation -> refresh -> animation -> refresh indefinitely, without ever returning.
If lv_task_handler doesn't return, the DisplayApp message queue fills up (usually with touchscreen events) and blocks SystemTask, causing watchdog resets.
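
To make the suspected mechanism concrete, here is a minimal sketch of a handler that restarts its scan from the highest priority task after every execution. It is a simplified model on a simulated clock, not LVGL's actual code: `sim_task_t`, `sim_restarting_handler`, the tick values, and the assumption that `last_run` is recorded when a run starts are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simulated task; this is not an LVGL type. */
typedef struct {
    uint32_t period;   /* minimum ticks between two runs */
    uint32_t cost;     /* simulated ticks that one run takes */
    uint32_t last_run; /* tick at which the previous run started */
} sim_task_t;

/*
 * tasks[] is ordered highest priority first. After any task runs, the scan
 * restarts from the top. Returns the number of executions performed before
 * the handler would return, or max_runs if ready tasks were still being
 * found when the safety cap was reached.
 */
uint32_t sim_restarting_handler(sim_task_t *tasks, size_t count,
                                uint32_t *now, uint32_t max_runs)
{
    assert(tasks != NULL && now != NULL);
    uint32_t runs = 0;

    while (runs < max_runs) {
        int ran = 0;
        for (size_t i = 0; i < count; i++) {
            if (*now - tasks[i].last_run >= tasks[i].period) {
                tasks[i].last_run = *now; /* assumption: stamped at start */
                *now += tasks[i].cost;    /* running the task takes time */
                runs++;
                ran = 1;
                break;                    /* rescan from highest priority */
            }
        }
        if (!ran) {
            return runs; /* nothing ready: the handler returns */
        }
    }
    return runs; /* cap reached: the real handler would still be looping */
}
```

In this model, two tasks with period 10 and cost 2, started at tick 10, let the handler return after 2 executions. Raising both costs to 6 makes each task ready again by the time the other finishes, so the handler runs until the cap is hit.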

This happens very rarely, if ever, in standard InfiniTime. However, the lockup can be reproduced by decreasing LV_DISPLAY_REFR_PERIOD and increasing the animation load (e.g. the scroll animation speed for labels, in px/s). I have not tested whether both changes are required to trigger this behaviour, but I know that together they are sufficient.

I'd suggest that we replace the LVGL task handler with one that provides a hard upper bound on the number of task executions. One option is to execute each task at most once per call and exit early if the handler has already spent too long doing work (for example when many tasks are waiting). This would allow task priorities to still be usefully enforced. Lower priority refresh tasks might still get starved by animations though.
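
A rough sketch of what such a bounded handler could look like, again on a simulated clock rather than against the real LVGL API. `bounded_task_t`, `bounded_handler`, and the `budget` parameter are hypothetical names, and recording `last_run` at the start of a run is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simulated task; this is not an LVGL type. */
typedef struct {
    uint32_t period;   /* minimum ticks between two runs */
    uint32_t cost;     /* simulated ticks that one run takes */
    uint32_t last_run; /* tick at which the previous run started */
} bounded_task_t;

/*
 * Single pass over tasks[] (highest priority first). Each task runs at most
 * once per call, and the pass stops early once `budget` ticks have been
 * spent. The number of executions is therefore bounded by `count`.
 */
uint32_t bounded_handler(bounded_task_t *tasks, size_t count,
                         uint32_t *now, uint32_t budget)
{
    assert(tasks != NULL && now != NULL);
    const uint32_t start = *now;
    uint32_t runs = 0;

    for (size_t i = 0; i < count; i++) {
        if (*now - start >= budget) {
            break; /* already spent too long: let the caller regain control */
        }
        if (*now - tasks[i].last_run >= tasks[i].period) {
            tasks[i].last_run = *now; /* assumption: stamped at start */
            *now += tasks[i].cost;    /* running the task takes time */
            runs++;
        }
    }
    return runs;
}
```

With two tasks of period 10 and cost 6 starting at tick 10, a budget of 100 ticks lets both run once and the call returns. With a budget of 5 ticks only the higher priority task runs, which illustrates the starvation risk for lower priority tasks.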

Possible cause for #2012 (the notification header scrolling animation)

@mark9064
Member Author

Looking at LVGL main/master, they've replaced the scheduler with one that does have bounded execution time (provided timers/tasks are not being created or destroyed). Task priority has also been removed at some point. Both seem like sensible decisions. Removing priority from our version is out of scope, but I think a scheduler that ignores it will be fine. We should really just bite the bullet on upgrading LVGL at some point.

@mark9064
Member Author

Even after locally changing the scheduler to a bounded-time one, I can still trigger lockups. So my original theory for the mechanism must be incorrect; there must be some other way for the tasks to take a long time to run. The animation code in LVGL isn't too easy for me to read.

@mark9064 mark9064 added the bug Something isn't working label Nov 18, 2024