Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs/fcb: FCB has some async issues, fixing some obvious ones #3382

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

vrahane
Copy link
Contributor

@vrahane vrahane commented Mar 6, 2025

We are seeing some logging issues related to rotates, on analyzing the FCB code, it seems not protecting FCB access using the mutex might be one of the reasons and causes.

@github-actions github-actions bot added the size/s label Mar 6, 2025
@sjanc sjanc self-assigned this Mar 10, 2025
Copy link
Contributor

@sjanc sjanc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

To avoid nested locking I'd suggest to introduce _nolock private variants of functions that are used under mutex similar to how fcb_getnext_nolock() and fcb_getnext() are handled.

and commit message could be a bit more descriptive, eg

fs/fcb: Fix FCB  async issues

FCB  API can be called from multiple threads. Add missing mutex locking.


rc = os_mutex_pend(&fcb->f_mtx, OS_WAIT_FOREVER);
if (rc && rc != OS_NOT_STARTED) {
return FCB_ERR_ARGS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to note in header file that negative value means error and not that it is not empty (non-zero value)

(alternative would be to make this int fcb_is_empty(struct fcb *fcb, bool *empty) for more explicit error vs return value

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using the return code is good enough, will just update the header to include the updated return code.

if (rc && rc != OS_NOT_STARTED) {
return FCB_ERR_ARGS;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is fcb_getnext_nolock() which could be used in loop below to avoid nested locking


rc = os_mutex_pend(&fcb->f_mtx, OS_WAIT_FOREVER);
if (rc && rc != OS_NOT_STARTED) {
return FCB_ERR_ARGS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to note in header file that negative value means error (or make cnt output parameter similar to suggestion in fcb_is_empty()) (either is fine for me)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update the header file with the description.

rc = os_mutex_pend(&fcb->f_mtx, OS_WAIT_FOREVER);
if (rc && rc != OS_NOT_STARTED) {
return FCB_ERR_ARGS;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loop below could use _nolock() variants to avoid nested locking

Copy link
Contributor Author

@vrahane vrahane Mar 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think there is any issue with nested locking of a mutex, it uses t_lockcnt from os_task. Maybe more instructions to lock it which is minimal. So, it is either that or more code for nested locking variants, I choose less code.

@vrahane vrahane requested a review from sjanc March 12, 2025 19:24
@vrahane
Copy link
Contributor Author

vrahane commented Mar 12, 2025

@sjanc I have updated the PR addressing some review comments. Please take a look.

fs/fcb/src/fcb.c Outdated

rc = 0;
while (!fcb_is_empty(fcb)) {
Copy link
Contributor

@sjanc sjanc Mar 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so if there is no fsb_is_empty_nolock() this should handle error checking from fsb_is_empty()
same for fcb_free_sector_cnt() and other that had mutex added if those are used internally

I think overall adding _nolock() would make this simpler

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I understand what you mean for fcb_is_empty() but for fcb_free_sector_cnt() the fcb_getnext_area() gets called which is already a nolock version. There was no protection for it at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added fcb_is_empty_nolock()

@vrahane vrahane force-pushed the vipul/async_fcb_fixes branch 2 times, most recently from e334538 to 1235f52 Compare March 13, 2025 20:01
@vrahane vrahane requested a review from sjanc March 13, 2025 20:02
@vrahane vrahane force-pushed the vipul/async_fcb_fixes branch 2 times, most recently from fb48919 to 3e5b0bf Compare March 13, 2025 20:30
@vrahane vrahane force-pushed the vipul/async_fcb_fixes branch from 3e5b0bf to ed67e03 Compare March 13, 2025 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants