
zed: a lock-based pickle indeed #11965

Merged 1 commit into openzfs:master on May 7, 2021

Conversation

@nabijaczleweli (Contributor) commented Apr 28, 2021

Motivation and Context

#11963

Description

Lock around fork() and in the SIGCHLD handler. The latter should always lock as soon as a child dies and, if Linux does in fact re-use zombie PIDs, always come before the next fork().

But if this doesn't work then I genuinely don't know if this is possible under UNIX or if it's just always gonna race like this. See commit message.
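
For illustration, a minimal sketch of the SIGCHLD side of that idea, assuming a dedicated reaper thread that waits for the signal with sigwaitinfo() and takes the shared lock the moment a child dies; the lock and function names are made up for the example, not the actual zed identifiers:

```c
#include <pthread.h>
#include <signal.h>

extern pthread_mutex_t fork_lock;       /* also held around fork() by the launcher */

static void *
reaper_thread(void *arg)
{
    sigset_t set;

    (void) sigemptyset(&set);
    (void) sigaddset(&set, SIGCHLD);    /* SIGCHLD is blocked process-wide elsewhere */

    for (;;) {
        if (sigwaitinfo(&set, NULL) != SIGCHLD)
            continue;
        /* Lock as soon as a child dies, i.e. before the next fork(). */
        (void) pthread_mutex_lock(&fork_lock);
        /* ... wait4() the dead child and drop its table entry here ... */
        (void) pthread_mutex_unlock(&fork_lock);
    }

    return (arg);
}
```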

How Has This Been Tested?

It wasn't, really: it runs, but that's about it. See commit message.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@behlendorf added the "Status: Code Review Needed" label (ready for review and testing) Apr 28, 2021
@nabijaczleweli force-pushed the rickle-pick branch 2 times, most recently from 19a69d4 to 96f3b75, May 1, 2021 14:27
@nabijaczleweli (Contributor, Author) commented May 1, 2021

Rebased, replaced with a solution that I tried, understood, and tested. @don-brady mind giving'er a spin?

If you can afford to do it, setting kernel.pid_max to something hilarious like 400 greatly exacerbates this; I hit it every time within minutes, if that.

This can be very easily triggered by adding a sleep(1) before
the wait4() on a PID-starved system: the reaper thread would wait
for a child before its entry appeared, letting old entries accumulate:
  Invoking "all-debug.sh" eid=3021 pid=391
  Finished "(null)" eid=0 pid=391 time=0.002432s exit=0
  Invoking "all-syslog.sh" eid=3021 pid=336
  Finished "(null)" eid=0 pid=336 time=0.002432s exit=0
  Invoking "history_event-zfs-list-cacher.sh" eid=3021 pid=347
  Invoking "all-debug.sh" eid=3022 pid=349
  Finished "history_event-zfs-list-cacher.sh" eid=3021 pid=347
                                              time=0.001669s exit=0
  Finished "(null)" eid=0 pid=349 time=0.002404s exit=0
  Invoking "all-syslog.sh" eid=3022 pid=370
  Finished "(null)" eid=0 pid=370 time=0.002427s exit=0
  Invoking "history_event-zfs-list-cacher.sh" eid=3022 pid=391
  avl_find(tree, new_node, &where) == NULL
  ASSERT at ../../module/avl/avl.c:641:avl_add()
  Thread 1 "zed" received signal SIGABRT, Aborted.

By employing this wider lock, we atomise [wait, remove] and [fork, add]:
slowing down the reaper thread now just causes some zombies
to accumulate until it can get to them

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes openzfs#11963
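
The commit message's "wider lock" atomises [fork, add] against [wait, remove]. A minimal sketch of the [fork, add] half, with illustrative names (the mutex, tree wrapper, and launcher below are assumptions for the example, not the actual zed identifiers):

```c
#include <pthread.h>
#include <unistd.h>

extern pthread_mutex_t launched_lock;   /* assumed: guards the PID tree */
extern void pid_tree_add(pid_t pid);    /* assumed: wraps avl_add()     */

static pid_t
launch_zedlet(char *const argv[], char *const envp[])
{
    pid_t pid;

    (void) pthread_mutex_lock(&launched_lock);
    pid = fork();
    if (pid == 0) {
        (void) execve(argv[0], argv, envp);
        _exit(127);                     /* exec failed */
    }
    if (pid > 0)
        pid_tree_add(pid);              /* entry exists before any reap can race */
    (void) pthread_mutex_unlock(&launched_lock);

    return (pid);
}
```

Holding the lock across both fork() and the tree insertion is what makes the pair atomic with respect to the reaper; the matching reaper side is sketched further down.
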
@behlendorf (Contributor) left a comment


@nabijaczleweli thanks for running this down.

@don-brady (Contributor) commented:

We tried working around this by launching zed with -j 1 and that was still hitting the assert. I have yet to test with this change, but I wanted to understand: why would having jobs set to 1 still allow asynchronous execution?

@behlendorf added the "Status: Accepted" label (ready to integrate: reviewed, tested) and removed the "Status: Code Review Needed" label May 7, 2021
@nabijaczleweli (Contributor, Author) commented May 7, 2021

-j guarantees that at any given point j_N - ∑(fork() ≠ -1) + ∑(wait4() ∉ {-1, 0}) >= 0, as documented.
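
A sketch of that accounting, assuming a simple counter plus condition variable (names are illustrative, not the actual zed implementation): a slot is taken before each successful fork() and returned after each successful wait4(), so the invariant above holds while the ordering of the tree operations is untouched:

```c
#include <pthread.h>

static pthread_mutex_t limit_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  limit_cv   = PTHREAD_COND_INITIALIZER;
static unsigned int    limit;     /* starts at the -j value (j_N) */

static void
limit_take(void)                  /* called before fork(); blocks while limit == 0 */
{
    (void) pthread_mutex_lock(&limit_lock);
    while (limit == 0)
        (void) pthread_cond_wait(&limit_cv, &limit_lock);
    limit--;
    (void) pthread_mutex_unlock(&limit_lock);
}

static void
limit_return(void)                /* called after wait4() reaps a child */
{
    (void) pthread_mutex_lock(&limit_lock);
    limit++;
    (void) pthread_cond_signal(&limit_cv);
    (void) pthread_mutex_unlock(&limit_lock);
}
```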

The reaping methodology is unaffected, and the sequence in that case remains limit>0 => fork()=A => A:exit_group() => reaper:SIGCHLD => reaper:lock => limit-=1 => lock:sleep => reaper:wait4()=A => reaper:find(A)=NULL => reaper:unlock => limit+=1 => lock:wake => add(A) => unlock. This can then repeat with another child with PID of A, panicking on the add(A) step, and, since the limit ends at the same value it started at, you can see how it plays no role in this.

The updated sequence would look like limit>0 => lock => fork()=A => A:exit_group() => reaper:SIGCHLD => reaper:lock:sleep() => add(A) => unlock => limit-=1 => reaper:lock:wake => reaper:wait4()=A => reaper:find(A)=A* => reaper:unlock => limit+=1.
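
For completeness, a sketch of the reaper side under that same lock (again with illustrative names, not the actual zed identifiers): wait4() and the tree lookup/removal now form one unit, so by the time a PID can be handed out again its old entry is already gone, and find(A) can no longer come back NULL for a child that was just launched:

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <pthread.h>

extern pthread_mutex_t launched_lock;        /* same lock held across fork()+add */
extern void *pid_tree_find(pid_t pid);       /* assumed: wraps avl_find()   */
extern void pid_tree_remove(void *entry);    /* assumed: wraps avl_remove() */

static void
reap_children(void)
{
    struct rusage usage;
    int status;
    pid_t pid;

    for (;;) {
        (void) pthread_mutex_lock(&launched_lock);
        pid = wait4(-1, &status, WNOHANG, &usage);
        if (pid <= 0) {
            (void) pthread_mutex_unlock(&launched_lock);
            break;                           /* nothing reapable right now */
        }
        /* Added under the same lock, so the entry must be present. */
        pid_tree_remove(pid_tree_find(pid));
        (void) pthread_mutex_unlock(&launched_lock);
    }
}
```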

@behlendorf merged commit 3bd6b0e into openzfs:master May 7, 2021
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request May 10, 2021
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021