Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a deadlock in syscallbuf unmapping after vfork #3826

Conversation

KJTsanaktsidis
Copy link
Contributor

This should be 🤞 a fix for #3807

It does two things to address that issue:

Firstly, moves the code for doing syscallbuf unmap after vfork/exec out of Task::post_exec, and instead defers doing that until we next run a task in that address space. This means we should never leak address space just because there were no available threads to perform the unmapping. This doesn't actually do anything to technically fix the issue, but it means we can write some reliable tests around the syscallbuf unmapping stuff, so it seemed worthwhile.

Secondly, we now look at the desched signal state in AutoRemoteSyscalls. If the desched signal is armed, we temporarily disarm it (and mask it too, in case the signal was pending but unprocessed). This is what should fix our issue, I think

It's hard to prove that this really has fixed the issue, but I left my reproduction script running for 12 hours last night and it didn't hang, and that had always been long enough to trigger the hang before. So I guess either this fixes the issue, or I got (un)lucky.

I'm sure my C++ isn't going to win any beauty contests here but hopefully this demonstrates the outlines of a fix.

If two processes share an address space (via vfork(2) or clone(2)
CLONE_VM), and one processes calls execve(2), the process's syscallbufs
will remain mapped in the shared address space. At the moment, we
attempt to unmap this after execve(2) by finding a stopped process in
the shared address space, and using it to do the unmapping. If there is
no such process, though, the buffers are leaked.

This patch changes that situation to ensure that such syscallbufs are
unmapped in all circumstances. This is done by recording that such
unmapping must happen after execve(2), but deferring the unmapping until
a suitable process is scheduled and in a state to perform the unmapping.
If we're running on a thread with a desched event enabled, we can enter
an infinite loop where stepping through the syscall infinitely
retriggers the desched event, and thus the syscall can never complete.
Copy link
Collaborator

@rocallahan rocallahan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really excellent work in a very difficult area. Bravo!

<< "previously exec'd processes";
AutoRemoteSyscalls remote(this);
std::vector<MemoryRange> regions_pending_unmap;
std::swap(regions_pending_unmap, as->regions_pending_unmap);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a move, but it's fine.

@rocallahan rocallahan merged commit ceeff12 into rr-debugger:master Sep 17, 2024
5 checks passed
@rocallahan
Copy link
Collaborator

Honestly, really extraordinary work.

@KJTsanaktsidis
Copy link
Contributor Author

Thank you! I really appreciated your pointing in the right direction. This was definitely a fun exercise and I learned quite a bit about how rr works. And brushed off a bit of my C++ too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants