Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid or reduce shutdown costs #26

Open
davidlattimore opened this issue Aug 12, 2024 · 12 comments
Open

Avoid or reduce shutdown costs #26

davidlattimore opened this issue Aug 12, 2024 · 12 comments

Comments

@davidlattimore
Copy link
Owner

          > Depending on how fast hashing is, hashing the output might still be OK if it can be done in parallel with closing all the input files (which is surprisingly slow - run the linker with --time to see).

Unrelated to this issue, but have you considered any of:

  • Using io-uring to close files async?
  • Closing them concurrently from a thread pool?
  • What if we leave them open and let the OS clean up on process exit?
  • In the readme you also mention mold actually does work in a forked background process to avoid cleanup blocking the user, this could also be used by wild to deal with this i presume.

(Sorry to hijack the issue, but those ideas just came up as solutions to that subproblem.)

Originally posted by @VorpalBlade in #15 (comment)

Let's move discussion to a new issue, since I'd like to keep that other issue as a good first issue for people to work on.

I've tried a couple of those things. In particular I've tried unmapping files from separate threads and also just leaving them and letting the OS clean them up on exit. Neither seemed to make any difference.

In the case of using multiple threads, I think the issue is that munmap acquires a per-process kernel lock, so the threads end up waiting for each other anyway.

In the case of just leaving things for shutdown, I'm less sure, but I think process termination causes the kernel to do the same unmapping. So termination ends up taking longer. This is somewhat surprising and also feels like the kind of thing that could change with future changes in the Linux kernel, so should probably be revisited at some point.

I haven't tried using io-uring to munmap files. Is that possible? Given the above results, I'd be surprised if this were a faster way to close files.

I have considered using io-uring for all file IO. It'd be a pretty big change from how things currently work though and I suspect might make performance worse rather than better, although that's just a hunch. My reason for suspecting that, is that AFAIK, io_uring will copy the data, even if it's in cache, whereas mmap can reuse the pages of the cache and only needs to update the page tables of the process.

The forking trick would likely work. I haven't done it as yet, but there's a moderate chance that I will eventually.

@VorpalBlade
Copy link

You are almost certainly right about io-uring vs mmap, since you said close not unmap in the other issue I misunderstood what you were doing.

Then it sounds like the fork and letting a background process do the actual work is the best way to go. (That actually sounds like a trick I myself might have use for in a program of mine. It should be more widely publicised as a trick.)

@andrewdavidmackenzie
Copy link
Contributor

Focusing on using the fork of a child method to do the linking work, and having the parent exit as soon as the output file is written:

I imagine it would be desirable to have a --fork/--nofork command line option to be able to enable/disable this behaviour, combined with a default of "nofork" until proven reliable and no indesireable side-effects, then switch to "fork" be the default after that?

Parent should watch for death of a child signal, so that any unexpected exit (panic) by the child, or uncontrolled exit path, would not leave the parent hanging.

Any other thoughts on that particular feature?

@andrewdavidmackenzie
Copy link
Contributor

Considering options on how to "fork" the sub-process

  • Use libc directly to fork: This is already a dependency via a few crates, so no additional dependency added. Is unsafe. Experiments shows child is not producing stdout :-( I think we'd need to roll our own stuff to do things like Command does...
  • Use some crate like nix to do a fork. This would add a new dependency.
  • Or a smaller fork crate. This would add a new dependency.
  • Use Command (which does a fork then an exec). We would have to introduce a new binary into the build and run it from the main binary via Command. This may introduce some overhead vs just a fork? The child can inherit file descriptors I think (needed if a named pipe comms method between parent and child were chosen)

After looking at them all, I sent a WIP PR using fork(): #192

@davidlattimore
Copy link
Owner Author

Thanks for working on this @andrewdavidmackenzie. Regarding the Command option, you wouldn't need a separate binary, you just need to make sure that you set some flag or environment variable so that the subprocess knows that its the subprocess and doesn't fork again (fork bombs are bad).

@andrewdavidmackenzie
Copy link
Contributor

That's true, it could "exec" itself...

@VorpalBlade
Copy link

Considering options on how to "fork" the sub-process [...]

Nix is also unsafe (for the same reason as libc: fork when multi-threaded is complex). The ni fork appears to be a thin wrapper around libc.

It looks to me like the fork crate should be unsafe but isn't. Fishy. It too is a thin wrapper around libc.

Either way, I don't think the unsafe is a big issue for wild in this case. The fork will presumably happen as early as possible, at which point the parent process will still be single threaded. After the point of fork, either process can do whatever it wants with threads.

I unfortunately have no clear idea into what might be going on with stdio. It should work from a child, as long as stdio isn't locked at the moment of forking. Does wild lock stdio once early and run with it? That might cause issues, it would be better to do that after the fork.

@andrewdavidmackenzie
Copy link
Contributor

I imagine Fork crate just has an unsafe {} inside the function, then public function doesn't need to be unsafe.

@davidlattimore
Copy link
Owner Author

I imagine Fork crate just has an unsafe {} inside the function, then public function doesn't need to be unsafe.

Yeah, but I think what @VorpalBlade is questioning is whether it's legitimate for them to mark the function safe. I can see an argument for it being safe. Forking when threads have been started can lead to deadlock. I'm not aware of a way for it to lead to arbitrary undefined behaviour. But maybe there is. If nothing else, it'd be good to have a comment justifying why it isn't marked unsafe.

@andrewdavidmackenzie
Copy link
Contributor

andrewdavidmackenzie commented Nov 23, 2024

I've discussed this before with others.

My thought was that if a low-level bit of code is unsafe (e.g It calls some unsafe function in libc), and the "rule" is to mark the function containing it unsafe....then what about the functions calling that unsafe function?

If you apply the rule consistently, unsafe will percolate up and you will end up with main() being unsafe.

My approach was to make the unsafe scope as small as possible...

But, I think it's debatable....

Maybe the rule could be to percolate up to a crate API?

@davidlattimore
Copy link
Owner Author

It's not that anything that uses unsafe should itself be marked as unsafe. It's more that your function should be unsafe unless you can be sure that you're upholding all of the safety requirements no matter how users of your function call it.

@VorpalBlade
Copy link

VorpalBlade commented Nov 24, 2024

https://www.man7.org/linux/man-pages/man3/fork.3p.html (note the 3p, indicating this is from the POSIX standard). If you search the page for "undefined" there are several hits. Some of them seem to be about atfork handlers, but not all.

https://www.man7.org/linux/man-pages/man2/fork.2.html says

After a fork() in a multithreaded program, the child can safely call only async-signal-safe functions (see signal-safety(7)) until such time as it calls execve(2).

This doesn't say that doing otherwise is undefined, but it also doesn't say what would happen if you do break that rule.

https://www.man7.org/linux/man-pages/man7/signal-safety.7.html only uses wording like "unpredictable results". (But this is a Linux man page, not a POSIX one)

Given that arbitrary signal handlers are generally considered unsafe in Rust, and that this is adjecent to that minefield I'm not comfortable with the idea of just reexposing fork as safe without explaining why it safe in this specific use. That could be "this application is currently single threaded" (which is a decision a library can never make), or "we only execute async signal safe functions after the fork" (which precludes arbitrary caller provided code from being executed).

Note that the standard library also take this stance: CommandExt::pre_exec is unsafe.

As such I believe the fork library to be unsound.

EDIT: See also https://internals.rust-lang.org/t/why-no-fork-in-std-process/13770/4 and the transitivly linked discussions.

@andrewdavidmackenzie
Copy link
Contributor

andrewdavidmackenzie commented Nov 24, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants