-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add DocFS layer to rustdoc #60971
Add DocFS layer to rustdoc #60971
Conversation
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
7f52e24
to
cc11f7b
Compare
I think this is complete now; if it fails CI obviously I'll fix it, but barring review feedback, I have no further planned changes. |
src/librustdoc/docfs.rs
Outdated
Err(e) => { | ||
// In principle these should get displayed at the top | ||
// level, but just in case, send to stderr as well. | ||
eprintln!("\"{}\": {}", path.display(), e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just like your comment stated, isn't it a duplicate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I'm not 100% sure. I'd hate for a user to have a bad experience tracking down a problem because we failed to report an error reliably. The cost is perhaps showing the error twice. The benefit is increased surity it is delivered. A different code structure where e.g. we actually collected the results rather than tossing them into the void and letting the threadpool infra collect panics would be preferrable, but that will require considerably more work.
Ultimately though, I'm not going to get the bug reports from users that wonder why things are not reported, if in fact there are any reporting issues, so, if you want the eprintln! removed, just say so and I'll remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about storing all those errors in a struct that is created outside the threads and just return from the current thread. Then print all the stored errors. Wouldn't it be a bit better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really know enough about how and where the rustdoc internals are used to know what assumptions are safe to make in that regard. Is librustdoc used as a library, or is it really just the used by the CLI? Are there other entry points in, or is gathering in render appropriate? We could use an mpsc to gather feedback from threads and examine that in run()?
That could be done in a follow up patch too though, it doesn't need to be done in this iteration.
TBH, I'd like to hand this over to someone interested in maintaining the code: it was mainly a drive by test to see if the lessons learnt in rustup were applicable to rustdoc based on a twitter conversation: they appear to be so. I've polished it to the point where I believe that its correct - it should not violated any existing assumptions in the code (e.g. the flock requirements for shared file writing), and I believe that this patch as-is is unlikely to create new bug reports for the team). Equally I'm sure that many more things can be done to it to make it much better - such as gathering the results of the IO calls into a mpsc queue or similar. Further profiling would also allow identifying how much of the remaining time is compiler performance and so on; a minute is a long time - unpacking the entire rust std docs is only 17 seconds or so on this machine with a very similar approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll make a few changes on your branch then if you don't mind. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do, I'd hate the effort of prototyping to be wasted, but you're in a much better position to assess the interactions raised here, unless I find a chunk of time I just don't have right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's very nice! Just a few questions but otherwise I think we're good to go!
I've toggled on 'allow edits from maintainers', (its night time here, rather than waiting a day for me if I've gone to sleep etc etc).so if you want to remove the eprintln! you can do so yourself. |
We're not in a hurry and I prefer to not interfere directly with contributors code if not needed as much possible. We'll come back to it when you're back. Like I said: we're not in a hurry. Thanks for your PR in any case! |
I replied to that review comment anyhow - currently waiting on a reply to that reply :). |
Bah, github just hadn't shown me your reply; have replied now. |
@rbtcollins: Tell me what you think of my changes (just look at the last commit for that). |
☔ The latest upstream changes (presumably #61343) made this pull request unmergeable. Please resolve the merge conflicts. |
Hi sorry for not getting back more quickly. There is a race condition in the patch added I think. add_error is atomic, but there is nothing that tells write_errors whether there is pending incomplete io that has yet to report an error. So between the end of cx.krate() and the call to write_errors(), io that has started but not completed will carry on after the end of run() - which it did in the first patch too. But then when the rayon runtime waits for it to complete, the error that it has will report into the ErrorStorage struct, nothing will consult the ErrorStorage struct, and the error will go unreported. |
I thought about it as well but was wondering if it'd slow down your implementation too much, at which point, making it useless. I'll update to wait for all threads to be completed before returning. |
I think if you use an mpsc, hand a tx.clone() into the thread, and send either a Result() or an Option back, and in run() you can just iter() the results, you'll be race free and won't need to muck with mutexes or anything else. Just make sure to drop the tx channel after krate() returns (so that iter() won't block once the last IO completes). |
Ok, I went for full channel system. However I'm worried about the possibility that rustdoc gets stuck in case a channel error occurs. |
src/librustdoc/docfs.rs
Outdated
/// Prints all stored errors. Returns the number of printed errors. | ||
pub fn write_errors(&self, diag: &errors::Handler) -> usize { | ||
let mut printed = 0; | ||
let nb_threads: usize = *self.nb_threads.borrow(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove nb_threads.
drop self.sender here, otherwise if a thread has paniced and not sent a response, you will block. (see below).
drop(self.sender)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't just drop it like that. I'll try to find a workaround.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, you can - I have in similar code elsewhere. Might need to use mem::replace in a pinch. Would you like me to do this iteration? I have a couple hours available.
src/librustdoc/docfs.rs
Outdated
pub struct ErrorStorage { | ||
sender: Sender<Option<String>>, | ||
receiver: Receiver<Option<String>>, | ||
nb_threads: RefCell<usize>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove nb_threads (see below)
0a59944
to
064c4d1
Compare
Updated! |
* Move fs::create_dir_all calls into DocFS to provide a clean extension point if async extension there is needed. * Convert callsites of create_dir_all to ensure_dir to reduce syscalls. * Convert fs::write usage to DocFS.write (which also removes a lot of try_err! usage for easier reading) * Convert File::create calls to use Vec buffers and then DocFS.write in order to consistently reduce syscalls as well, make deferring to threads cleaner and avoid leaving dangling content if writing to existing files.... * Convert OpenOptions usage similarly - I could find no discussion on the use of create_new for that one output file vs all the other files render creates, if link redirection attacks are a concern DocFS will provide a good central point to introduce systematic create_new usage. (fs::write/File::create is vulnerable to link redirection attacks). * DocFS::write defers to rayon for IO on Windows producing a modest speedup: before this patch on my development workstation: $ time cargo +mystg1 doc -p winapi:0.3.7 Documenting winapi v0.3.7 Finished dev [unoptimized + debuginfo] target(s) in 6m 11s real 6m11.734s Afterwards: $ time cargo +mystg1 doc -p winapi:0.3.7 Compiling winapi v0.3.7 Documenting winapi v0.3.7 Finished dev [unoptimized + debuginfo] target(s) in 49.53s real 0m49.643s I haven't measured how much time is in the compilation logic vs in the IO and outputting etc, but this takes it from frustating to tolerable for me, at least for now.
064c4d1
to
65f1295
Compare
Weird considering that the rebase had no conflicts but whatever... |
@bors: r=rbtcollins,GuillaumeGomez |
📌 Commit 65f1295 has been approved by |
…GuillaumeGomez Add DocFS layer to rustdoc * Move fs::create_dir_all calls into DocFS to provide a clean extension point if async extension there is needed. * Convert callsites of create_dir_all to ensure_dir to reduce syscalls. * Convert fs::write usage to DocFS.write (which also removes a lot of try_err! usage for easier reading) * Convert File::create calls to use Vec buffers and then DocFS.write in order to both consistently reduce syscalls as well as make deferring to threads cleaner. * Convert OpenOptions usage similarly - I could find no discussion on the use of create_new for that one output file vs all the other files render creates, if link redirection attacks are a concern DocFS will provide a good central point to introduce systematic create_new usage. * DocFS::write defers to rayon for IO on Windows producing a modest speedup: before this patch on my development workstation: $ time cargo +mystg1 doc -p winapi:0.3.7 Documenting winapi v0.3.7 Finished dev [unoptimized + debuginfo] target(s) in 6m 11s real 6m11.734s user 0m0.015s sys 0m0.000s Afterwards: $ time cargo +mystg1 doc -p winapi:0.3.7 Compiling winapi v0.3.7 Documenting winapi v0.3.7 Finished dev [unoptimized + debuginfo] target(s) in 49.53s real 0m49.643s user 0m0.000s sys 0m0.015s I haven't measured how much time is in the compilation logic vs in the IO and outputting etc, but this takes it from frustating to tolerable for me, at least for now.
…GuillaumeGomez Add DocFS layer to rustdoc * Move fs::create_dir_all calls into DocFS to provide a clean extension point if async extension there is needed. * Convert callsites of create_dir_all to ensure_dir to reduce syscalls. * Convert fs::write usage to DocFS.write (which also removes a lot of try_err! usage for easier reading) * Convert File::create calls to use Vec buffers and then DocFS.write in order to both consistently reduce syscalls as well as make deferring to threads cleaner. * Convert OpenOptions usage similarly - I could find no discussion on the use of create_new for that one output file vs all the other files render creates, if link redirection attacks are a concern DocFS will provide a good central point to introduce systematic create_new usage. * DocFS::write defers to rayon for IO on Windows producing a modest speedup: before this patch on my development workstation: $ time cargo +mystg1 doc -p winapi:0.3.7 Documenting winapi v0.3.7 Finished dev [unoptimized + debuginfo] target(s) in 6m 11s real 6m11.734s user 0m0.015s sys 0m0.000s Afterwards: $ time cargo +mystg1 doc -p winapi:0.3.7 Compiling winapi v0.3.7 Documenting winapi v0.3.7 Finished dev [unoptimized + debuginfo] target(s) in 49.53s real 0m49.643s user 0m0.000s sys 0m0.015s I haven't measured how much time is in the compilation logic vs in the IO and outputting etc, but this takes it from frustating to tolerable for me, at least for now.
…GuillaumeGomez Add DocFS layer to rustdoc * Move fs::create_dir_all calls into DocFS to provide a clean extension point if async extension there is needed. * Convert callsites of create_dir_all to ensure_dir to reduce syscalls. * Convert fs::write usage to DocFS.write (which also removes a lot of try_err! usage for easier reading) * Convert File::create calls to use Vec buffers and then DocFS.write in order to both consistently reduce syscalls as well as make deferring to threads cleaner. * Convert OpenOptions usage similarly - I could find no discussion on the use of create_new for that one output file vs all the other files render creates, if link redirection attacks are a concern DocFS will provide a good central point to introduce systematic create_new usage. * DocFS::write defers to rayon for IO on Windows producing a modest speedup: before this patch on my development workstation: $ time cargo +mystg1 doc -p winapi:0.3.7 Documenting winapi v0.3.7 Finished dev [unoptimized + debuginfo] target(s) in 6m 11s real 6m11.734s user 0m0.015s sys 0m0.000s Afterwards: $ time cargo +mystg1 doc -p winapi:0.3.7 Compiling winapi v0.3.7 Documenting winapi v0.3.7 Finished dev [unoptimized + debuginfo] target(s) in 49.53s real 0m49.643s user 0m0.000s sys 0m0.015s I haven't measured how much time is in the compilation logic vs in the IO and outputting etc, but this takes it from frustating to tolerable for me, at least for now.
Rollup of 9 pull requests Successful merges: - #60971 (Add DocFS layer to rustdoc) - #61146 (SliceConcatExt::connect defaults to calling join) - #61181 (Fix theme-checker failure) - #61267 (rustc-book: Update the rustc/clang compatibility table for xLTO.) - #61270 (Remove warnings about incr. comp. generating less debugging output.) - #61681 (Changed the error message to more clearly explain what is allowed) - #61984 (More NodeId pruning) - #62016 (Add test for issue-27697) - #62019 (Remove needless lifetimes) Failed merges: r? @ghost
👋 @rbtcollins this is a really cool improvement - is there any reason you made it windows-only? |
At the time I put this together we hadn't identified IO performance
bottlenecks in rustup on non windows platforms, and the increased memory
pressure of the naive approach this takes can be poor for small resource
platforms such as raspberry pi.
That said, we didn't test it on small platforms to see if in the rustdoc
case it would be ok, perhaps it would have been.
The current version in rustup with hard memory bounds and incremental
writes for the occasional very large file would be much more suitable for
unconditional use.
…On Sun, 22 Aug 2021, 03:38 Joshua Nelson, ***@***.***> wrote:
👋 @rbtcollins <https://github.com/rbtcollins> this is a really cool
improvement - is there any reason you made it windows-only?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#60971 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AADZ7XQFFVWYIWUUTIDTUI3T6BIIJANCNFSM4HN6BDMQ>
.
|
Oh, so there's an updated version of this in use by rustup? That seems useful :) rather than copying the code back and forth, would you be ok with having a repo in rust-lang both rustup and rustdoc use? |
https://github.com/rust-lang/rustup/tree/master/src/diskio is what we have. Its not exactly the same interface, and never really was, as the solution put together for rustdoc. I'm entirely fine with work being done to make that usable by rustdoc, but:
So perhaps start by looking at what we have and see how well it fits with your needs - its not beautiful code to be honest; we hope that the next iteration, async based, with appropriate platform specific drivers (e.g. iouring(linux) + iocp(windows) + kqueue(mac)... will be lovely and clean, now we've learnt all the lessons we needed to. |
rustdoc: reduce number of copies when using parallel IO This is Windows-only for now; I was getting really bad slowdowns from this on linux for some reason. Helps with rust-lang#82741. Follow-up to rust-lang#60971.
extension point if async extension there is needed.
(which also removes a lot of try_err! usage for easier reading)
in order to both consistently reduce syscalls as well as make
deferring to threads cleaner.
the use of create_new for that one output file vs all the other
files render creates, if link redirection attacks are a concern
DocFS will provide a good central point to introduce systematic
create_new usage.
speedup: before this patch on my development workstation:
$ time cargo +mystg1 doc -p winapi:0.3.7
Documenting winapi v0.3.7
Finished dev [unoptimized + debuginfo] target(s) in 6m 11s
real 6m11.734s
user 0m0.015s
sys 0m0.000s
Afterwards:
$ time cargo +mystg1 doc -p winapi:0.3.7
Compiling winapi v0.3.7
Documenting winapi v0.3.7
Finished dev [unoptimized + debuginfo] target(s) in 49.53s
real 0m49.643s
user 0m0.000s
sys 0m0.015s
I haven't measured how much time is in the compilation logic vs in the
IO and outputting etc, but this takes it from frustating to tolerable
for me, at least for now.