Improve performance of almost fresh builds #8837
Changes from all commits: `0c5f6f0`, `3836dfb`, `49b63b7`, `0583081`, `50c0221`
```diff
@@ -162,10 +162,30 @@ impl std::fmt::Display for JobId {
     }
 }
 
+/// A `JobState` is constructed by `JobQueue::run` and passed to `Job::run`. It includes everything
+/// necessary to communicate between the main thread and the execution of the job.
+///
+/// The job may execute on either a dedicated thread or the main thread. If the job executes on the
+/// main thread, the `output` field must be set to prevent a deadlock.
 pub struct JobState<'a> {
     /// Channel back to the main thread to coordinate messages and such.
+    ///
+    /// When the `output` field is `Some`, care must be taken to avoid calling `push_bounded` on
+    /// the message queue to prevent a deadlock.
     messages: Arc<Queue<Message>>,
 
+    /// Normally output is sent to the job queue with backpressure. When the job is fresh
+    /// however we need to immediately display the output to prevent a deadlock as the
+    /// output messages are processed on the same thread as they are sent from. `output`
+    /// defines where to output in this case.
+    ///
+    /// Currently the `Shell` inside `Config` is wrapped in a `RefCell` and thus can't be passed
+    /// between threads. This means that it isn't possible for multiple output messages to be
+    /// interleaved. In the future, it may be wrapped in a `Mutex` instead. In this case
+    /// interleaving is still prevented as the lock would be held for the whole printing of an
+    /// output message.
+    output: Option<&'a Config>,
+
     /// The job id that this state is associated with, used when sending
     /// messages back to the main thread.
     id: JobId,
```
```diff
@@ -231,12 +251,24 @@ impl<'a> JobState<'a> {
             .push(Message::BuildPlanMsg(module_name, cmd, filenames));
     }
 
-    pub fn stdout(&self, stdout: String) {
-        self.messages.push_bounded(Message::Stdout(stdout));
+    pub fn stdout(&self, stdout: String) -> CargoResult<()> {
+        if let Some(config) = self.output {
+            writeln!(config.shell().out(), "{}", stdout)?;
+        } else {
+            self.messages.push_bounded(Message::Stdout(stdout));
+        }
+        Ok(())
     }
 
-    pub fn stderr(&self, stderr: String) {
-        self.messages.push_bounded(Message::Stderr(stderr));
+    pub fn stderr(&self, stderr: String) -> CargoResult<()> {
+        if let Some(config) = self.output {
+            let mut shell = config.shell();
+            shell.print_ansi(stderr.as_bytes())?;
+            shell.err().write_all(b"\n")?;
+        } else {
+            self.messages.push_bounded(Message::Stderr(stderr));
+        }
+        Ok(())
    }
 
     /// A method used to signal to the coordinator thread that the rmeta file
```
```diff
@@ -826,16 +858,9 @@ impl<'cfg> DrainState<'cfg> {
             self.note_working_on(cx.bcx.config, unit, fresh)?;
         }
 
-        let doit = move || {
-            let state = JobState {
-                id,
-                messages: messages.clone(),
-                rmeta_required: Cell::new(rmeta_required),
-                _marker: marker::PhantomData,
-            };
-
+        let doit = move |state: JobState<'_>| {
             let mut sender = FinishOnDrop {
-                messages: &messages,
+                messages: &state.messages,
                 id,
                 result: None,
             };
```
```diff
@@ -854,7 +879,9 @@ impl<'cfg> DrainState<'cfg> {
             // we need to make sure that the metadata is flagged as produced so
             // send a synthetic message here.
             if state.rmeta_required.get() && sender.result.as_ref().unwrap().is_ok() {
-                messages.push(Message::Finish(id, Artifact::Metadata, Ok(())));
+                state
+                    .messages
+                    .push(Message::Finish(state.id, Artifact::Metadata, Ok(())));
             }
 
             // Use a helper struct with a `Drop` implementation to guarantee
```
@@ -880,10 +907,31 @@ impl<'cfg> DrainState<'cfg> { | |
}; | ||
|
||
match fresh { | ||
Freshness::Fresh => self.timings.add_fresh(), | ||
Freshness::Dirty => self.timings.add_dirty(), | ||
Freshness::Fresh => { | ||
self.timings.add_fresh(); | ||
// Running a fresh job on the same thread is often much faster than spawning a new | ||
// thread to run the job. | ||
doit(JobState { | ||
id, | ||
messages: messages.clone(), | ||
output: Some(cx.bcx.config), | ||
rmeta_required: Cell::new(rmeta_required), | ||
_marker: marker::PhantomData, | ||
}); | ||
} | ||
Freshness::Dirty => { | ||
self.timings.add_dirty(); | ||
scope.spawn(move |_| { | ||
doit(JobState { | ||
id, | ||
messages: messages.clone(), | ||
output: None, | ||
rmeta_required: Cell::new(rmeta_required), | ||
_marker: marker::PhantomData, | ||
}) | ||
}); | ||
} | ||
} | ||
scope.spawn(move |_| doit()); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was an amount of back and forth about this awhile ago. Originally Cargo spawned threads for everything, but then I eventually optimized it to do what you have here. In #7838 we reverted back to what we have today, but IIRC @ehuss did some measurements and found the cost to be negligible. I can't seem to find a record of that conversation though, @ehuss do you remember that? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. From the PR description of #7838:
This roughly matches my results. This PR keeps not buffering the rustc output, but for fresh jobs it directly sends it to the terminal without the indirection through the message queue, thereby slightly improving performance. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. One issue I believe with this PR is that it can still deadlock. If the message queue is full and a fresh job tries to push onto it then the deadlock happens. That won't happen for stdout but other messages go through the queue as well (like build plans and such). Another issue which I'm having trouble remembering is that we want the main "event loop" to complete quickly each iteration, and if we're copying a lot of rustc output onto the console that isn't happening. I forget the consequences of a slow loop iteration though. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The other messages don't use the bounded There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Just the note in that PR that it was expected to cost about 5%. That number will vary significantly based on the project and the system. I ran some tests with this PR, and it is pretty much the same (no difference on macos, about 5% on Linux). My main concern from back then with this approach is that it introduces some subtle complexity that didn't really seem to be worth the 5% improvement, but I can't think of any specific problems with this PR other than being harder to understand. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah sorry I forgot about @ehuss I'm curious, how are you measuring? I'd expect that thread creation on Windows and macOS to be a good deal more expensive than on Linux There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just using hyperfine, like this:
In various projects of various sizes. Larger projects should show a bigger difference. I just re-ran the test on a larger project (libra, 600+ packages), and the numbers look comparable to linux, so I was probably just comparing a smaller project where the difference was less noticeable.
|
||
|
||
Ok(()) | ||
} | ||
|
*Review thread on the `output` field:*

**Reviewer:** Can you include some notes that it is crucial that this is only set on the main thread? There are several concerns:

- A `RefMut` from shell can cause panics if used between threads.

**Author:** The `Config` is not `Sync`, so it isn't possible to assign it when not running on the main thread. Also, because `Shell` is wrapped in a `RefCell`, it isn't possible to borrow it twice at the same time to interleave output.

**Reviewer:** I see. Still, please add a comment discussing the concerns about the coordination of ownership of the output. My concern about interleaving is about the future, and things to watch out for if this is ever changed (like wrapping output in a mutex). It might also help to explain how `JobState` works with respect to how it is constructed and passed into the job threads.

**Author:** Interleaving would still not be possible when the `RefCell` is replaced with a `Mutex`, as the mutex would be locked for the duration of the printing of a single message. It would be hard to accidentally unlock the mutex in the middle of printing. I can add some more docs to `JobState`.

**Author:** Done