cc-wrapper: improve response file parsing speed #26974
@ryantrinkle, thanks for your PR! By analyzing the history of the files in this pull request, we identified @orivej, @edolstra, @pikajude, @LnL7 and @copumpkin to be potential reviewers.
Ah, the merge conflict is because @orivej also pushed a performance improvement to staging. This should still make it quite a bit faster than that too, though.
FYI I believe this might be the PR @Ericson2314 was talking about #26554
d2b3480 to 2dd4760
A question to decide on is whether we want to keep around the bash version as a fallback, rather than breaking. The error message says "not allowed when bootstrapping", but then again it's actually quite early in the bootstrapping process that this is allowed.
If bash parser in
2dd4760 to d07f30f
@orivej I have no complaint if you'd rather use your code. My goal was just to solve the issue while keeping the code, dependencies, and bootstrapping logic as simple as possible. I'm not familiar enough with C++'s toolchain to know whether it would present any difficulties. I did have to restrain myself from trying to do it in Haskell, though: attoparsec would've made this quite a bit nicer! BTW, the file I've been testing with is responsefile.txt.
I had not considered that the C++ toolchain is not available as soon as the C toolchain is. I appreciate the work you did to integrate a compiled rsp parser into the bootstrap process, thank you! Yet
I preferred the C++ version, assuming it had the same semantics.
If C++ is an option for early Nix bootstrap, I made an updated version available at the cpp branch. The only semantic change was about handling incorrect .rsp files as gcc, e.g. Current timings of parsing responsefile.txt with C,
Alternatively, the bash version could be optimized somewhat further:
Can we get anything that improves the situation now, as opposed to what might be nice to have at some later date? The initial 75,000x improvement seems interesting enough to include regardless of whether it could be made faster in another PR. At least that's the process I would like to see.
Yes, merge nixpkgs
This PR is for the
OK, so we are now just waiting for someone to push the merge button then, unless there are more comments from others.
No, merging this PR has nothing to do with merging
@orivej I did not say there was, since the top says
I'd wait for @ryantrinkle on
I am in favor of merging this change. I don't fully understand the implications of adding a C program into the cc wrapper, but intuitively I'd assume that this is no problem. It would be nice to hear what @vcunat thinks about that issue.
name = "parse-response-file";
src = ./parseResponseFile.c;
buildCommand = ''
  # Make sure the output file doesn't refer to the input nix path
What do you mean? Why not "$CC" -o "$out" "$src"?
We got a runtime dependency on the source file, probably because of __FILE__ in the assert expansion.
That's plausible, but why is it undesirable?
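To illustrate the `__FILE__` explanation above, here is a minimal sketch (not the PR's code; both function names are hypothetical). Unless `NDEBUG` is defined, `assert` expands to code that reports `__FILE__` on failure, so the path the compiler was given for the source file is stored as a string literal in the binary. If that path is a Nix store path, Nix's scan of the output for store path hashes would treat the source file as a runtime dependency.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns the path the compiler recorded for this
   translation unit. assert() passes the same literal to its failure
   handler, which is how the path can end up inside the binary. */
static const char *recorded_source_path(void) {
    return __FILE__;
}

/* Hypothetical function using assert. With NDEBUG undefined, the
   expansion of assert() references __FILE__ and __LINE__. */
static size_t checked_len(const char *s) {
    assert(s != NULL);
    return strlen(s);
}
```

Compiling with `-DNDEBUG`, or passing the compiler a path that is not in the store, are two ways such a reference could be avoided; which approach the PR takes is not shown in this excerpt.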
I think C++ should be fine. Also, if I recall correctly, there is some
This PR is ready for merge.
Let's wait on approval from @Ericson2314 and @ryantrinkle and merge :)
@domenkozar This is good to go.
void append(String *s, const char *data, size_t len) {
  resize(s, s->len + len);
  memcpy(s->data + s->len - len, data, len);
}
What's this custom string type (above) + append for? Why can't this use std::string?
Ah, I got confused because there's a proposed C++ implementation, but this one is not that; this is a C implementation. So never mind.
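For readers following the review, the `append` snippet above depends on a `String` type and a `resize` helper that are not shown in this excerpt. A minimal sketch of how they might look follows; the struct fields and the growth policy are assumptions, not the PR's actual definitions. The key point is that `resize` updates `s->len` to the new length, which is why `append` copies to `s->data + s->len - len`, the old end of the string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout: a growable byte buffer with length and capacity. */
typedef struct {
    char *data;
    size_t len;
    size_t cap;
} String;

/* Assumed helper: set the logical length to new_len, doubling the
   allocation as needed so repeated appends stay amortized O(1). */
static void resize(String *s, size_t new_len) {
    if (new_len > s->cap) {
        size_t cap = s->cap ? s->cap : 16;
        while (cap < new_len) cap *= 2;
        char *p = realloc(s->data, cap);
        if (p == NULL) abort();
        s->data = p;
        s->cap = cap;
    }
    s->len = new_len;
}

/* Same body as the reviewed snippet: after resize(), s->len already
   includes len, so the destination is the previous end of the data. */
static void append(String *s, const char *data, size_t len) {
    resize(s, s->len + len);
    memcpy(s->data + s->len - len, data, len);
}
```

Starting from a zero-initialized `String`, appending `"foo"` and then `"bar"` leaves `len == 6` with the bytes `foobar` in `data`.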
I found that even after this PR merged (nixpkgs commit aad2a21 for me),
Fixed in #27657.
Motivation for this change
Although #25205 improved the correctness of response file handling in cc-wrapper, it also devastated performance for large response files, e.g. building Haskell packages with many dependencies. This commit replaces the bash-based parser with one written in C. Although bootstrapping is somewhat more complex with this approach, the new code is ~75,000x faster for my test case.
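As a rough illustration of why a compiled parser can handle large response files quickly, here is a single-pass tokenizer sketch. It is not the code from this PR: the function name `parse_response` and its interface are hypothetical, and the quoting rules (whitespace separates arguments, single or double quotes group text, a backslash escapes the next character) are an assumption modeled on GCC-style response files.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: split `input` into arguments in one pass.
   `buf` must hold at least strlen(input) + 1 bytes; the arguments are
   written into it NUL-terminated, and args[i] points at each one.
   Returns the number of arguments stored (at most max_args). */
static size_t parse_response(const char *input, char *buf,
                             char **args, size_t max_args) {
    size_t count = 0;
    const char *p = input;
    char *out = buf;

    while (count < max_args) {
        while (isspace((unsigned char)*p)) p++;   /* skip separators */
        if (*p == '\0') break;

        args[count++] = out;
        char quote = '\0';                        /* active quote, or 0 */

        for (; *p != '\0'; p++) {
            if (*p == '\\' && p[1] != '\0') {
                *out++ = *++p;                    /* escaped character */
            } else if (quote != '\0') {
                if (*p == quote) quote = '\0';    /* closing quote */
                else *out++ = *p;
            } else if (*p == '\'' || *p == '"') {
                quote = *p;                       /* opening quote */
            } else if (isspace((unsigned char)*p)) {
                break;                            /* end of argument */
            } else {
                *out++ = *p;
            }
        }
        *out++ = '\0';
    }
    return count;
}
```

Each input byte is examined once and written at most once, so the cost grows linearly with the file size. How the PR's parser handles malformed input, such as an unterminated quote, is not covered by this sketch.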
Things done
(nix.useSandbox on NixOS, or option build-use-sandbox in nix.conf on non-NixOS)
nix-shell -p nox --run "nox-review wip"
./result/bin/
CC @orivej