-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve incremental build: make ninja handle dynamic outputs #1953
base: master
Are you sure you want to change the base?
Conversation
This tool list all the output generated by the graph, including dynamic output if they have been built.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation is missing. At least the mention of the dynout
stuff and the grammar for the expected format of it. Test cases would also be really nice to see.
Other than that, it looks sensible.
src/graph.cc
Outdated
@@ -614,6 +628,113 @@ bool ImplicitDepLoader::ProcessDepfileDeps( | |||
return true; | |||
} | |||
|
|||
|
|||
bool ImplicitDepLoader::LoadDynOutFile(State* state, DiskInterface* disk_interface, Edge* edge, const string& path, | |||
vector<Node*>* nodes, int* outputs_count, string* err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is best written in a more straight-forward parser way (see lexer.in.cc
and its re2c
comments).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have moved the parser code into dynout_parser.cpp
, it did not make sense to have it in graph.cc
but only used in build.cc
. I am not using re2c
because the parsing is for now very straightforward. It is just looking for new line.
src/ninja.cc
Outdated
@@ -1009,6 +1047,8 @@ const Tool* ChooseTool(const string& tool_name) { | |||
Tool::RUN_AFTER_LOGS, &NinjaMain::ToolQuery }, | |||
{ "targets", "list targets by their rule or depth in the DAG", | |||
Tool::RUN_AFTER_LOAD, &NinjaMain::ToolTargets }, | |||
{ "outputs", "list all outputs of the build graph, include dynamic outputs if there are and they have been built.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if there are
→ if there are any
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
@@ -116,6 +116,8 @@ bool DepsLog::RecordDeps(Node* node, TimeStamp mtime, | |||
size |= 0x80000000; // Deps record: set high bit. | |||
if (fwrite(&size, 4, 1, file_) < 1) | |||
return false; | |||
if (fwrite(&outputs_count, 4, 1, file_) < 1) | |||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this change the deps log format version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have bumped to deps log version from 4 to 5.
@mathstuf thank you, I will revise my implementation. |
@mathstuf I have added the documentation for the |
I have added a debug option |
94dbb74
to
94a5f65
Compare
I think the implementation can be considered feature complete. I have added test, debug option, consider the I will continue refining it if necessary to make it reach the quality to be integrated to master. |
@@ -641,6 +655,68 @@ bool ImplicitDepLoader::LoadDepsFromLog(Edge* edge, string* err) { | |||
return true; | |||
} | |||
|
|||
bool ImplicitDepLoader::LoadOutputsFromLog(Edge* edge, string* err) { | |||
// NOTE: deps are only supported for single-target edges. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was fixed in #1534.
case DiskInterface::Okay: | ||
break; | ||
case DiskInterface::NotFound: | ||
err->clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
err
can by nullptr
; this must be guarded. Same with all other uses of err
.
@@ -855,6 +855,12 @@ keys. | |||
stored as `.ninja_deps` in the `builddir`, see <<ref_toplevel,the | |||
discussion of `builddir`>>. | |||
|
|||
`dynout`:: path to an optional _dynout file_ that contains the list | |||
of outputs generated by the rule. The dynout file syntax except one | |||
path per line. This is to make Ninja aware of dynamic outputs, so |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is expeccts one path per line
? Are there any escaping mechanisms? I guess this means that paths with embedded newlines are not supported (not that I expect they are anywhere else either, but it'd be good to know before reviewing the parsing code)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My most recent comments still seem to have not been addressed. Just formalizing into a review.
Yes, I am not focusing on that project for now. I do not know when I will
go back to it.
Le mar. 20 juil. 2021 à 21:42, Ben Boeckel ***@***.***> a
écrit :
… ***@***.**** requested changes on this pull request.
My most recent comments still seem to have not been addressed. Just
formalizing into a review.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#1953 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAHSSFDWQWSHBEC3XFCEVUTTYVVMVANCNFSM427TVDZQ>
.
|
Hi! |
@HampusAdolfsson I don't mind. I don't have the energy to finish it and I will be happy to see it merged! |
We've been using this solution for quite some years, successfully. Until now, when we uncovered a hidden problem that it can cause. Assume I have one code generator creating file a.h. Ninja becomes aware of that via the use of dynout. Hence, it stores in .ninja_deps that this one code generator creates a.h. This example was simplified for easier understanding, in fact, we switch back and forth between two code generators whereas only one of them will be active per build. But both create a.h and write to .ninja_deps. As we switch back and forth, simply deleting .ninja_deps is not really a solution for us. Dropping this here to warn others, but also happy to take suggestions how this could be solved. |
@johanneslerch This PR is outdated, please use #2366 where discussion of this feature is currently ongoing. To address your point: This would likely also solve your issue. If for some build, node X stops producing a.h and node Y starts producing it, they will each produce a new dynout, which is then stored in the depslog. At the start of the next build, the build graph produced from the depslog will be both valid and up-to-date. |
// Add the dyndep-discovered outputs to the edge. | ||
edge->outputs_.insert(edge->outputs_.end(), implicit_outputs.begin(), | ||
implicit_outputs.end()); | ||
edge->implicit_outs_ += implicit_outputs.size(); | ||
|
||
// Add this edge as incoming to each new output. | ||
for (std::vector<Node*>::const_iterator i = implicit_outputs.begin(); | ||
i != implicit_outputs.end(); ++i) { | ||
if (Edge* old_in_edge = (*i)->in_edge()) { | ||
// This node already has an edge producing it. Fail with an error | ||
// unless the edge was generated by ImplicitDepLoader, in which | ||
// case we can replace it with the now-known real producer. | ||
if (!old_in_edge->generated_by_dep_loader_) { | ||
*err = "multiple rules generate " + (*i)->path(); | ||
return false; | ||
} | ||
old_in_edge->outputs_.clear(); | ||
} | ||
(*i)->set_in_edge(edge); | ||
} | ||
|
||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi again, I'm aware that this PR is outdated, but that's what we based our version of Ninja on, so we have some interest to fix this one here. However, we do want to contribute back our insights that may help others and are of course interested in receiving feedback if we miss something.
// Add the dyndep-discovered outputs to the edge. | |
edge->outputs_.insert(edge->outputs_.end(), implicit_outputs.begin(), | |
implicit_outputs.end()); | |
edge->implicit_outs_ += implicit_outputs.size(); | |
// Add this edge as incoming to each new output. | |
for (std::vector<Node*>::const_iterator i = implicit_outputs.begin(); | |
i != implicit_outputs.end(); ++i) { | |
if (Edge* old_in_edge = (*i)->in_edge()) { | |
// This node already has an edge producing it. Fail with an error | |
// unless the edge was generated by ImplicitDepLoader, in which | |
// case we can replace it with the now-known real producer. | |
if (!old_in_edge->generated_by_dep_loader_) { | |
*err = "multiple rules generate " + (*i)->path(); | |
return false; | |
} | |
old_in_edge->outputs_.clear(); | |
} | |
(*i)->set_in_edge(edge); | |
} | |
return true; | |
bool hasConflict = false; | |
for (std::vector<Node*>::const_iterator i = implicit_outputs.begin(); | |
i != implicit_outputs.end(); ++i) { | |
if (Edge* old_in_edge = (*i)->in_edge()) { | |
// This node already has an edge producing it. | |
// This can mean that there is a conflict of two build statements producing the same output, | |
// but it can also mean that the dynout information within .ninja_deps file is outdated and | |
// the dynamic output is now an explicit output of another build statement. | |
// We can not be sure here if the conflict is real or not, so we return false to mark the output as dirty. | |
// The conflict detection needs to happen when processing the dynout file from within Builder::FinishCommand. | |
EXPLAIN("Dynamic output '%s' of a previous execution can now be created via '%s', so we consider '%s' as dirty.", | |
(*i)->path().c_str(), old_in_edge->rule_->name().c_str(), output->path().c_str()); | |
hasConflict = true; | |
} else { | |
edge->outputs_.push_back(*i); | |
edge->implicit_outs_++; | |
(*i)->set_in_edge(edge); | |
} | |
} | |
return !hasConflict; |
We've put some examples here that we used to reproduce the problem mentioned earlier, and then also to confirm it's gone: https://github.com/johanneslerch/ninja-dynouts-example
ninja-1.11.1.conti.2.exe is based on Ninja 1.11.1 plus some contributions like this PR.
ninja-1.11.1.conti.3.exe is having the fix suggested in this review comment. This version works like we expect it to.
ninja-pr2366-Jun12.exe is built from #2366. It is not behaving according to our expectations. It's basically re-executing build statements always and also not detecting conflicts of multiple build statements producing the same output file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coming back to this one. In the meantime we discovered this does not work either in all situations. For now, we discontinued working on this one.
This PR is about making ninja being aware of dynamic outputs, it is related to this discussion on google groups.
The Problem
The general problem I am trying to solve is improving the correctness of the incremental build. I think we can say the incremental build is correct when the result of running ninja is the same as doing a full build (enforcing all rule command to be executed again). Today there are at least two scenarios where this is not the case:
Example In Practice
In my project, we are currently having a C++ code generatorwhich generates several header files out of the same source of data (one header per class). If someone delete or modified the header files in local, ninja will not be aware of that and won't re-run the generator. This issue can break the build or worse result in unexpected behavior of the application. When people report issues which could be related to incremental build issues, the first step is often "have you rebuild from scratch?". I would like to avoid the situation where people have to rebuilt from scratch to be sure their build state is correct.
My Solution
One solution to make ninja aware about dynamic outputs, is to have a mechanism similar to depfile to inform ninja about dynamic outputs during the build. My implementation introduce the
dynout
attribute which indicate the path to a file generated by the rule command containing the list of the outputs. The current syntax is simply having file path per line.Example:
The dynamic outputs are stored in the deps log. As a matter of implementation it is very straightforward, I think it is the way to go. But it regarding naming, "deps" would not fit the concept anymore, because deps log will now contain dynamic dependencies as well as dynamic outputs.
I have also added a tool to list all the output including dynamic outputs in order to diagnosis the build.