Restat rules that generate build.ninja can crash ninja #874
The stack looks similar to the stack I'm seeing in the first comments in issue #867 – are you getting a multiple rules warning?
I don't get a multiple rules warning, so might not be exactly the same issue as #867. Here is a reduced build.ninja that crashes for me:
Then:
It's very sensitive to tiny modifications of the build.ninja file. Reordering the build statements, removing the redundant-looking build statements, or removing the default statements will all stop the crash from happening.
I think I've found the problem. Here's a slightly simplified build.ninja that triggers it (the depfile was a red herring):
The important files are 1, dep, and 5. 2, 3, and 4 are necessary to get 5 building twice concurrently in order to show the crash - without them 5 builds twice sequentially, which doesn't cause the running_edges_ crash - but they are not involved in the dependency graph that causes the double building.

When ninja is run, it first attempts to rebuild build.ninja. This causes the dependency chain 1 -> dep -> build.ninja to rebuild, but the final rule to rebuild build.ninja is a restat rule that doesn't update the output file, so ninja.RebuildManifest returns false and ninja continues to ninja.RunBuild. At this point the output edges for 1 are marked outputs_ready_ == true, but mtime_ is still set to the original mtime from before ninja rebuilt 1. This mismatch in state is what triggers the double building and, later, the crash.

ninja.RunBuild calls AddTarget on the default targets in file order: 2, 3, 4, and 5, and then finally 1. When it calls AddTarget on 5, it sees that all of 5's inputs are ready, because 1 has set outputs_ready_ == true, and it starts building 5. Then it calls AddTarget on 1, which checks the old cached mtime_ of 1 against the mtime of in.file, sees that it's older, and builds 1 again. When 1 finishes, it iterates over its out_edges, sees that 5 has all its inputs ready, and schedules 5 again. If 5 builds twice concurrently (helped in the build.ninja file above by the sleep .1), running_edges_.find() returns map::end, which crashes running_edges_.erase().

I'm not sure what the right fix is here. Update the cached mtime to be the mtime of the most recent input when a node finishes? Always restat after the RebuildManifest step by restarting the cycle loop?
Great analysis! Maybe we can forbid
That change is only an optimization, so I think it would be fine to revert it for now. We should eventually fix running multiple builds over the same manifest. This will be necessary for continuous builds to work, and I have occasionally observed incorrect builds on my continuous build branch which may be related to this issue. That can all come later, though.
Should be better at head.
As a fix for ninja-build#874, we started reloading the entire manifest even if the manifest was never rebuilt due to a restat rule. But this can be slow, so call State::Reset instead, which also fixes the original crash. Fixes ninja-build#987

Test: replicate original crash, test with this change
Test: do a build that rebuilds manifest, ensure it reloads
Test: do a build that uses restat, verify that it's faster
Change-Id: Ifeada4afa1717a3691f2e787d1135c0489864629
Signed-off-by: mydongistiny <jaysonedson@gmail.com>
I'm seeing a crash in ninja (master and v1.5.3):
(gdb) bt
#0 0x00007f3b268dbbb9 in __GI_raise (sig=sig@entry=6)
#1 0x00007f3b268defc8 in __GI_abort () at abort.c:89
#2 0x00007f3b26918e14 in __libc_message (do_abort=do_abort@entry=1,
#3 0x00007f3b269250ee in malloc_printerr (ptr=&lt;optimized out&gt;,
#4 _int_free (av=&lt;optimized out&gt;, p=&lt;optimized out&gt;, have_lock=0)
#5 0x0000000000409d65 in deallocate (this=&lt;optimized out&gt;,
#6 _M_put_node (this=&lt;optimized out&gt;, __p=&lt;optimized out&gt;)
#7 _M_destroy_node (this=&lt;optimized out&gt;, __p=&lt;optimized out&gt;)
#8 _M_erase_aux (__position=..., this=0x2c14a40)
#9 erase (__position=..., this=0x2c14a40)
#10 erase (__position=..., this=0x2c14a40)
#11 BuildStatus::BuildEdgeFinished (this=0x2c14a20, edge=edge@entry=0x2391220,
#12 0x000000000040a14c in Builder::FinishCommand (
#13 0x000000000040ab9f in Builder::Build (this=this@entry=0x7fff9a8c00f0,
#14 0x0000000000405b1f in RunBuild (argv=&lt;optimized out&gt;,
#15 (anonymous namespace)::real_main (argc=&lt;optimized out&gt;,
#16 0x00007f3b268c6ec5 in __libc_start_main (
#17 0x0000000000402ac0 in _start ()
I've tracked this down to a file being built twice, which results in downstream dependencies being built twice. If the downstream dependency is slow to build it can run concurrently, and the second one to finish crashes when running_edges_.end(), returned from running_edges_.find(edge), is passed into running_edges_.erase(i).
This vastly simplified build.ninja repros the spurious second build (but not the crash, even if I try to add a slow-to-build file that depends on out1.file, so it may not be triggering the exact same problem):
rule notouch
command = # touch ${out}
restat = true
rule touch
command = touch ${out}
build out1.file : touch in.file
default out1.file
build out2.file: notouch out1.file
build build.ninja : notouch
depfile = build.ninja.d
And build.ninja.d:
build.ninja:
out2.file
Repro steps:
touch in.file && ninja -d explain -v
ninja explain: output out1.file older than most recent input in.file (1417576914 vs 1417576918)
ninja explain: out1.file is dirty
ninja explain: out2.file is dirty
[1/3] touch out1.file
[2/3] # touch out2.file
ninja explain: output out1.file older than most recent input in.file (1417576914 vs 1417576918)
[1/1] touch out1.file