Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OSX: must run make twice to build json dependency #6

Closed
dsummersl opened this issue Jun 8, 2013 · 5 comments
Closed

OSX: must run make twice to build json dependency #6

dsummersl opened this issue Jun 8, 2013 · 5 comments

Comments

@dsummersl
Copy link

When I build this on OSX, I get the following error the first time I run make:

cc  -m64 -I/usr/local/Cellar/pcre/8.33/include -Wall -Wextra -Wdeclaration-after-statement -g -gdwarf-2 -fno-omit-frame-pointer   -o tests/bser.t tests/tests_bser_t-bser.o tests_bser_t-bser.o -L. -lwmanjson -ltap -L/usr/local/Cellar/pcre/8.33/lib -lpcre
ld: library not found for -lwmanjson
clang: error: linker command failed with exit code 1 (use -v to see invocation)

The second time I execute the make command, the library builds.

Any ideas why it wouldn't work the first time around?

@wez
Copy link
Contributor

wez commented Jun 8, 2013

Sounds like a dependency problem. Are you running make -j?

@dsummersl
Copy link
Author

No, I'm just using make (no -j).

First time around, I see:

...
...
mv -f thirdparty/jansson/.deps/libwmanjson_a-utf.Tpo thirdparty/jansson/.deps/libwmanjson_a-utf.Po
cc -DHAVE_CONFIG_H -I.  -Ithirdparty/jansson -D_REENTRANT  -m64 -I/usr/local/Cellar/pcre/8.33/include -Wall -Wextra -Wdeclaration-after-statement -g -gdwarf-2 -fno-omit-frame-pointer -MT thirdparty/libtap_a-tap.o -MD -MP -MF thirdparty/.deps/libtap_a-tap.Tpo -c -o thirdparty/libtap_a-tap.o `test -f 'thirdparty/tap.c' || echo './'`thirdparty/tap.c
mv -f thirdparty/.deps/libtap_a-tap.Tpo thirdparty/.deps/libtap_a-tap.Po
mv -f thirdparty/jansson/.deps/libwmanjson_a-value.Tpo thirdparty/jansson/.deps/libwmanjson_a-value.Po
...
...
cc  -m64 -I/usr/local/Cellar/pcre/8.33/include -Wall -Wextra -Wdeclaration-after-statement -g -gdwarf-2 -fno-omit-frame-pointer   -o tests/log.t tests/tests_log_t-log.o tests_log_t-log.o -L. -lwmanjson -ltap -L/usr/local/Cellar/pcre/8.33/lib -lpcre
ld: library not found for -lwmanjson
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [tests/log.t] Error 1
make[1]: *** Waiting for unfinished jobs....
mv -f tests/.deps/tests_argv_t-argv.Tpo tests/.deps/tests_argv_t-argv.Po
make: *** [all] Error 2

(which fails at the line I mentioned earlier).

Second time around I see:

rm -f libwmanjson.a
ar cru libwmanjson.a thirdparty/jansson/libwmanjson_a-dump.o thirdparty/jansson/libwmanjson_a-error.o thirdparty/jansson/libwmanjson_a-hashtable.o thirdparty/jansson/libwmanjson_a-load.o thirdparty/jansson/libwmanjson_a-memory.o thirdparty/jansson/libwmanjson_a-pack_unpack.o thirdparty/jansson/libwmanjson_a-strbuffer.o thirdparty/jansson/libwmanjson_a-strconv.o thirdparty/jansson/libwmanjson_a-utf.o thirdparty/jansson/libwmanjson_a-value.o 
rm -f libtap.a
ar cru libtap.a thirdparty/libtap_a-tap.o 
ranlib libwmanjson.a
ranlib libtap.a
cc  -m64 -I/usr/local/Cellar/pcre/8.33/include -Wall -Wextra -Wdeclaration-after-statement -g -gdwarf-2 -fno-omit-frame-pointer   -o watchman watchman-argv.o watchman-opt.o watchman-cfg.o watchman-log.o watchman-json.o watchman-bser.o watchman-hash.o watchman-ht.o cmds/watchman-find.o cmds/watchman-info.o cmds/watchman-log.o cmds/watchman-query.o cmds/watchman-since.o cmds/watchman-subscribe.o cmds/watchman-trigger.o cmds/watchman-watch.o query/watchman-base.o query/watchman-parse.o query/watchman-init.o query/watchman-eval.o query/watchman-type.o query/watchman-suffix.o query/watchman-match.o query/watchman-pcre.o query/watchman-name.o query/watchman-fieldlist.o query/watchman-since.o query/watchman-empty.o watchman-listener.o watchman-clientmode.o watchman-main.o watchman-root.o watchman-rules.o watchman-state.o watchman-string.o -L. -lwmanjson -L/usr/local/Cellar/pcre/8.33/lib -lpcre
cc  -m64 -I/usr/local/Cellar/pcre/8.33/include -Wall -Wextra -Wdeclaration-after-statement -g -gdwarf-2 -fno-omit-frame-pointer   -o tests/argv.t tests/tests_argv_t-argv.o tests_argv_t-argv.o -L. -lwmanjson -ltap -L/usr/local/Cellar/pcre/8.33/lib -lpcre
cc  -m64 -I/usr/local/Cellar/pcre/8.33/include -Wall -Wextra -Wdeclaration-after-statement -g -gdwarf-2 -fno-omit-frame-pointer   -o tests/log.t tests/tests_log_t-log.o tests_log_t-log.o -L. -lwmanjson -ltap -L/usr/local/Cellar/pcre/8.33/lib -lpcre
cc  -m64 -I/usr/local/Cellar/pcre/8.33/include -Wall -Wextra -Wdeclaration-after-statement -g -gdwarf-2 -fno-omit-frame-pointer   -o tests/bser.t tests/tests_bser_t-bser.o tests_bser_t-bser.o -L. -lwmanjson -ltap -L/usr/local/Cellar/pcre/8.33/lib -lpcre

It looks like the wmanjson library isn't compiled into an archive the first time
around.

@wez
Copy link
Contributor

wez commented Jun 8, 2013

Try with this diff:

diff --git a/Makefile.am b/Makefile.am
index aa481f8..d632fa5 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -4,7 +4,7 @@ doc_DATA = README.markdown
 docdir = ${prefix}/share/doc/watchman-$(VERSION)

 THIRDPARTY_CPPFLAGS = -I$(top_srcdir)/thirdparty/jansson -I$(top_builddir)/thirdparty/jansson
-JSON_LIB = -L. -lwmanjson
+JSON_LIB = -L. libwmanjson.a

 watchman_CPPFLAGS = $(THIRDPARTY_CPPFLAGS) @IRONMANCFLAGS@
 watchman_LDADD = $(JSON_LIB)
@@ -72,7 +72,7 @@ libwmanjson_a_SOURCES = \
 # bundled testing library
 libtap_a_CPPFLAGS = $(THIRDPARTY_CPPFLAGS)
 libtap_a_SOURCES = thirdparty/tap.c
-TAP_LIB = -ltap
+TAP_LIB = libtap.a

 # unit tests
 TESTS = tests/argv.t tests/log.t tests/bser.t

@dsummersl
Copy link
Author

A slight mod to the patch (for v2.5.1 that I'm using) works fine, thanks! BTW I'm submitting a push for homebrew and I'll include this patch with it for v2.5.1 (Homebrew/legacy-homebrew#20349). Will this patch be included for future releases?

@wez
Copy link
Contributor

wez commented Jun 9, 2013

Yep, pushed d335591 and tagged it as 2.6; you can update your recipe to point to that; thanks for reporting this, and thanks for getting it into homebrew.

@wez wez closed this as completed Jun 9, 2013
wez pushed a commit that referenced this issue Oct 28, 2013
Summary: A failing (on OSX) test case as evidence that the current OSX implementation doesn't notice file changes.

Test Plan:
unit test

Here is the full unit test suite output (`make integration`) - this new test is not the only one failing on OSX. They all also failed before this test, except that this test seems to trigger a failure in the trigger test that isn't there before (some kind of teardown necessary?). Pun intended.

  arc unit --everything
  /Users/fugalh/src/phab/arcanist/bin/arc
  [2013-10-26 19:34:42] ERROR 8: Undefined index: files at [/Users/fugalh/src/watchman/tests/integration/pcre.php:19]
    #0 pcreTestCase::testPCRE()
    #1 call_user_func_array(Array of size 2 starting with: { 0 => Object pcreTestCase }, Array ) called at [/Users/fugalh/src/phab/arcanist/src/unit/engine/phutil/ArcanistPhutilTestCase.php:445]
    #2 ArcanistPhutilTestCase::run() called at [/Users/fugalh/src/watchman/arcanist/lib/WatchmanIntegrationEngine.php:68]
    #3 WatchmanIntegrationEngine::runIntegrationTests() called at [/Users/fugalh/src/watchman/arcanist/lib/WatchmanIntegrationEngine.php:10]
    #4 WatchmanIntegrationEngine::run() called at [/Users/fugalh/src/phab/arcanist/src/workflow/ArcanistUnitWorkflow.php:173]
    #5 ArcanistUnitWorkflow::run() called at [/Users/fugalh/src/phab/arcanist/scripts/arcanist.php:318]
  [2013-10-26 19:34:42] ERROR 8: Undefined index: files at [/Users/fugalh/src/watchman/tests/integration/pcre.php:19]
    #0 pcreTestCase::testPCRE()
    #1 call_user_func_array(Array of size 2 starting with: { 0 => Object pcreTestCase }, Array ) called at [/Users/fugalh/src/phab/arcanist/src/unit/engine/phutil/ArcanistPhutilTestCase.php:445]
    #2 ArcanistPhutilTestCase::run() called at [/Users/fugalh/src/watchman/arcanist/lib/WatchmanIntegrationEngine.php:73]
    #3 WatchmanIntegrationEngine::runIntegrationTests() called at [/Users/fugalh/src/watchman/arcanist/lib/WatchmanIntegrationEngine.php:10]
    #4 WatchmanIntegrationEngine::run() called at [/Users/fugalh/src/phab/arcanist/src/workflow/ArcanistUnitWorkflow.php:173]
    #5 ArcanistUnitWorkflow::run() called at [/Users/fugalh/src/phab/arcanist/scripts/arcanist.php:318]
  [2013-10-26 19:34:43] ERROR 8: Undefined index: files at [/Users/fugalh/src/watchman/tests/integration/pcre.php:19]
    #0 pcreTestCase::testPCRE()
    #1 call_user_func_array(Array of size 2 starting with: { 0 => Object pcreTestCase }, Array ) called at [/Users/fugalh/src/phab/arcanist/src/unit/engine/phutil/ArcanistPhutilTestCase.php:445]
    #2 ArcanistPhutilTestCase::run() called at [/Users/fugalh/src/watchman/arcanist/lib/WatchmanIntegrationEngine.php:73]
    #3 WatchmanIntegrationEngine::runIntegrationTests() called at [/Users/fugalh/src/watchman/arcanist/lib/WatchmanIntegrationEngine.php:10]
    #4 WatchmanIntegrationEngine::run() called at [/Users/fugalh/src/phab/arcanist/src/workflow/ArcanistUnitWorkflow.php:173]
    #5 ArcanistUnitWorkflow::run() called at [/Users/fugalh/src/phab/arcanist/scripts/arcanist.php:318]
     PASS   <1ms★  argv.t
     PASS    1ms★  bser.t
     PASS   <1ms★  log.t
     FAIL  testBigPCRE
  Assertion failed at line 48 in bigpcre.php: got useful message: failed to parse query: unknown expression term 'pcre'
  Expected: true
    Actual: false
     FAIL  testBigPCRE [CLI: bser/json]
  Assertion failed at line 48 in bigpcre.php: got useful message: failed to parse query: unknown expression term 'pcre'
  Expected: true
    Actual: false
     FAIL  testBigPCRE [CLI: json/json]
  Assertion failed at line 48 in bigpcre.php: got useful message: failed to parse query: unknown expression term 'pcre'
  Expected: true
    Actual: false
     PASS    7ms★  testBSDishTopLevel
     PASS   34ms★  testBSDishTopLevel [CLI: bser/json]
     PASS   35ms★  testBSDishTopLevel [CLI: json/json]
     PASS    2ms★  testClock
     PASS   12ms★  testClock [CLI: bser/json]
     PASS   13ms★  testClock [CLI: json/json]
     PASS    8ms★  testCursor
     PASS   74ms   testCursor [CLI: bser/json]
     PASS   73ms   testCursor [CLI: json/json]
     PASS    5ms★  testEmpty
     PASS   33ms★  testEmpty [CLI: bser/json]
     PASS   34ms★  testEmpty [CLI: json/json]
     PASS    4ms★  testFields
     PASS   21ms★  testFields [CLI: bser/json]
     PASS   22ms★  testFields [CLI: json/json]
     PASS   10ms★  testFind
     PASS   26ms★  testFind [CLI: bser/json]
     PASS   25ms★  testFind [CLI: json/json]
     PASS   16ms★  testFishy
     PASS    3ms★  testIgnoreGit
     PASS    9ms★  testIgnoreGit [CLI: bser/json]
     PASS    9ms★  testIgnoreGit [CLI: json/json]
     PASS   <1ms★  testSockName
     PASS    6ms★  testSockName [CLI: bser/json]
     PASS    6ms★  testSockName [CLI: json/json]
     PASS    3ms★  testInvalidSyncTimeout
     PASS    1ms★  testInvalidExprTerm
     PASS   19ms★  testInvalidSyncTimeout [CLI: bser/json]
     PASS   13ms★  testInvalidExprTerm [CLI: bser/json]
     PASS   13ms★  testInvalidExprTerm [CLI: json/json]
     PASS   19ms★  testInvalidSyncTimeout [CLI: json/json]
     PASS   <1ms★  testInvalidRoot
     PASS    6ms★  testInvalidRoot [CLI: bser/json]
     PASS    6ms★  testInvalidRoot [CLI: json/json]
     FAIL  testModify
  Assertion failed at line 27 in modified.php
  Expected vs Actual Output Diff
  --- Old Value
  +++ New Value
  @@ -1,5 +1 @@
  -Array
  -(
  -    [0] => foo/111
  -)
  -
  +null

     PASS   40ms★  testEvenMoreMoves
     PASS   27ms★  testMoreMoves
     PASS   44ms★  testMoreMoves [CLI: bser/json]
     PASS   57ms   testEvenMoreMoves [CLI: bser/json]
     PASS   57ms   testEvenMoreMoves [CLI: json/json]
     PASS   44ms★  testMoreMoves [CLI: json/json]
     PASS    9ms★  testMoveReAdd
     PASS   83ms   testMoveReAdd [CLI: bser/json]
     PASS   82ms   testMoveReAdd [CLI: json/json]
     PASS    7ms★  testNameExpr
     PASS   79ms   testNameExpr [CLI: bser/json]
     PASS   82ms   testNameExpr [CLI: json/json]
     PASS    5ms★  testOtherCookies
     PASS   11ms★  testOtherCookies [CLI: bser/json]
     PASS   10ms★  testOtherCookies [CLI: json/json]
     FAIL  testPCRE
  Assertion failed at line 19 in pcre.php
  Expected: foo.c
    Actual: null
     FAIL  testPCRE [CLI: bser/json]
  Assertion failed at line 19 in pcre.php
  Expected: foo.c
    Actual: null
     FAIL  testPCRE [CLI: json/json]
  Assertion failed at line 19 in pcre.php
  Expected: foo.c
    Actual: null
     PASS   20ms★  testRemove
     PASS   43ms★  testRemove [CLI: bser/json]
     PASS   41ms★  testRemove [CLI: json/json]
     PASS    4ms★  testRemoveRoot
     PASS   10ms★  testRemoveRoot [CLI: bser/json]
     PASS    9ms★  testRemoveRoot [CLI: json/json]
     PASS   20ms★  testRootRestrict
     PASS   62ms   testRootRestrict [CLI: bser/json]
     PASS   61ms   testRootRestrict [CLI: json/json]
     PASS    8ms★  testSinceExpr
     PASS   64ms   testSinceExpr [CLI: bser/json]
     PASS   64ms   testSinceExpr [CLI: json/json]
     PASS    6ms★  testSinceIssue2
     PASS    4ms★  testRecrawlFreshInstance
     PASS    8ms★  testSinceFreshInstance
     PASS    5ms★  testReaddWatchFreshInstance
     PASS    6ms★  testSinceIssue1
     PASS   64ms   testSinceFreshInstance [CLI: bser/json]
     PASS   23ms★  testSinceIssue1 [CLI: bser/json]
     PASS   27ms★  testRecrawlFreshInstance [CLI: bser/json]
     PASS   33ms★  testReaddWatchFreshInstance [CLI: bser/json]
     PASS   59ms   testSinceIssue2 [CLI: bser/json]
     PASS   28ms★  testRecrawlFreshInstance [CLI: json/json]
     PASS   57ms   testSinceIssue2 [CLI: json/json]
     PASS   64ms   testSinceFreshInstance [CLI: json/json]
     PASS   34ms★  testReaddWatchFreshInstance [CLI: json/json]
     PASS   22ms★  testSinceIssue1 [CLI: json/json]
     PASS  210ms   testSubscribe
     PASS    6ms★  testSuffixExpr
     PASS   23ms★  testSuffixExpr [CLI: bser/json]
     PASS   20ms★  testSuffixExpr [CLI: json/json]
     FAIL  testTrigger
  EXCEPTION (Exception): expected a log response, got {"version":"2.8.2","clock":"c:1382816076:60845:86:4","since":"c:1382816076:60845:82:1","is_fresh_instance":true,"files":["bar.txt","foo.c"],"subscription":"test","root":"/private/var/folders/j1/y_xh789x5mj2dgc_k76mtl_r47sbp0/T/2646poi2hjlwcgc0"}

  #0 /Users/fugalh/src/watchman/arcanist/lib/WatchmanTestCase.php(126): WatchmanInstance->waitForLog('/posix_spawnp/', 60)
  #1 /Users/fugalh/src/watchman/arcanist/lib/WatchmanTestCase.php(130): WatchmanTestCase->waitForLog('/posix_spawnp/', 60)
  #2 /Users/fugalh/src/watchman/tests/integration/trigger.php(86): WatchmanTestCase->assertWaitForLog('/posix_spawnp/', 60)
  #3 [internal function]: triggerTestCase->testTrigger()
  #4 /Users/fugalh/src/phab/arcanist/src/unit/engine/phutil/ArcanistPhutilTestCase.php(445): call_user_func_array(Array, Array)
  #5 /Users/fugalh/src/watchman/arcanist/lib/WatchmanIntegrationEngine.php(68): ArcanistPhutilTestCase->run()
  #6 /Users/fugalh/src/watchman/arcanist/lib/WatchmanIntegrationEngine.php(10): WatchmanIntegrationEngine->runIntegrationTests()
  #7 /Users/fugalh/src/phab/arcanist/src/workflow/ArcanistUnitWorkflow.php(173): WatchmanIntegrationEngine->run()
  #8 /Users/fugalh/src/phab/arcanist/scripts/arcanist.php(318): ArcanistUnitWorkflow->run()
  #9 {main}
     PASS   1.0s   testTwoDeep
     PASS   1.1s   testTwoDeep [CLI: bser/json]
     PASS   1.0s   testTwoDeep [CLI: json/json]
     PASS    4ms★  testTypeExpr
     PASS   33ms★  testTypeExpr [CLI: bser/json]
     PASS   31ms★  testTypeExpr [CLI: json/json]
  make: *** [integration] Error 2

Reviewers: wez, sid0

CC: net-systems@

Differential Revision: https://phabricator.fb.com/D1029383
wez added a commit that referenced this issue Apr 26, 2015
Summary:
looking at #96 (comment)

we see a trace like:

```
thread #3: tid = 0x16a58c, 0x000000010438cdc0 watchman`w_ht_del
frame #0: 0x000000010438cdc0 watchman`w_ht_del
frame #1: 0x000000010439aa40 watchman`age_out_dir + 144
frame #2: 0x000000010439aa28 watchman`age_out_dir + 120
frame #3: 0x0000000104398738 watchman`age_out_file + 248
frame #4: 0x000000010439a9ee watchman`age_out_dir + 62
frame #5: 0x0000000104398738 watchman`age_out_file + 248
frame #6: 0x000000010439a9ee watchman`age_out_dir + 62
frame #7: 0x0000000104398738 watchman`age_out_file + 248
frame #8: 0x000000010439a9ee watchman`age_out_dir + 62
frame #9: 0x0000000104398738 watchman`age_out_file + 248
frame #10: 0x000000010439a9ee watchman`age_out_dir + 62
frame #11: 0x0000000104398738 watchman`age_out_file + 248
frame #12: 0x000000010439a9ee watchman`age_out_dir + 62
frame #13: 0x0000000104398738 watchman`age_out_file + 248
frame #14: 0x000000010439a9ee watchman`age_out_dir + 62
frame #15: 0x0000000104398738 watchman`age_out_file + 248
frame #16: 0x000000010439a9ee watchman`age_out_dir + 62
frame #17: 0x0000000104398738 watchman`age_out_file + 248
```

In cases where we have a large number of files that are eligible to age
out across a reasonable number of dirs, we can walk the same portions of
the tree we maintain in memory multiple times.

This diff records the names of the dir nodes for which we would previously
recursively call age_out_dir(); we use a hash to unique the names.

After we have walked all eligible file nodes we then iterate the list
of saved dirs and delete them.

This reduces the complexity of age out processing to a single scan of the file
node list followed by a single scan of the list of dirs that we accumulated in
the file node scan.

Test Plan:
`arc unit tests/integration/age.php` exercises this specifically.  also: `make integration`

a more rigorous test on our www repo:

```
hg up -C master~40000 ; sleep 10 ; hg up -C master ; sleep 10 ; time watchman debug-ageout . 1
hg up -C master~40000 ; sleep 10 ; hg up -C master ; sleep 10 ; time watchman debug-ageout . 1
```

(it's important to run it twice in succession because we had a long
standing bug that we hadn't noticed until now!)

The time portion of this outputs:

```
watchman debug-ageout . 1  0.00s user 0.00s system 2% cpu 0.157 total
```

which is a good bit better than 20+ minutes.

Refs: #96

Reviewers: sid0

Reviewed By: sid0

Differential Revision: https://reviews.facebook.net/D37683
wez added a commit that referenced this issue Aug 20, 2015
Summary:
here's an example of the deadlock:

```
Thread 53 (Thread 0x7f3f17fff700 (LWP 3247723)):
 #0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
 #1  0x00007f3f1dadbf2c in __GI___pthread_mutex_lock (mutex=mutex@entry=0x68f8e0 <root_lock>) at ../nptl/pthread_mutex_lock.c:80
 #2  0x0000000000417b66 in remove_root_from_watched (root=0x692380) at root.c:1872
 #3  w_root_stop_watch (root=root@entry=0x692380) at root.c:2251
 #4  0x000000000041854b in consider_reap (root=0x692380) at root.c:1601
 #5  notify_thread (root=0x692380) at root.c:1656
 #6  run_notify_thread (arg=0x692380) at root.c:2148
 #7  0x00007f3f1dad97f1 in start_thread (arg=0x7f3f17fff700) at pthread_create.c:310
 #8  0x00007f3f1d81046d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 52 (Thread 0x7f3f175fd700 (LWP 3247724)):
 #0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
 #1  0x00007f3f1dadbfb0 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x692398) at ../nptl/pthread_mutex_lock.c:115
 #2  0x0000000000415255 in w_root_lock (root=root@entry=0x692380) at root.c:313
 #3  0x0000000000418bdc in w_root_save_state (state=state@entry=0x7f3f080480d0) at root.c:2444
 #4  0x0000000000418ff8 in w_state_save () at state.c:98
 #5  0x0000000000417bef in w_root_stop_watch (root=root@entry=0x693780) at root.c:2255
 #6  0x000000000041854b in consider_reap (root=0x693780) at root.c:1601
 #7  notify_thread (root=0x693780) at root.c:1656
 #8  run_notify_thread (arg=0x693780) at root.c:2148
 #9  0x00007f3f1dad97f1 in start_thread (arg=0x7f3f175fd700) at pthread_create.c:310
 #10 0x00007f3f1d81046d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
```

Test Plan:
augmented an integration test, but I couldn't get this to trigger on demand :-/
In the problem case, both watches were established from the state file and aged at at
the exact same time.

Reviewers: sid0

Differential Revision: https://reviews.facebook.net/D44841
facebook-github-bot pushed a commit that referenced this issue Oct 7, 2022
Summary:
We have seen deadlock running `terminationHandler` -> `hasSubscribers` in 2 threads.
It's unclear which other thread is holding the lock.

To make things easier to debug next time, let's change terminationHandler (and
also main.cpp) to bypass the logging lock and write to stderr directly.

Related threads (all threads in P536343453):

  Thread 11 (LWP 3275661):
  #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  #1  0x0000000001cc995b in folly::detail::(anonymous namespace)::nativeFutexWaitImpl (addr=<optimized out>, expected=<optimized out>, absSystemTime=<optimized out>, absSteadyTime=<optimized out>, waitMask=<optimized out>) at fbcode/folly/detail/Futex.cpp:126
  #2  folly::detail::futexWaitImpl (futex=0x89, futex@entry=0x7f1c3ac2ef90, expected=994748889, absSystemTime=absSystemTime@entry=0x0, absSteadyTime=<optimized out>, absSteadyTime@entry=0x0, waitMask=waitMask@entry=1) at fbcode/folly/detail/Futex.cpp:254
  #3  0x0000000001d34bce in folly::detail::futexWait<std::atomic<unsigned int> > (futex=0x7f1c3ac2ef90, expected=137, waitMask=1) at buck-out/v2/gen/fbcode/110b607930331a92/folly/detail/__futex__/headers/folly/detail/Futex-inl.h:96
  #4  folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::WaitForever::doWait (this=<optimized out>, futex=..., expected=137, waitMask=1) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__shared_mutex__/headers/folly/SharedMutex.h:718
  #5  folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::futexWaitForZeroBits<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::WaitForever> (this=0x7f1c3ac2ef90, state=@0x7f1c149f88e4: 118379409, goal=128, waitMask=1, ctx=...) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__shared_mutex__/headers/folly/SharedMutex.h:1184
  #6  0x0000000001cd42b2 in folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::yieldWaitForZeroBits<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::WaitForever> (this=0x7f1c3ac2ef90, state=@0x7f1c149f88e4: 118379409, goal=128, waitMask=1, ctx=...) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__shared_mutex__/headers/folly/SharedMutex.h:1151
  #7  folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::waitForZeroBits<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::WaitForever> (this=0x7f1c3ac2ef90, state=@0x7f1c149f88e4: 118379409, goal=128, waitMask=1, ctx=...) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__shared_mutex__/headers/folly/SharedMutex.h:1109
  #8  0x0000000001e7e14c in folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::lockSharedImpl<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::WaitForever> (this=0x7f1c3ac2ef90, state=@0x7f1c149f88e4: 118379409, token=0x0, ctx=...) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__shared_mutex__/headers/folly/SharedMutex.h:1664
  #9  folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::lockSharedImpl<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::WaitForever> (this=0x7f1c3ac2ef90, token=0x0, ctx=...) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__shared_mutex__/headers/folly/SharedMutex.h:1356
  #10 folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::lock_shared (this=0x7f1c3ac2ef90) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__shared_mutex__/headers/folly/SharedMutex.h:495
  #11 std::shared_lock<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault> >::shared_lock (this=<optimized out>, __m=...) at fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/shared_mutex:727
  #12 0x0000000002d765fd in folly::LockedPtr<folly::Synchronized<watchman::Publisher::state, folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault> > const, folly::detail::SynchronizedLockPolicy<(folly::detail::SynchronizedMutexLevel)2, (folly::detail::SynchronizedMutexMethod)0> >::doLock<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>, std::shared_lock<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault> >, folly::detail::SynchronizedLockPolicy<(folly::detail::SynchronizedMutexLevel)2, (folly::detail::SynchronizedMutexMethod)0>, 0> (mutex=...) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__synchronized__/headers/folly/Synchronized.h:1493
  #13 folly::LockedPtr<folly::Synchronized<watchman::Publisher::state, folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault> > const, folly::detail::SynchronizedLockPolicy<(folly::detail::SynchronizedMutexLevel)2, (folly::detail::SynchronizedMutexMethod)0> >::LockedPtr (this=0x7f1c149f8928, parent=<optimized out>) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__synchronized__/headers/folly/Synchronized.h:1272
  #14 folly::SynchronizedBase<folly::Synchronized<watchman::Publisher::state, folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault> >, (folly::detail::SynchronizedMutexLevel)2>::rlock (this=<optimized out>) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__synchronized__/headers/folly/Synchronized.h:229
  #15 watchman::Publisher::hasSubscribers (this=<optimized out>) at fbcode/watchman/PubSub.cpp:117
  #16 0x0000000002eca798 in watchman::Log::log<char const (&) [39], char const*, char const (&) [3]> (this=<optimized out>, level=level@entry=watchman::ABORT, args=..., args=..., args=...) at buck-out/v2/gen/fbcode/110b607930331a92/watchman/__logging__/headers/watchman/Logging.h:42
  #17 0x0000000002ec9ba7 in watchman::log<char const (&) [39], char const*, char const (&) [3]> (level=watchman::ABORT, args=..., args=..., args=...) at buck-out/v2/gen/fbcode/110b607930331a92/watchman/__logging__/headers/watchman/Logging.h:121
  #18 (anonymous namespace)::terminationHandler () at fbcode/watchman/SignalHandler.cpp:159
  #19 0x00007f1c3b0c7b3a in __cxxabiv1::__terminate (handler=<optimized out>) at ../../.././libstdc++-v3/libsupc++/eh_terminate.cc:48
  #20 0x00007f1c3b0c7ba5 in std::terminate () at ../../.././libstdc++-v3/libsupc++/eh_terminate.cc:58
  #21 0x0000000001c38c8b in __clang_call_terminate ()
  #22 0x0000000003284c9e in folly::detail::terminate_with_<std::runtime_error, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&> (args=...) at buck-out/v2/gen/fbcode/110b607930331a92/folly/lang/__exception__/headers/folly/lang/Exception.h:93
  #23 0x0000000003281bae in folly::terminate_with<std::runtime_error, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&> (args=...) at buck-out/v2/gen/fbcode/110b607930331a92/folly/lang/__exception__/headers/folly/lang/Exception.h:123
  #24 folly::SingletonVault::fireShutdownTimer (this=<optimized out>) at fbcode/folly/Singleton.cpp:499
  #25 0x0000000003281ad9 in folly::(anonymous namespace)::fireShutdownSignalHelper (sigval=...) at fbcode/folly/Singleton.cpp:454
  #26 0x00007f1c3b42b939 in timer_sigev_thread (arg=<optimized out>) at ../sysdeps/unix/sysv/linux/timer_routines.c:55
  #27 0x00007f1c3b41fc0f in start_thread (arg=<optimized out>) at pthread_create.c:434
  #28 0x00007f1c3b4b21dc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

  ...

  Thread 1 (LWP 3201992):
  #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  #1  0x0000000001cc995b in folly::detail::(anonymous namespace)::nativeFutexWaitImpl (addr=<optimized out>, expected=<optimized out>, absSystemTime=<optimized out>, absSteadyTime=<optimized out>, waitMask=<optimized out>) at fbcode/folly/detail/Futex.cpp:126
  #2  folly::detail::futexWaitImpl (futex=0x89, futex@entry=0x7f1c3ac2ef90, expected=994748889, absSystemTime=absSystemTime@entry=0x0, absSteadyTime=<optimized out>, absSteadyTime@entry=0x0, waitMask=waitMask@entry=1) at fbcode/folly/detail/Futex.cpp:254
  #3  0x0000000001d34bce in folly::detail::futexWait<std::atomic<unsigned int> > (futex=0x7f1c3ac2ef90, expected=137, waitMask=1) at buck-out/v2/gen/fbcode/110b607930331a92/folly/detail/__futex__/headers/folly/detail/Futex-inl.h:96
  #4  folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::WaitForever::doWait (this=<optimized out>, futex=..., expected=137, waitMask=1) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__shared_mutex__/headers/folly/SharedMutex.h:718
  #5  folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::futexWaitForZeroBits<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::WaitForever> (this=0x7f1c3ac2ef90, state=@0x7ffd2d5be924: 118379408, goal=128, waitMask=1, ctx=...) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__shared_mutex__/headers/folly/SharedMutex.h:1184
  #6  0x0000000001cd42b2 in folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::yieldWaitForZeroBits<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::WaitForever> (this=0x7f1c3ac2ef90, state=@0x7ffd2d5be924: 118379408, goal=128, waitMask=1, ctx=...) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__shared_mutex__/headers/folly/SharedMutex.h:1151
  #7  folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::waitForZeroBits<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::WaitForever> (this=0x7f1c3ac2ef90, state=@0x7ffd2d5be924: 118379408, goal=128, waitMask=1, ctx=...) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__shared_mutex__/headers/folly/SharedMutex.h:1109
  #8  0x0000000001e7e14c in folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::lockSharedImpl<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::WaitForever> (this=0x7f1c3ac2ef90, state=@0x7ffd2d5be924: 118379408, token=0x0, ctx=...) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__shared_mutex__/headers/folly/SharedMutex.h:1664
  #9  folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::lockSharedImpl<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::WaitForever> (this=0x7f1c3ac2ef90, token=0x0, ctx=...) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__shared_mutex__/headers/folly/SharedMutex.h:1356
  #10 folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>::lock_shared (this=0x7f1c3ac2ef90) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__shared_mutex__/headers/folly/SharedMutex.h:495
  #11 std::shared_lock<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault> >::shared_lock (this=<optimized out>, __m=...) at fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/shared_mutex:727
  #12 0x0000000002d765fd in folly::LockedPtr<folly::Synchronized<watchman::Publisher::state, folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault> > const, folly::detail::SynchronizedLockPolicy<(folly::detail::SynchronizedMutexLevel)2, (folly::detail::SynchronizedMutexMethod)0> >::doLock<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault>, std::shared_lock<folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault> >, folly::detail::SynchronizedLockPolicy<(folly::detail::SynchronizedMutexLevel)2, (folly::detail::SynchronizedMutexMethod)0>, 0> (mutex=...) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__synchronized__/headers/folly/Synchronized.h:1493
  #13 folly::LockedPtr<folly::Synchronized<watchman::Publisher::state, folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault> > const, folly::detail::SynchronizedLockPolicy<(folly::detail::SynchronizedMutexLevel)2, (folly::detail::SynchronizedMutexMethod)0> >::LockedPtr (this=0x7ffd2d5be968, parent=<optimized out>) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__synchronized__/headers/folly/Synchronized.h:1272
  #14 folly::SynchronizedBase<folly::Synchronized<watchman::Publisher::state, folly::SharedMutexImpl<false, void, std::atomic, folly::SharedMutexPolicyDefault> >, (folly::detail::SynchronizedMutexLevel)2>::rlock (this=<optimized out>) at buck-out/v2/gen/fbcode/110b607930331a92/folly/__synchronized__/headers/folly/Synchronized.h:229
  #15 watchman::Publisher::hasSubscribers (this=<optimized out>) at fbcode/watchman/PubSub.cpp:117
  #16 0x0000000002ecac20 in watchman::Log::log<char const (&) [59]> (this=<optimized out>, level=level@entry=watchman::ABORT, args=...) at buck-out/v2/gen/fbcode/110b607930331a92/watchman/__logging__/headers/watchman/Logging.h:42
  #17 0x0000000002ec9b24 in watchman::log<char const (&) [59]> (level=watchman::ABORT, args=...) at buck-out/v2/gen/fbcode/110b607930331a92/watchman/__logging__/headers/watchman/Logging.h:121
  #18 (anonymous namespace)::terminationHandler () at fbcode/watchman/SignalHandler.cpp:165
  #19 0x00007f1c3b0c7b3a in __cxxabiv1::__terminate (handler=<optimized out>) at ../../.././libstdc++-v3/libsupc++/eh_terminate.cc:48
  #20 0x00007f1c3b0c7ba5 in std::terminate () at ../../.././libstdc++-v3/libsupc++/eh_terminate.cc:58
  #21 0x0000000002d8cde1 in std::thread::~thread (this=0x7f1c3ac2ef90) at fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/std_thread.h:152
  #22 0x00007f1c3b3cc8f8 in __run_exit_handlers (status=1, listp=0x7f1c3b598658 <__exit_funcs>, run_list_atexit=<optimized out>, run_dtors=<optimized out>) at exit.c:113
  #23 0x00007f1c3b3cca0a in __GI_exit (status=<optimized out>) at exit.c:143
  #24 0x00007f1c3b3b165e in __libc_start_call_main (main=0x2d11220 <main(int, char**)>, argc=2, argv=0x7ffd2d5bec78) at ../sysdeps/nptl/libc_start_call_main.h:74
  #25 0x00007f1c3b3b1718 in __libc_start_main_impl (main=0x2d11220 <main(int, char**)>, argc=2, argv=0x7ffd2d5bec78, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffd2d5bec68) at ../csu/libc-start.c:409
  #26 0x0000000002d0e181 in _start () at ../sysdeps/x86_64/start.S:116

Reviewed By: xavierd

Differential Revision: D40166374

fbshipit-source-id: 7017e20234e5e0a9532eb61a63ac49ac0020d443
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants