terminated by signal SIGSEGV (Address boundary error) #72

Closed
talbergs opened this issue Dec 12, 2020 · 17 comments · Fixed by #81

Comments

@talbergs

I am no expert in Node.js. This happens to me when running larger queries. Is this a bug, or does it mean I need some special Node.js configuration?

@talbergs
Author

The last few lines of a node --trace ./app.js run:

   7:       ~get+0(this=0x2ddfdaa47eb9 <Object map = 0xcf571460d61>) {
   7:       } -> 0x0f3686113a21 <Object map = 0xcf57145bd81>
   7:       ~query+0(this=0x0f0cf6de6239 <MainContext map = 0xcf571467019>, 0x1f06e37f4519 <String[#3]: php>, 0x0136007fb1b1 <String[#342]\: \n      (\n        expression_statement (\n          assignment_expression\n          left: (variable_name (name) @var-name)\n        )\n      )\n      (\n        function_call_expression\n        function: (\n          qualified_name (name) @fn-name\n        )\n        arguments: (\n          arguments (variable_name (name) @paa)\n        )\n      )\n    >) {
   8:        ~getSyntax+0(this=0x0f0cf6de6239 <MainContext map = 0xcf571467019>, 0x1f06e37f4519 <String[#3]: php>) {
   9:         ~getSource+0(this=0x0f0cf6de5e09 <FileContext map = 0xcf571466e69>) {
  10:          ~getBuffer+0(this=0x0f0cf6de5e09 <FileContext map = 0xcf571466e69>) {
  10:          } -> 0x0f0cf6de5cb9 <Uint8Array map = 0x39d010521199>
  10:          ~toString+3(this=0x0f0cf6de5cb9 <Uint8Array map = 0x39d010521199>, 0x0c0aaa780471 <undefined>, 0x0c0aaa780471 <undefined>, 0x0c0aaa780471 <undefined>) {
  10:          } -> 0x0f0cf6de65e1 <String[169]\: <?php\n$fields = [\n"000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000"\n];\ncheck_fields($fields);>
   9:         } -> 0x0f0cf6de65e1 <String[169]\: <?php\n$fields = [\n"000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000"\n];\ncheck_fields($fields);>
   9:         new ~Parser+0(this=0x0f0cf6de6859 <Parser map = 0xcf571467061>, 0x0f0cf6de65e1 <String[169]\: <?php\n$fields = [\n"000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000"\n];\ncheck_fields($fields);>, 0x0f0cf6dca499 <Object map = 0xcf571465ea9>) {
  10:          ~loadChain+0(this=0x0f0cf6de6859 <Parser map = 0xcf5714671c9>, 0x0f0cf6dca499 <Object map = 0xcf571465ea9>, 0x0c0aaa780471 <undefined>) {
  11:           ~parse+0(this=0x0f0cf6de6859 <Parser map = 0xcf5714671c9>, 0x1f06e37f4519 <String[#3]: php>, 0x0c0aaa780471 <undefined>) {
  12:            ~Parser.setLanguage+0(this=0x0f0cf6de68e9 <Parser map = 0xcf571457461>, 0x0f36861160c9 <Language map = 0xcf571459ae9>) {
  13:             ~initializeLanguageNodeClasses+20(this=0x3ac530682409 <JSGlobal Object>, 0x0f36861160c9 <Language map = 0xcf571459ae9>) {
  13:             } -> 0x0c0aaa780471 <undefined>
  12:            } -> 0x0f0cf6de68e9 <Parser map = 0xcf571467211>
  12:            ~Parser.parse+0(this=0x0f0cf6de68e9 <Parser map = 0xcf571467211>, 0x0f0cf6de65e1 <String[169]\: <?php\n$fields = [\n"000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000"\n];\ncheck_fields($fields);>, 0x0c0aaa780471 <undefined>, 0x0c0aaa780471 <undefined>) {
  13:             ~input+0(this=0x3ac530682409 <JSGlobal Object>, 0, 0x0f0cf6df6e39 <Object map = 0xcf571467331>) {
  13:             } -> 0x0f0cf6de65e1 <String[169]\: <?php\n$fields = [\n"000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000"\n];\ncheck_fields($fields);>
  13:             ~input+0(this=0x3ac530682409 <JSGlobal Object>, 169, 0x0f0cf6df6ee9 <Object map = 0xcf571467331>) {
  13:             } -> 0x0c0aaa7806d1 <String[#0]: >
  13:             ~Parser.getLanguage+0(this=0x0f0cf6de68e9 <Parser map = 0xcf571467211>, 0x0c0aaa780471 <undefined>) {
  13:             } -> 0x0f36861160c9 <Language map = 0xcf5714672a1>
  12:            } -> 0x0f0cf6df6f41 <Tree map = 0xcf571467451>
  11:           } -> 0x0f0cf6df6f41 <Tree map = 0xcf571467451>
  10:          } -> 0x0c0aaa780471 <undefined>
   9:         } -> 0x0c0aaa780471 <undefined>
   9:         ~get+0(this=0x0f0cf6de6859 <Parser map = 0xcf5714671c9>, 0x1f06e37f4519 <String[#3]: php>) {
  10:          ~get+0(this=0x0f0cf6df6f41 <Tree map = 0xcf571467451>) {
  11:           ~unmarshalNode+0(this=0x3ac530682409 <JSGlobal Object>, 154, 0x0f0cf6df6f41 <Tree map = 0xcf571467451>, 0x0c0aaa780471 <undefined>, 0x0c0aaa780471 <undefined>) {
  12:            ~getID+0(this=0x3ac530682409 <JSGlobal Object>, 0x21018cdb07d1 <Uint32Array map = 0x39d010500f31>, 0) {
  12:            } -> 0x0f0cf6df72c1 <BigInt 93906111597824>
  12:            new ~SyntaxNode+0(this=0x0f0cf6df72e1 <SyntaxNode map = 0xcf5714674e1>, 0x0f0cf6df6f41 <Tree map = 0xcf571467451>) {
  12:            } -> 0x0c0aaa780471 <undefined>
  11:           } -> 0x0f0cf6df72e1 <SyntaxNode map = 0xcf571467571>
  10:          } -> 0x0f0cf6df72e1 <SyntaxNode map = 0xcf571467571>
   9:         } -> 0x0f0cf6df72e1 <SyntaxNode map = 0xcf571467571>
   8:        } -> 0x0f0cf6df72e1 <SyntaxNode map = 0xcf571467571>
   8:        ~Query._init+0(this=0x0f0cf6df7bf1 <Query map = 0xcf571457971>) {

@maxbrunsfeld
Contributor

Thanks for the report. I think it’s a bug. What language are you parsing? Is the grammar open source? It’d be great to get a reproducible script that causes this.

@talbergs
Author

talbergs commented Dec 16, 2020

I am parsing PHP using this grammar:

tree-sitter-php@^0.16.2:
  version "0.16.2"
  resolved "https://registry.yarnpkg.com/tree-sitter-php/-/tree-sitter-php-0.16.2.tgz#15c48dbd44cc56c4660d48ef883c9fc0f3e0d35b"
  integrity sha512-BkewhybED1xRQkDpmXkjpBZ1OdnWcmjyu3tGRsoiFqbeNeDtWxac2wzpBdkQr+aSUKlJoCkpBFytM1KcQa5SoA==
  dependencies:
    nan "^2.14.0"

[EDIT] For the record, tree-sitter-node version in use:

tree-sitter@^0.17.1:
  version "0.17.1"
  resolved "https://registry.yarnpkg.com/tree-sitter/-/tree-sitter-0.17.1.tgz#821c5a4ac1afdb623d63f5ffc7916663e732a95c"
  integrity sha512-obIe804bwfAGFMhTjQz0NXF75GDupCVXo7Sv0NVVdA3s/Q4ZI4mdirIN8cpw6bVhz/K1qgUdEuI3SEoOE/q75A==
  dependencies:
    nan "^2.14.0"
    prebuild-install "^5.0.0"
> node --version
v14.8.0

@talbergs
Author

This snippet currently reproduces the error. Delete the last element from the "keywords" array and the error goes away.

const Parser = require('tree-sitter')
const PHP = require('tree-sitter-php')

const parser = new Parser()
parser.setLanguage(PHP)

const tree = parser.parse('<?php //')

const keywords = [
  'empty_statement',
  'named_label_statement',
  'expression_statement',
  'if_statement',
  'switch_statement',
  'while_statement',
  'do_statement',
  'for_statement',
  'foreach_statement',
  'goto_statement',
  'continue_statement',
  'break_statement',
];

const query = keywords.reduce((prev, curr) => {
  return prev + `(${curr}) @statement`
}, '');

(new Parser.Query(PHP, query)).matches(tree.rootNode)

@maxbrunsfeld
Contributor

I can't reproduce the problem using this script. What platform are you on?

@talbergs
Author

> uname -a
Linux hoste 5.8.3-arch1-1 #1 SMP PREEMPT Fri, 21 Aug 2020 16:54:16 +0000 x86_64 GNU/Linux

Maybe try a somewhat larger query on your machine, for example:

const keywords = [
	'empty_statement',
	'compound_statement',
	'named_label_statement',
	'expression_statement',
	'if_statement',
	'switch_statement',
	'while_statement',
	'do_statement',
	'for_statement',
	'foreach_statement',
	'goto_statement',
	'continue_statement',
	'break_statement',
	'return_statement',
	'throw_statement',
	'try_statement',
	'declare_statement',
	'echo_statement',
	'unset_statement',
	'const_declaration',
	'function_definition',
	'class_declaration',
	'interface_declaration',
	'trait_declaration',
	'namespace_definition',
	'namespace_use_declaration',
	'global_declaration',
	'function_static_declaration',
];

@maxbrunsfeld
Contributor

Hmm, I still can't reproduce it. I also tried repeating the entire keywords list until it was ~500 lines long, and substituting some larger PHP source code for the text. Still runs ok on macOS.

If you get a chance, could you rebuild the tree-sitter module in debug mode, and run this script with a debugger?

To rebuild the module:

npm install -g node-gyp
cd node_modules/tree-sitter
node-gyp rebuild --debug

Then, to run:

lldb node -- test.js

Since the trace ends at Query._init, I would recommend setting a breakpoint in Query::GetPredicates, which is the only native function called by _init:

(lldb) breakpoint set -n Query::GetPredicates
(lldb) run

@talbergs
Author

talbergs commented Dec 17, 2020

Thank you for helping me out!
Here is the debugging session for the snippet from above:

05:11:00, /tmp/preview
> lldb node -- index.js
(lldb) target create "node"
Current executable set to 'node' (x86_64).
(lldb) settings set -- target.run-args  "index.js"
(lldb) breakpoint set -n Query::GetPredicates
Breakpoint 1: no locations (pending).
WARNING:  Unable to resolve breakpoint to any actual locations.
(lldb) run
Process 107097 launched: '/usr/bin/node' (x86_64)
1 location added to breakpoint 1
Process 107097 stopped
* thread #1, name = 'node', stop reason = breakpoint 1.1
    frame #0: 0x00007ffff46678a4 tree_sitter_runtime_binding.node`node_tree_sitter::Query::GetPredicates(info=0x00007fffffffc3d0) at query.cc:145:48
   142 	}
   143 	
   144 	void Query::GetPredicates(const Nan::FunctionCallbackInfo<Value> &info) {
-> 145 	  Query *query = Query::UnwrapQuery(info.This());
   146 	  auto ts_query = query->query_;
   147 	
   148 	  auto pattern_len = ts_query_pattern_count(ts_query);
(lldb) n
Process 107097 stopped
* thread #1, name = 'node', stop reason = step over
    frame #0: 0x00007ffff46678d5 tree_sitter_runtime_binding.node`node_tree_sitter::Query::GetPredicates(info=0x00007fffffffc3d0) at query.cc:146:8
   143 	
   144 	void Query::GetPredicates(const Nan::FunctionCallbackInfo<Value> &info) {
   145 	  Query *query = Query::UnwrapQuery(info.This());
-> 146 	  auto ts_query = query->query_;
   147 	
   148 	  auto pattern_len = ts_query_pattern_count(ts_query);
   149 	
(lldb) n
Process 107097 stopped
* thread #1, name = 'node', stop reason = step over
    frame #0: 0x00007ffff46678e1 tree_sitter_runtime_binding.node`node_tree_sitter::Query::GetPredicates(info=0x00007fffffffc3d0) at query.cc:148:44
   145 	  Query *query = Query::UnwrapQuery(info.This());
   146 	  auto ts_query = query->query_;
   147 	
-> 148 	  auto pattern_len = ts_query_pattern_count(ts_query);
   149 	
   150 	  Local<Array> js_predicates = Nan::New<Array>();
   151 	
(lldb) s
Process 107097 stopped
* thread #1, name = 'node', stop reason = step in
    frame #0: 0x00007ffff46830cc tree_sitter_runtime_binding.node`ts_query_pattern_count(self=0x0000000000000000) at query.c:2077:24
   2074	}
   2075	
   2076	uint32_t ts_query_pattern_count(const TSQuery *self) {
-> 2077	  return self->patterns.size;
   2078	}
   2079	
   2080	uint32_t ts_query_capture_count(const TSQuery *self) {
(lldb) s
Process 107097 stopped
* thread #1, name = 'node', stop reason = signal SIGSEGV: invalid address (fault address: 0x78)
    frame #0: 0x00007ffff46830d0 tree_sitter_runtime_binding.node`ts_query_pattern_count(self=0x0000000000000000) at query.c:2077:24
   2074	}
   2075	
   2076	uint32_t ts_query_pattern_count(const TSQuery *self) {
-> 2077	  return self->patterns.size;
   2078	}
   2079	
   2080	uint32_t ts_query_capture_count(const TSQuery *self) {
(lldb) s
Process 107097 stopped
* thread #1, name = 'node', stop reason = unknown crash reason
    frame #0: 0x00007ffff46830d0 tree_sitter_runtime_binding.node`ts_query_pattern_count(self=0x0000000000000000) at query.c:2077:24
   2074	}
   2075	
   2076	uint32_t ts_query_pattern_count(const TSQuery *self) {
-> 2077	  return self->patterns.size;
   2078	}
   2079	
   2080	uint32_t ts_query_capture_count(const TSQuery *self) {
(lldb) s
Process 107097 exited with status = 11 (0x0000000b) 
(lldb) s
error: invalid thread
(lldb) 

Also, while this issue was open (after you said you could not reproduce it), I updated Node.js from 14 to 15 and clang from 10 to 11.
Since then, the error alternates between the SIGSEGV (more often) and a Query error (less often):

05:15:57, last:139, /tmp/preview
> node index.js
fish: “node index.js” terminated by signal SIGSEGV (Address boundary error)
05:15:58, last:139, /tmp/preview
> node index.js
/tmp/preview/index.js:28
(new Parser.Query(PHP, query)).matches(tree.rootNode)
 ^

Error: Query error of type TSQueryErrorNodeType at position 317
    at Object.<anonymous> (/tmp/preview/index.js:28:2)
    at Module._compile (node:internal/modules/cjs/loader:1108:14)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1137:10)
    at Module.load (node:internal/modules/cjs/loader:973:32)
    at Function.Module._load (node:internal/modules/cjs/loader:813:14)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:76:12)
    at node:internal/main/run_main_module:17:47
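
The position reported in the Query error can be mapped back onto the query text. Below is a minimal sketch, assuming the position is a zero-based offset into the query string (for this ASCII-only query, character and byte offsets coincide). textAt is a hypothetical helper for illustration, not part of tree-sitter.

```javascript
// Rebuild the query string exactly as the 12-keyword reproduction snippet does.
const keywords = [
  'empty_statement',
  'named_label_statement',
  'expression_statement',
  'if_statement',
  'switch_statement',
  'while_statement',
  'do_statement',
  'for_statement',
  'foreach_statement',
  'goto_statement',
  'continue_statement',
  'break_statement',
];

const query = keywords.reduce((prev, curr) => prev + `(${curr}) @statement`, '');

// Hypothetical helper (assumption, not a tree-sitter API): return the
// identifier-like token that starts at `offset`, or the single character
// found there if no such token starts at that offset.
function textAt(source, offset) {
  const match = /^[\w.-]+/.exec(source.slice(offset));
  return match ? match[0] : source.charAt(offset);
}

console.log(query.length);       // 344
console.log(textAt(query, 317)); // break_statement
```

Under that assumption, offset 317 is the start of break_statement, the last entry in the list, which is consistent with the error going away when that entry is removed.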

@talbergs
Author

So should we mark this as a bug?

@cellog

cellog commented Jan 29, 2021

I have been able to reliably reproduce a seg fault by passing a plain string to a query instead of an S-expression, as in new Query(Ruby, "oops");

I added this to the top of my test:

var SegfaultHandler = require('segfault-handler');
SegfaultHandler.registerHandler("crash.log");
➜  beacon-scripts git:(DASHI-677-linter-part-3) ✗ yarn test nDriver
yarn run v1.19.1
$ jest nDriver

 RUNS  packages/i18n/src/drivers/translationDriverTests.test.js
PID 30110 received SIGSEGV for address: 0x78
0   segfault-handler.node               0x00000001046bbfb0 _ZL16segfault_handleriP9__siginfoPv + 304
1   libsystem_platform.dylib            0x00007fff7043a5fd _sigtramp + 29
2   ???                                 0x0000000000000000 0x0 + 0
3   tree_sitter_runtime_binding.node    0x0000000104e627bd _ZN16node_tree_sitter5Query13GetPredicatesERKN3Nan20FunctionCallbackInfoIN2v85ValueEEE + 61
4   tree_sitter_runtime_binding.node    0x0000000104e5974d _ZN3Nan3impL23FunctionCallbackWrapperERKN2v820FunctionCallbackInfoINS1_5ValueEEE + 189
5   node                                0x0000000100257af8 _ZN2v88internal25FunctionCallbackArguments4CallENS0_15CallHandlerInfoE + 616
6   node                                0x000000010025708c _ZN2v88internal12_GLOBAL__N_119HandleApiCallHelperILb0EEENS0_11MaybeHandleINS0_6ObjectEEEPNS0_7IsolateENS0_6HandleINS0_10HeapObjectEEESA_NS8_INS0_20FunctionTemplateInfoEEENS8_IS4_EENS0_16BuiltinArgumentsE + 524
7   node                                0x00000001002567f2 _ZN2v88internalL26Builtin_Impl_HandleApiCallENS0_16BuiltinArgumentsEPNS0_7IsolateE + 258
8   node                                0x0000000100a6fbb9 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit + 57
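
A segfault in a native addon kills the whole Node.js process, so try/catch in the test cannot observe it. One way to keep the test runner alive while investigating is to run the suspect script in a child process and inspect how it ended. Below is a minimal sketch using only Node's built-in child_process module. runIsolated is a hypothetical helper, and the inline script only simulates the crash by sending SIGSEGV to itself (POSIX only) instead of loading tree-sitter.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper (assumption, not part of tree-sitter or jest):
// run a snippet of JavaScript in a separate Node.js process and report
// whether it exited normally or was killed by a signal.
function runIsolated(source) {
  const result = spawnSync(process.execPath, ['-e', source], { encoding: 'utf8' });
  return {
    crashed: result.signal !== null,
    signal: result.signal, // name of the terminating signal, or null
    status: result.status, // exit code, or null if killed by a signal
  };
}

// Stand-in for the crashing query: the child sends SIGSEGV to itself.
// The timer only keeps the child alive until the signal is delivered.
const crash = runIsolated(
  "process.kill(process.pid, 'SIGSEGV'); setTimeout(() => {}, 5000);"
);
console.log(crash);

// A child that exits cleanly, for comparison.
const ok = runIsolated('process.exit(0)');
console.log(ok);
```

In a real investigation the inline source would be replaced by the script that constructs the Query, so that a crash shows up as a reported signal in the parent instead of terminating it.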

@cellog

cellog commented Jan 29, 2021

Also note that I arrived here because of jestjs/jest#8769 (comment)

I have a more pernicious seg fault that is ONLY triggered in a Docker environment, which has more limited memory. It's possible that this bug is separate from the one above, and is triggered by an out-of-memory error in Query.

@cellog

cellog commented Jan 29, 2021

Reproduction case:

  1. Use the Ruby language.
  2. Try to initialize this query:
    (assignment
      left: (_) @var
      right: (method_call
        method: (
          call receiver: (constant) @class
          (#eq? @class "I18n")
          method: (identifier) @method
          (#eq? @method "namespace")
        )
        arguments: (
          argument_list (
            (string) @namespace
          )
        )
      )
    )

as in

new Query(Ruby, `    (assignment
      left: (_) @var
      right: (method_call
        method: (
          call receiver: (constant) @class
          (#eq? @class "I18n")
          method: (identifier) @method
          (#eq? @method "namespace")
        )
        arguments: (
          argument_list (
            (string) @namespace
          )
        )
      )
    )
`)

but ONLY in a Docker context, and only inside a jest test. Note that the seg fault occurs in lib_pthread, so this is a threading issue. I am almost certain the NAPI PR will fix this.

const Parser = require("tree-sitter");
const Ruby = require("tree-sitter-ruby");
const { Query } = Parser;

describe("createQuery", () => {
  it("?", () => {
    new Query(Ruby, `
      (assignment
        left: (_) @var
        right: (method_call
          method: (
            call receiver: (constant) @class
            (#eq? @class "I18n")
            method: (identifier) @method
            (#eq? @method "namespace")
          )
          arguments: (
            argument_list (
              (string) @namespace
            )
          )
        )
      )
    `);
  });
});

@cellog

cellog commented Jan 29, 2021

Further context: the lines

            call receiver: (constant) @class
            (#eq? @class "I18n")
            method: (identifier) @method
            (#eq? @method "namespace")

are responsible. If I remove either

            method: (identifier) @method
            (#eq? @method "namespace")

or

            call receiver: (constant) @class
            (#eq? @class "I18n")

OR remove both predicates (the (#eq? ...) lines),

then the seg fault disappears.

@talbergs
Author

talbergs commented Feb 6, 2021

Thank you @cellog!

I am almost certain the NAPI PR will fix this

What is the NAPI PR?

@cellog

cellog commented Feb 7, 2021

#52

@talbergs
Author

talbergs commented Mar 11, 2021

I wanted to confirm that #81 actually solves this issue for me so that this thread can be closed, but when building the master branch I get this error:

make: Entering directory '/home/ada/any-style-new/any-style/node_modules/tree-sitter/build'
make: *** No rule to make target 'Release/obj.target/tree_sitter/vendor/tree-sitter/lib/src/lib.o', needed by 'Release/obj.target/tree_sitter.a'.  Stop.
make: Leaving directory '/home/ada/any-style-new/any-style/node_modules/tree-sitter/build'
gyp ERR! build error 
gyp ERR! stack Error: `make` failed with exit code: 2
gyp ERR! stack     at ChildProcess.onExit (/home/ada/any-style-new/any-style/node_modules/tree-sitter/node_modules/node-gyp/lib/build.js:194:23)
gyp ERR! stack     at ChildProcess.emit (node:events:378:20)
gyp ERR! stack     at Process.ChildProcess._handle.onexit (node:internal/child_process:290:12)
gyp ERR! System Linux 5.10.13-arch1-1
gyp ERR! command "/usr/bin/node" "/home/ada/any-style-new/any-style/node_modules/tree-sitter/node_modules/.bin/node-gyp" "rebuild"
gyp ERR! cwd /home/ada/any-style-new/any-style/node_modules/tree-sitter
gyp ERR! node -v v15.11.0
gyp ERR! node-gyp -v v6.1.0
gyp ERR! not ok 

Once again, I am not sure whether this is my fault.

When will #81 be released?

@MichaelBelousov

@talbergs

FWIW and in case anyone else runs into this, make sure to clone with submodules:

git clone --recurse-submodules
