Make the module context aware #184
Force-pushed from 290953b to 44bd12a.
The napi PRs are outdated and didn't even build for me, so I did this on top of master. Either this or the napi PRs should be merged first; then I can either redo this on top of napi or redo the napi work on top of this. The napi PRs also still use NAN/v8 APIs, which means they won't work as true napi modules until those are removed.
You could make a new branch that starts from
Well, yeah. But I'm not sure which napi branch or PR to start from, or whether I should start from scratch, which is quite a bit of work, since the current napi branches/PRs are not up to date. There's also the question of what to do about the remaining issue in those branches, namely the v8 API usage.
#129 is the most recent work and I think it doesn't use NAN. If we merge this PR, will the old compiled language parsers still work, or is it completely backwards incompatible until we make that template change and recompile?
The napi PR tries to maintain backwards compatibility by still using the v8 API, via the util file, to grab the language. That keeps it backwards compatible, but it also means it is not a true napi module: as long as that code is there, it won't really reap the benefit of ABI compatibility, since a module must use only napi to truly be a napi module. I'm not sure napi is capable of grabbing an internal pointer from a JS object made by v8; I think you must use a napi external, which means changing the language modules. To try to keep them supporting both, the PR tries to add a napi external as an

In contrast, this PR is standalone and backwards compatible. It can be merged regardless of napi, or it can be redone on top of napi now or later. But to truly use it in a thread-safe way, you do still need to apply the other change to the language modules, or they will fail to import in threads. So what strategy do you think we should take?
I checked that this is indeed backwards compatible (as in, it still works with the old compiled languages) and doesn't make the bindings slower, mostly following the instructions from the wiki article, like this:

```sh
mkdir /tmp/tree-sitter-test
cd /tmp/tree-sitter-test
npm init -y
# add `"type": "module",` to package.json
npm install tree-sitter tree-sitter-cli tree-sitter-python
npx tree-sitter build-wasm node_modules/tree-sitter-python
git clone https://github.com/python/cpython.git
# cd cpython && git checkout f508800 && cd ..
find cpython/Lib/ -type f -name "*.py" -exec cat {} + > sample_python_code.py
```

Then I created main.js:

```js
import Parser from "tree-sitter";
import Python from "tree-sitter-python";
import fs from "fs";

const parser = new Parser();
parser.setLanguage(Python);
const file = fs.readFileSync("./sample_python_code.py", "utf8");

// Time only the parse call, not module loading or file I/O.
const start = process.hrtime.bigint();
parser.parse(file);
const end = process.hrtime.bigint();
console.log(`node-tree-sitter parse time: ${(end - start) / 1000000n}ms`);
```

I ran it, then switched to this PR like this:

```sh
cd /tmp
git clone https://github.com/segevfiner/node-tree-sitter.git
cd node-tree-sitter
git checkout 779e0d68a8590f8cf00dc413f715234cd5f5b7a8
npm install
npm run build
```

Then, back in the original tree-sitter-test/ directory, I made this edit to package.json:

```diff
14c14
< "tree-sitter": "^0.20.6",
---
> "tree-sitter": "file://tmp/node-tree-sitter",
```

then

```sh
rm -rf node_modules
npm install
```

and ran main.js again and got the same performance (4 seconds).

I also tried doing

```diff
14c14
< "tree-sitter": "^0.20.6",
---
> "tree-sitter": "git://github.com/segevfiner/node-tree-sitter.git#779e0d68a8590f8cf00dc413f715234cd5f5b7a8",
```

but I get this build error when running
The only difference I see is

So anyway, since we don't have to recompile all the languages (we only have to do that if we want to take advantage of these changes), I think we're good to merge this, unless you have any objections, @maxbrunsfeld?
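A single timed run like the one in main.js can be noisy, so a before/after comparison may be easier to trust with a few repetitions. The following is only a sketch: `medianTimeMs` is a hypothetical helper, not part of node-tree-sitter or this PR, and the loop is a stand-in workload so the snippet runs on its own. It uses plain script syntax with no imports; in main.js the callback would be `() => parser.parse(file)`.

```javascript
// Hypothetical helper (not part of the PR or of node-tree-sitter):
// run `fn` several times and return the median wall-clock time in milliseconds.
function medianTimeMs(fn, runs = 5) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = process.hrtime.bigint();
    fn();
    const end = process.hrtime.bigint();
    // Convert nanoseconds (BigInt) to milliseconds (Number).
    samples.push(Number(end - start) / 1e6);
  }
  samples.sort((a, b) => a - b);
  const mid = Math.floor(samples.length / 2);
  return samples.length % 2 === 1
    ? samples[mid]
    : (samples[mid - 1] + samples[mid]) / 2;
}

// Stand-in workload so the sketch is runnable without tree-sitter installed.
const demoMs = medianTimeMs(() => {
  let sum = 0;
  for (let i = 0; i < 1e5; i++) sum += i;
  return sum;
});
console.log(`median: ${demoMs.toFixed(3)}ms`);
```

The median is used rather than the mean so that one slow outlier (for example, a cold file cache on the first run) does not dominate the result.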
@verhovsky Might be an issue with Git dependencies in npm not cloning the submodule?
Just to be clear, I'm quite willing to help do this for napi as well, help get napi merged, and add other improvements that I have for the Node.js grammar generation, etc. But first there needs to be some sign of life, in that the project is merging PRs, or the work will just go to waste and go stale, like what happened to the current napi PR... I haven't seen @maxbrunsfeld respond to tree-sitter related stuff for quite a while; I hope he didn't tire of the project entirely...
Hey, I've been MIA for a while as a maintainer. I'm back now, though, and I'd love to collaborate on that. I was planning to after #163, but we can definitely just sidestep that and move straight to using napi. That's quite a breaking change for every grammar, though, so I'd like to coordinate when/how to do that. Are you on Matrix or Discord? @verhovsky I'm sure you'd like to join in as well, feel free to

Currently working out some kinks in core, and I plan to cut a release (0.20.9) very soon.
If you decide to migrate to napi first, let me know once you have an up-to-date napi branch or have merged it, and I can try to redo this change on top of it.
@segevfiner can you rebase onto master and I'll merge it?
Force-pushed from 779e0d6 to 8cddf18.
@verhovsky Rebased
We will also need tree-sitter/tree-sitter#2841 to make the actual languages context aware too. Now, I can try to redo this on top of the napi work, or merge it into that work, but we do need a decision about how to handle the grammars in napi. To make this a true napi module we can't use v8/NAN APIs, which means we will also need to convert the languages to napi at the same time. That is obviously a breaking change, as they will only work with a napi version of the module.
Thanks @segevfiner, nice work.
This makes the binding context aware, while still using the v8/NAN/Node APIs directly, by moving all global data into an object and removing the use of global external buffers, which led to segfaults/access violations when unloading the module.
The language bindings will also need to be made context aware. That is done in tree-sitter/tree-sitter#2841 and will need to be applied to each language binding.
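As a rough illustration of what context awareness enables, the sketch below loads the binding in two worker threads at once, each of which is a separate context. It is an assumption-laden example, not code from this PR: `parseInWorker` and `workerSource` are hypothetical names, the package names are the ones used in the benchmark earlier in this thread, and the load only succeeds if both the binding and the language module are installed as context-aware builds. Any load failure is reported back as a message rather than thrown. It uses `require` so it can be run as a standalone script.

```javascript
const { Worker } = require("node:worker_threads");

// Code executed inside each worker (hypothetical sketch). Whether the two
// require() calls succeed depends on the installed builds being context aware.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  try {
    const Parser = require("tree-sitter");
    const Python = require("tree-sitter-python");
    const parser = new Parser();
    parser.setLanguage(Python);
    const tree = parser.parse(workerData.code);
    parentPort.postMessage({ ok: true, rootType: tree.rootNode.type });
  } catch (err) {
    // Report load or parse failures instead of crashing the worker.
    parentPort.postMessage({ ok: false, error: String(err) });
  }
`;

// Hypothetical helper: parse `code` in a fresh worker thread and resolve with
// whatever the worker reports.
function parseInWorker(code) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { code } });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}

// Two workers load the binding concurrently, each in its own context.
Promise.all([parseInWorker("x = 1\n"), parseInWorker("y = 2\n")]).then(
  (results) => console.log(results)
);
```

With a build that is not context aware, the second load would be expected to surface as `ok: false` with an error message (or, in the worst case, as a crash on unload), which is the failure mode this PR addresses on the binding side.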
Dropped support for some ancient EOL Node.js and Electron versions so we don't have to ifdef for those.
Contributed on behalf of Swimm