-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Preliminary implementation of a tree-sitter backend #471
Conversation
Rev: Removed usage of |
I think the primary thing that appears to be missing is support for layouting, for which we'd have to generate custom scanners. |
Yes, layouting features would require a custom scanner as well. I think that would be an easier starting point for introducing custom scanners for this backend. I would look into it in my spare time. That being said, the current state of the branch already satisfies my current needs thus it would probably be a lower priority item on my list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution!
There is end-to-end testing for the other backends in the testing
project (unfortunately not run by CI), how would you go about integrating the tree-sitter backend there?
The matrix for these system tests is set up here:
bnfc/testing/src/ParameterizedTests.hs
Lines 348 to 359 in 4272412
parameters :: [TestParameters] | |
parameters = concat | |
[ [] | |
-- OCaml/Menhir | |
, [ ocaml { tpName = "OCaml/Menhir" | |
, tpBnfcOptions = ["--ocaml", "--menhir"] } | |
] | |
-- OCaml | |
, [ ocaml ] | |
-- Functor (Haskell & Agda) | |
, [ haskellAgdaFunctorParameters] | |
-- C++ (extras) |
I just added the parameters for testing tree-sitter backend to the testing parameter matrix. Currently, the backend is only generating Unfortunately, there are a number of test failures. I will use next week to try to fix these failures. Before that, I will mark this PR as draft. |
I am closing this PR since the main commits of it has already been merged (Ref: f1c5101) A few issues are emerging and related larger changes are made to the tree-sitter backend when I am working on the tests. I will include them in a separate PR. |
I have not recollection of merging the branch but maybe I did it by accident when I checkout out your PR. :$ |
Generates
grammar.js
file that could be used with tree-sitter to generate parsers with many different languages. Most of the functionalities of BNFC should be already implemented and working.There are two remaining issues/features that are not implemented yet:
tree-sitter
does not distinguish between a lexer and a parser, this created issues in their keyword recognition (documented here). They use a keyword extraction mechanism to list token rules that could accept a keyword explicitly.The only way to do this cleanly from the BNFC side is to run all user-defined token patterns against all keywords in the input grammar to determine which patterns should be listed to do keyword extraction. Unfortunately, there are no regex facilities in BNFC to do this currently and an external regex engine is needed.
This branch currently just puts all user-defined token patterns into the keyword extraction list for safety. The performance impact of such practice is unknown. How should BNFC approach this problem should be discussed further.
tree-sitter
does not support general set differences. However,tree-sitter
supports a mechanism known as external scanners that could be used to implement this. This is not implemented yet.These are documented in the source code using
TODO
comments as well.Should help #193 and #394