-
-
Notifications
You must be signed in to change notification settings - Fork 143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tree-sitter rolling fixes, 1.119 edition #1028
Tree-sitter rolling fixes, 1.119 edition #1028
Conversation
…so they end before the closing delimiter.
…meaning the `<?php` and `?>` tokens. Instead, ignore the redundant tokens and try to proceed. Fixes pulsar-edit#1032.
…because I made it worse the first time.
…and indent properly after `else`.
This one's ready for review! Only the PHP commit involves any JavaScript changes; the rest of the changes are in query files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To the extent I know what's going on (see individual comments), looks good to me! Small is good! 🎆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding 9a1c19a: Glad to see a fix! (Now that I've read the corresponding issue/comments.)
Seems legit!
Just to add something more substantive as commentary: Does often seem that throwing less often is the answer, in JS. Seems also that a suitable alternative was identified and taken. Nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding d7faa84: Not sure I understand the distinction nesting these differently in this file and adding the trailing @fold
marker, but the comments/commit message together make it sound like a good idea... (That's as much as I can say!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imagine this code in JS:
if (foo) {
bar();
}
Here's the corresponding Tree-sitter tree:
We want to fold everything between the {
and }
. Most Tree-sitter parsers are structured so that the thing you want to fold is a node equivalent to the statement_block
node here: it groups a bunch of things, but the first and last children are nearly always the anonymous nodes that act as delimiters. You'll find the same thing with arrays, literal object notation, and so on.
This is so common in Tree-sitter that we treat it as the default for folding purposes. If you do
(statement_block) @fold
in a folds.scm
, we interpret it to mean “fold from the end of the line on which statement_block
starts… up to the position at the start of the last child.” Because the last child is going to be a delimiter 99% of the time.
Python is a special case because it eschews so many delimiters in favor of indentation. So its folds.scm
makes a broad assumption that, when we fold blocks, we'll always want to fold up to the very end. Hence (#set! fold.endAt endPosition)
— tweaking the end position of the fold for everything listed in the first group.
But the assumption was a bit too broad. Tuples, lists, sets, and dictionaries all have delimiters, so they shouldn't have been included in the first group. The default behavior works just fine for these node types, so we just mark them with @fold
without customizing their end position.
(I try to explain these things even if they don't stick — just in case it lowers the bus factor even a little bit, and because PR comments are part of the public record.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding 0070e88:
I hadn't heard of this auto
keyword in C, so I got curious and looked it up. I have found this astoundingly funny StackOverflow answer to explain what the auto
keyword is about, at least in C. To any reading these review comments: Enjoy.
https://stackoverflow.com/a/2192761
(Alright, and some arguably more helpful links about how it can be somewhat actually useful in C++, apparently. https://stackoverflow.com/questions/7576953/what-is-the-meaning-of-the-auto-keyword ... https://learn.microsoft.com/en-us/cpp/cpp/auto-cpp?view=msvc-170)
@@ -3,6 +3,7 @@ | |||
"then" | |||
"do" | |||
"(" | |||
"else" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems legit and popular to indent after else
in bash
/sh
, depending on where in particular you put your newlines. Lotta flexibility in bash/sh
. I can't think of many times when I wouldn't want to newline+indent after an else
even if it might be valid to with the right (or debatably/subjectively: "wrong" per style preferences) semicolon usage.
Just commentary, I guess. I think we probably indent for stuff like this when it's mostly expected even if not strictly required by the language to do so. So this is probabl right as-is? Rambling a bit.
(Welcome to sh
, where the newlines sometimes and do and sometimes don't matter, and everything seems to be made up, but is no doubt documented in some obscure manual somewhere. /standup)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell, failing to include else
originally was just an oversight of mine.
@@ -169,6 +169,7 @@ | |||
] @keyword.operator.redirect.shell) | |||
|
|||
(test_operator) @keyword.operator.test.shell | |||
(unary_expression "!" @keyword.operator.unary.shell) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this have any relation to the entry on line 156?:
(negated_command "!" @keyword.operator.unary.shell)
Are these redundant, or it's something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's something else.
if [[ ! -z "foo" ]]; then
echo "aha"
fi
!$(foo)
First !
is (unary_expression "!")
; second !
is (negated_command "!")
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding b5f6635: I recall someone wanted these, great to see them implemented! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This I'm out of my element looking at. Hopefully this is correct, I don't know how to read it. Given this was a round two after realizing round 1 wasn't working as intended, that hopefully means round 2 is working good.
Rubber-stamping this file on faith, heh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were a couple issues here:
-
It's really common to write something like
(foo "bar") @baz
in Tree-sitter query files when you really meant(foo "bar" @baz)
. The first one means “match anyfoo
that has an anonymous"bar"
node as a child.” The second one means “match the anonymous"bar"
node that is a direct child of afoo
.” That's why I had to come back for round 2. -
Round 1 was about another bad habit I have: referring to anonymous nodes without clarifying context. It's fine to refer to
"case"
casually in JS because there's only one scenario in which you'd find an anonymous"case"
node: aswitch
statement. But"default"
is found in more places! It should've occurred to me that it'd match the"default"
node in stuff likeexport default function foo() {
.So I rewrite these queries to match only the
"case"
and"default"
we want (being super paranoid with"case"
just to be thorough).
The @match
capture itself is more complicated, especially with the#set!
predicates and the config flag. Luckily, I don't have to bore you with the details in this case. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the explanation! I hope that these trails of conversations in these Tree-sitter rolling fixes PRs will lead anyone to what they need to know if they're hoping to pick up this kind of knowledge, and I consider it a long-term goal to be reasonably versed in this well enough to start properly self-teaching it or asking questions of the tree-sitter
dev community.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Providing my rubber stamp as it were. This is not my area of expertise, but with really clear comments about the change in behavior seems valid to me.
Basically resulting in us just accepting some invalid syntax and continuing on as if we hadn't seen it, which seems like a valid strategy to me. As really the syntax highlighting is meant to help the user as much as possible, while not ensuring everything is valid.
Now personally I do appreciate syntax highlighting failing, as it alerts me of a mistake I've made, but in this case, as far as I'm aware and recall, this was requested due to usage in some type of templating environment, so in situations like that, this makes perfect sense to me
These PRs contain fewer and fewer fixes each month. This is a good sign.