Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrap class, module, method, and block bodies in a named node #224

Merged
merged 14 commits into from
Aug 15, 2022

Conversation

npezza93
Copy link
Contributor

@npezza93 npezza93 commented Jul 17, 2022

A majority of other languages I noticed wrap class and function bodies in their own named node that can be easily queried(checked javascript, Lua, java, etc). Without this named node it makes targeting the contents of these structures difficult, cumbersome, and realllyyyy slow.

Checklist:

  • All tests pass in CI.
  • There are sufficient tests for the new fix/feature.
  • Grammar rules have not been renamed unless absolutely necessary.
  • The conflicts section hasn't grown too much.
  • The parser size hasn't grown too much (check the value of STATE_COUNT in src/parser.c).

@aryx aryx requested a review from aibaars July 18, 2022 08:57
Copy link
Contributor

@aibaars aibaars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think wrapping class/module/method/(do-)block/begin bodies as a named node makes sense. However, I'm not too keen on having different names for each one of them. In Ruby, the syntax of a class, method or do-block body is the same.

Therefore, I think having one name to wrap all bodies is better.

I noticed that you added a field body in one of the commits but later removed it again. I think adding body fields would actually be quite nice. Would your use-case be addressed by just adding body fields without wrapping the body statements in a named node?

@npezza93
Copy link
Contributor Author

@aibaars Yea I originally had that but it made targeting just the class/module/method, etc. body hard because it was also capturing all other bodies that were a child and such. I reverted back to having just the body named node like before for everything but block but wrapped them in a named field so specific targeting is still preserved. Let me know what you think!

grammar.js Outdated Show resolved Hide resolved
grammar.js Outdated Show resolved Hide resolved
grammar.js Show resolved Hide resolved
grammar.js Outdated
@@ -157,7 +159,7 @@ module.exports = grammar({
seq(
field('parameters', alias($.parameters, $.method_parameters)),
choice(
seq(optional($._terminator), $._body_statement),
seq(optional($._terminator), optional(field("method_body", $.body)), 'end'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use the field name 'body' instead of including the type of the enclosing node. Same for do_block , class, module, singleton_class, etc.

I can see that having namespace_body makes it easy to capture the bodies of all classes/modules/singleton classes. However, this causes problems for tools that generate data models based on the grammar. For example https://github.com/github/codeql/blob/main/ruby/ql/lib/codeql/ruby/ast/internal/TreeSitter.qll#L1104-L1124 would end up having things likeMethod::getMethodBody() and DoBlock::getDoBlockBody() which look unnecessarily repetitive.

It might be worth trying if defining a rule named _namespace: $ => choice($.module, $.class, $.singleton_class) and adding _namespace to the supertypes list. I don't know much about the queries you are writing, but hopefully the query syntax allows selecting the body field of all nodes of type _namespace.

grammar.js Outdated Show resolved Hide resolved
grammar.js Outdated Show resolved Hide resolved
@aibaars
Copy link
Contributor

aibaars commented Jul 25, 2022

@npezza93 I don't understand 9f6775c . Without that commit this PR looks good to me.

@maxbrunsfeld
Copy link
Contributor

Without that commit this PR looks good to me

I agree, we should revert that revert; the fields should all just be body.

One other change I would suggest is to rename body_statement to something without _statement at the end. To me, the name body_statement makes it sound like this is a type of statement, but I don't think it is. What do you think of just calling it body?

@aibaars
Copy link
Contributor

aibaars commented Jul 25, 2022

One other change I would suggest is to rename body_statement to something without _statement at the end. To me, the name body_statement makes it sound like this is a type of statement, but I don't think it is. What do you think of just calling it body?

@maxbrunsfeld I had suggested the name body_statement, because it matches the name used in Ruby's parse.y (ie body_stmt). In the grammar of the Ruby interpreter a body_stmt is a "body" that may have rescue and ensure clauses. The tree-sitter rule for this type of body already had this name with an _ before this PR.

@maxbrunsfeld
Copy link
Contributor

Ah yeah, I see Ripper calls it that. Good point. Let's keep it.

@olimorris
Copy link

@npezza93 are we ready this awesome pull?

@npezza93
Copy link
Contributor Author

I'll finish this up this weekend

@npezza93
Copy link
Contributor Author

npezza93 commented Aug 15, 2022

@aibaars So changed to use a general body name but I don't think it is working correctly. Using a query of (method) body: (body_statement) @function.inner it seems to be selecting the entirety of inside the class instead of the insides of each method. Maybe my query is wrong but this wasn't the case when they had the different name. Any ideas?

@aibaars
Copy link
Contributor

aibaars commented Aug 15, 2022

@npezza93 I know very little about the query language. @dcreager could you have a look at what's going wrong here?

Copy link
Contributor

@aibaars aibaars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Thanks a lot @npezza93 !

@maxbrunsfeld could you have a look at this too, please?

@maxbrunsfeld
Copy link
Contributor

Maybe my query is wrong but this wasn't the case when they had the different name. Any ideas?

I think the query would need to have the body field inside of the method node:

(method
  body: (body_statement) @function.inner)

@npezza93 Does that work for you?

@npezza93
Copy link
Contributor Author

@maxbrunsfeld Yep, totally works! Thanks!

@maxbrunsfeld
Copy link
Contributor

Ok, so just checking - this PR is done and working as intended, right?

@npezza93
Copy link
Contributor Author

@maxbrunsfeld Yep! This is ready to go

@maxbrunsfeld maxbrunsfeld merged commit 4c600a4 into tree-sitter:master Aug 15, 2022
@maxbrunsfeld
Copy link
Contributor

Great, thanks for this @npezza93.

@olimorris
Copy link

Thanks @npezza93 and everyone involved. This makes writing Ruby queries so much easier. Some of the things we're doing in Neovim with Treesitter are beyond what I ever thought we'd see.

What are the time frames for it to be reflected in the Treesitter Playground?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants